It’s Time to Rethink Levels of Automation for Self-Driving Vehicles
In 2017, Waymo CEO John Krafcik told tech conferences that “Fully self-driving cars are here.” The next year, however, at a Wall Street Journal panel with the title “Are we there yet?” Krafcik said that so-called “Level Five” self-driving cars were impossible:
“I’m not sure that we’re ever… going to achieve an L5 level of automation… I think it’s sort of silly that we think about it. And it’s important I think for all of us to be really clear on the language around self-driving because it does end up confusing people… autonomy I think is always going to have some constraint on it” [1].
Survey evidence suggests that consumers are indeed confused about whether they can currently buy a vehicle that is “self-driving” [2]. This muddle is not because the public are ignorant. It is because one of the major ways in which the development of self-driving cars has been discussed — the levels of automation drawn up by the Society of Automotive Engineers (SAE) — is misleading. A typology originally developed to provide some engineering clarity now benefits technology developers far more than it serves the public interest. We are social researchers who have over the last three years worked with and interviewed the developers of this technology. We argue that the levels of automation need a rethink. The SAE levels, by emphasizing autonomy and implying that progress means more autonomy, do little to inform public decision-making about the conditions in which these technologies might have meaningful benefits.
Self-driving cars could be a transformative technology in both good and bad ways. The important questions are not to do with when they will arrive but where, for whom, and in what forms they will appear. If we want a clearer sense of the possibilities from automated vehicle systems, we need to broaden our gaze [3]. Rather than emphasizing the autonomy of self-driving vehicles, we should instead be talking about their conditionality. We need to know about the circumstances in which different systems could have an impact on our lives. Self-driving vehicle systems will serve different purposes and take on different shapes in different places. A schema for innovation that points in one direction and says nothing about the desirability of the destination makes for a poor roadmap.
Where Did the SAE Levels Come From?
The taxonomy offered by SAE has provided a much-needed language by which to describe and compare the automation of driving. The six-rung ladder (Figure 1), ranging from Level Zero (no automation) to Level Five (full, unconditional automation) has enabled engineers to think about the technical differences between systems. However, as this terminology has entered public and policy discourse, it has served to reinforce some myths of autonomy: that automation increases linearly, directly displaces human work, and will continue until automation is total and humans are completely eliminated from the system [4]. The levels of automation have been treated as waymarks along that seemingly self-evident trajectory. It has become commonplace to describe a self-driving vehicle system as Level something, without further specificity. The Waymo Chrysler Pacificas on public roads in Arizona and the low-speed Westfield automated shuttles operating with dedicated infrastructure away from public interference at London’s Heathrow Airport are both described as Level Four, but they have little in common. We need new ways to characterize such systems. But first, a bit of history: In his account of the battle for automobile safety in the Twentieth Century, historian Lee Vinsel describes the SAE levels as an attempt at standardization [5]. The J3106 “Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles” was first published by the Society of Automotive Engineers in 2014. By the time SAE published its standard, there were already two competing frameworks for increasingly automated vehicles. In 2013, the National Highway Traffic Safety Administration had drawn up five levels for self-driving vehicles. NHTSA had called Level Zero “no automation” and Level Four “full automation,” and focused primarily on the role of the human operator in carrying out “safety-critical control functions” [6]. An expert group of the German Federal Highway Research Institute (BASt) had in 2013 described its own levels for automated driving (as opposed to automated vehicles), from “driver only” to “full automation,” with the latter involving an automated detection of “system limits” and return to a minimal risk condition. In other words, there was no limitless “full autonomy,” in which the system could operate competently in any environment such as a licensed human driver can [7]. The members of the SAE’s task force, after reviewing the BASt document chose to largely adopt the BASt formulation with a few key changes [8]. The SAE made two interventions. First, they added a sixth level, “Level Five,” above BASt’s “Level Four.” SAE called their Level Four “High Automation,” and added” Full Automation” above it to disambiguate the conditions under which such a vehicle could operate. This also had the impact of dividing NHTSA’s Level Four, which described a vehicle capable of carrying out all safety-critical tasks without oversight, in two. In an SAE Level Four system, the entire driving task is carried out autonomously, but with some limitations on the environment in which the vehicle is expected to operate. These limitations are not specified by the taxonomy, but include such things as geofencing, controlled infrastructure, or even weather-dependent operation. A Level Five system by contrast is expected to perform under “all roadway and environmental conditions that can be managed by a human driver” [9]. Second, the SAE added more definitions and explanatory content. They were careful to explain that it is the task of driving, not the vehicle itself, that is being automated. They applied a precise definition of the driving task, which involved longitudinal and lateral control functions, as well as monitoring of the environment and fallback performance, although they equivocated on the “minimal risk condition” to which systems would be expected to retreat in the event of failure. The SAE approach gave the levels a technologically-centered, and less ambiguous, set of descriptions than NHTSA had provided. The SAE levels have not been a static document, though the overall structure of the levels has not changed substantially. Through two revisions, in 2016 and 2018, the document has tripled in length, integrated more descriptive examples, especially around ambiguities and edge cases in the levels, and switched to increasingly specific, technical language to describe all aspects of the taxonomy. “System capability,” which described which “driving modes” could be handled by a system at a given level, has been replaced with “Operational Design Domain” (ODD), but otherwise responds to the same question: under what conditions can the system operate?As the terminology of automation has entered public and policy discourse, it has reinforced some myths of autonomy.
Looking further back, frameworks for levels of automation in transport did not begin with NHTSA, SAE, or BASt. Some of the first examples of the modern, numbered levels format come from the late 1990s, when Endsley and Kaber developed a ten-level model for automation roles that was intended “to be applicable to a wide array of dynamic process and automated system control domains, specifically advanced manufacturing, teleoperations, air traffic control, and aircraft piloting” [10]. The authors drew on earlier attempts, such as Thomas Sheridan’s work defining roles of humans and machines descriptively, rather than numerically. Sheridan proposed that machines extend, relieve, back-up, or replace; and human supervisors trust, command, plan, monitor, and intervene [11]. The original role of Endley’s numerical taxonomy was to provide an ordering between more human control on one side, and more computer control on the other. It was not intended to encourage others to aim only for high automation, but to think about the options for appropriate partitioning of tasks to achieve a more reliable and functional joint-human-machine system. In a recent commentary, Sheridan has clarified “the difficulty, even impossibility, of making level-of-automation taxonomies into readily useful tools for system design” [12].Implying that progress means more autonomy does little to inform public decision-making about how automation provides meaningful benefits.
What Are the Current Problems with SAE Levels?
As the historian Lee Vinsel argues, standards are not just ways of classifying things. They are also attempts to shape the technological future [13]. Thinking about how the structure of our standards contributes to their use is therefore crucial for making better policy decisions. The SAE’s standard levels formulation has a number of major weaknesses:Thinking about how the structure of standards contributes to their use is therefore crucial for making better policy decisions.
- The levels’ structure supports myths of autonomy: that automation increases linearly, directly displaces human work, and that more automation is better.
- The levels do not adequately address possibilities for human-machine cooperation.
- The levels specifically avoid discussion of environment, infrastructure, and contexts of use, which are critical for the social impacts of automation.
- The levels thus also invite misuse, wherein whole systems are labeled with a level that only applies to part of their operation, or potential future operation.
Furthermore, the SAE levels are sometimes deployed to refer to types of software, types of journey, responsibilities of drivers, parts of journeys or particular test conditions as well as the driving task. The dominant use, however, is to define a type of vehicle system, rather than a particular state or operational mode in a particular context. In discussions with people working in the industry, vehicles are often described as “Level Four” when they are capable of operating this way under certain sets of conditions, but might also operate at other levels of automation under different conditions. Currently, so-called “Level Four” vehicles are often tested with a safety driver who maintains actual responsibility for the system. Vehicles advertised as “Level Four” are often not “eyes off,” as the SAE category would suggest. So when companies talk about achieving “full autonomy,” the SAE levels offer little help in holding them to account for what that promise actually means. To address this common misuse the 2016 version of J3016 added that the levels are mutually-exclusive by “feature.” Each automated feature has only one designation; but an entire “system” may have features that operate at different levels. This theoretically clean distinction between vehicles, systems, and features is however often blurred in practice. And it does not address the miscategorization of many experimental vehicles: these vehicles may be aimed at Level Four driving in the future, but they can be no higher than Level Three in practice, by SAE definition, if a backup driver is necessary to maintain safe operation.When companies promise “full autonomy,” the SAE levels offer little help in holding them to account for what the promise means.
Moving from Levels of Automation to Conditions for Operation
First, new typologies should start with the recognition that, if they come from authoritative sources, they are devices for communication as well as analysis. There is therefore a responsibility to publicly clarify the limits as well as the possibilities of particular systems. This demands a greater focus on the operational design domain and varied options for human-machine collaboration and interaction, and a downplaying of autonomy and the direct replacement of human beings with machines. To the extent that the driving task is a focus of a new framework, it must be open to new arrangements of shared human-machine control and collaboration. And developers should avoid using a numbered levels structure that implicitly orders the categories in terms of difficulty and value. But if AVs are going to change the world, we also need to know more about technologies’ relationships with their contexts. Policymakers and the public need clearer information about the conditions in which particular automated devices can operate, and the additional changes that might be required in order for such systems to be safe, equitable, and effective. This means less focus on the “driving task” and more attention to place, infrastructure, and road rules. On infrastructure, we might look to the Infrastructure Support levels for Automated Driving (ISAD) recently proposed by the INFRAMIX project [24], which seeks to categorize parts of roads according to their connectivity. To this, we could add consideration of physical as well as digital infrastructures. Material arrangements of roads and road furniture are harder to standardize than digital systems, and vary more from place to place. In addition to varying road types, places are defined by various cultures and patterns of road use, which might make some AV systems inadequate or wholly inappropriate. Social factors on the roadway, neglected by the levels today, are therefore of similar importance to physical and digital infrastructure. The SAE levels have little to say about the behavior of other road users, but AV developers are now starting to admit that the success of their systems may depend upon more predictable patterns of behavior from cyclists, pedestrians, and others [25]. New typologies for AVs should therefore be explicit about what else is required for the systems to function as designed. The real benefits of “autonomous” vehicles will come when they are embedded in and able to work with whole systems, including other road users and physical and digital infrastructures. We need ways to evaluate such systems, and ask old but important technology assessment questions: Who pays? Who benefits? Who decides? Finally, policymakers need clearer ways to talk about technologies being tested and technologies being deployed. The widespread use of safety drivers as a fallback for prototype self-driving cars may be necessary for their safe development, but it means that testing for Level Four is happening, in effect, at Level Three. This means that these vehicles potentially come with all of the hazards of mixed-mode operation including mode confusion, automation complacency and problems of handovers. Notional future readiness for Level Four operation should not be allowed to confuse or distract from the real problems and risks that arise in testing and development, especially since the levels do not represent linear increases in capability. New typologies should aim for clarity about the conditions for testing as well as the conditions of use, and ensure that developers are being realistic about what will be necessary for their systems to be safe and effective in reaching stated goals. The SAE levels have served their purpose, but they now look inadequate to the task of informing future discussions. The SAE levels are directing innovation towards greater autonomy, which could miss some larger opportunities. The focus of automation discussions needs to turn outward, away from the narrow technical capabilities of a system measured against a known human task, and toward the environments and conditions that can make safer, fairer, more accessible mobility achievable. Retrieved from here.An innovation schema pointing one way, but saying nothing about the desirability of the destination, makes a poor roadmap.