How do you build a self-driving car? OK, wait, that’s too loose a question. How do you build a safe self-driving car?
A part of the answer involves what some see as the miracle answer to every problem we have now: AI.
So… how do you make AI models safe? Earlier this year, we noted how you really can’t validate a machine-learned model. So if a self-driving car uses such models, then how can you show that the car is safe?
This is one of the elements of a discussion I had with AImotive, a company that develops software modules for use in autonomous vehicles. (And, just to avoid confusion, the second letter in their name is an upper-case I, not a lower-case L.) Their focus is safety for cars with automation levels L4 and L5 – the levels that require no driver (with L4 meaning automated under normal conditions and L5 meaning automated under all conditions).
A huge part of making a car drive itself involves replacing our eyeballs so that it can make good decisions. It’s something that’s easy for humans (well… perhaps not always the good-decisions part…), but quite the challenge for a machine.
I attended the Autonomous Vehicle Sensors Conference recently, and it could almost be called the Automotive Lidar Conference. There is clearly intense focus on lidar as a primary tool of perception, along with radar. According to AImotive, lidar and radar are the priority for most industry development efforts, with vision – that is, actual cameras – serving to help with classification once the other two have identified something of interest.
AImotive takes the opposite approach: they also fuse lidar, radar, and vision, but they rely on vision first, turning to the other two for confirmation to make sure that their eyes aren’t playing tricks on them. That said, they don’t rely on AI as the only tool in the toolbox (although I won’t suggest that they are alone in this approach). They use both neural nets and classical algorithms as they see most appropriate.
You would think that classical algorithms might be easier to validate than neural-net models, but that may not be the case. Before we dig into why, let’s put some context around this. Validation of algorithms by AImotive takes both virtual and real-world forms, as illustrated by their development flow.
(Figure: AImotive’s development flow; image courtesy AImotive)
The virtual validation takes place first using their internal simulation platform, which they call aiSim. The next phases take place on a closed track, on public roads in Hungary (where they are based), and then on public roads elsewhere.
The aiSim platform – which they don’t offer for sale, since they feel that it would help their competition – provides an opportunity to exercise a control system through a huge number of real-world scenarios, setting up rapid evolution of algorithms. They liken it to a video game, where they can drive a virtual vehicle through a realistic virtual world and see whether the vehicle occupants arrive intact. This simulator is apparently a beast of a program.
The ultimate system architecture for their modules consists of four engines. They’re used for recognition, location, motion, and, ultimately, control. They call this their aiDrive platform. They say that these engines are both modular and customizable, allowing customers to start with them and then chart their own course.
But How Do You Know?
While aiSim is their first-line tool for early validation, it can’t be exhaustive on its own given the safety-critical application. So let’s start by reprising the theme that I’ve sounded before and touched on above: if you’re going for mission-critical – nay, life-critical – operations like controlling one car amongst a sea of cars in the limited space available and in compliance with myriad rules of the road, how do you prove that your AI models are what you say they are?
The assumption here, of course, is that classical algorithms can be proven; it’s just these newfangled machine-learned whippersnappers that are the problem. I have been basing that notion on the fact that, even if you simulate until you’re blue in the face and still aren’t satisfied, you can always turn to formal analysis to prove safety definitively.
Why? Because formal lets you posit an outcome that you don’t want and then have the analysis do the drudge work of exploring all possible scenarios where that outcome might be true. If none are found, then you have positively, definitively proven that the scenario in question can’t happen.
If you think of all the possible bad outcomes and tick the “came up empty” box for each of them, then you know that your system is safe. Yes, it’s possible to miss an outcome or two, but it’s far easier to think of “all” the outcomes than it is to think of all the possible input scenarios that might cause a problem. Indeed, the whole point of formal is that it can highlight problematic input combinations that you would never think of, allowing you to protect against them.
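To make the “posit a bad outcome and search for it” idea concrete, here is a minimal sketch of an explicit-state reachability check on a toy model. Everything in it is an assumption made up for illustration: the one-dimensional world, the two controller rules (`cautious` and `naive`), and the constants. It is not how any production formal tool works, and it isn’t anything AImotive described. The unwanted outcome is “the gap to the obstacle closes,” and the search visits every reachable state looking for it.

```python
from collections import deque

MAX_SPEED = 3   # cells per tick (toy units, assumed)
MAX_GAP = 12    # gap to the obstacle is capped so the state space stays finite

def cautious(gap, speed):
    """Speed up only if we could still brake to a stop short of the obstacle."""
    cand = min(speed + 1, MAX_SPEED)
    if cand * (cand + 1) // 2 < gap:   # travel cand, then cand-1, ..., 0 cells
        return cand
    return max(speed - 1, 0)

def naive(gap, speed):
    """Speed up whenever the obstacle is out of reach this tick at current speed."""
    return min(speed + 1, MAX_SPEED) if gap > speed else max(speed - 1, 0)

def successors(state, controller):
    gap, speed = state
    new_speed = controller(gap, speed)
    for lead_move in (0, 1):  # the obstacle either stays put or creeps forward
        yield (min(gap - new_speed + lead_move, MAX_GAP), new_speed)

def find_counterexample(controller, initial_states):
    """Breadth-first search of every reachable state.

    Returns a path of (gap, speed) states ending in a collision, or None
    if no reachable state has the gap closed.
    """
    parent = {s: None for s in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        if state[0] <= 0:  # the unwanted outcome: the gap has closed
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state, controller):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

# Start parked, with the obstacle anywhere from 1 to MAX_GAP cells ahead.
starts = [(gap, 0) for gap in range(1, MAX_GAP + 1)]
print("cautious:", find_counterexample(cautious, starts))
print("naive:   ", find_counterexample(naive, starts))
```

For the cautious rule the search comes up empty (it returns `None`), which is the “came up empty” box being ticked for this one outcome. For the naive rule it hands back a concrete sequence of states that ends in a collision, which is the kind of input combination you might not have thought to simulate. The catch is that this only works because the toy state space is tiny and the bad outcome is easy to state.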
So, with this thinking, all parts of an autonomous vehicle design should be provable – except for the pesky AI algorithms. But AImotive Chief Scientist Gergely Debreczeni made one of those observations that’s obvious once made, even if not so before.
Here’s the thing: we’re talking about vision processing. It could be still images or it could be video. There’s no practical way, even with formal analysis, to prove that all possible images will be correctly processed or, worse yet, that all sequences of images (i.e., video) will be correctly dealt with.
I suppose if you wanted to try to get down to the coding of each pixel, where you have a bounded set of possible values, and then go through all possible combinations and permutations of all values in all pixels in a still image, you might theoretically be able to cover the space – given a ridiculously enormous amount of compute power and time. That’s such a huge task as to be more of a gedanken experiment than anything practical. Stringing together multiple such images for a video would be an exponentially worse exercise.
But even then, how do you articulate the bad outcomes? Missing an edge? Interpreting a Rubin vase as two people kissing? Not seeing the child running into the street chasing a yellow or white or cerulean or vermillion ball? There is no bounded set of bad outcomes that you could identify even if you got the keys to an exascale computer for a (long?) while. (To be fair, I haven’t attempted to calculate what the true compute load would be…)
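I still won’t pretend to know the true compute load, but the size of the input space alone is easy to estimate. The sketch below counts how many distinct frames exist for an assumed camera format; the 1920×1080 resolution, three color channels, and 8 bits per channel are illustrative assumptions on my part, not figures from AImotive. Since the count itself is far too large to print comfortably, the function reports how many decimal digits it has.

```python
import math

def image_space_size(width, height, channels=3, bits_per_channel=8, frames=1):
    """Return (total bits, decimal digits in the number of distinct inputs).

    The number of distinct inputs is 2**bits, so its digit count is
    floor(bits * log10(2)) + 1.
    """
    bits = width * height * channels * bits_per_channel * frames
    digits = math.floor(bits * math.log10(2)) + 1
    return bits, digits

# Assumed format: one 1080p RGB frame with 8 bits per channel.
bits, digits = image_space_size(1920, 1080)
print(f"{bits:,} bits per frame; the count of distinct frames has {digits:,} digits")

# One second of video at an assumed 30 frames per second.
bits_video, digits_video = image_space_size(1920, 1080, frames=30)
print(f"{bits_video:,} bits per second of video; the count has {digits_video:,} digits")
```

For a single frame, that’s a number with roughly 15 million decimal digits, and each added frame multiplies the count by that same number again. An exascale machine manages on the order of 10^18 operations per second, a number with 19 digits, so “more of a gedanken experiment” is, if anything, generous.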
So what do we do now? Not only are we invoking unprovable AI algorithms – not even our classical algorithms are really provable. Of course, the practical answer is that you must satisfy whoever the safety guru is who’s overseeing the problem; that’s the bottom line. (If their last name happens to be Boole, that doesn’t make this a Boolean satisfiability question – formal still can’t work.) We’ll discount the possibility that a night on the town would provide that satisfaction… so… how does a well-meaning engineer do it right?
Mr. Debreczeni says that redundancy is the answer – as we’ve seen before. What matters now is not that the algorithms all perform perfectly, but that, if something goes wrong, you can detect it and place the car into a safe state (like making the windshield turn blue). It’s far easier to prove that element of the system than the algorithms, and it will likely make your evaluator much happier.
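As a rough illustration of why the detect-and-fall-back piece is the easier thing to reason about, here is a minimal sketch of a cross-check between a primary channel and its confirming channels. All of the names (`Detection`, `Action`, `cross_check`), the 2-meter tolerance, and the “at least one confirmer must agree” rule are assumptions invented for this example; AImotive didn’t describe their actual mechanism at this level.

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    CONTINUE = "continue"
    SAFE_STATE = "safe_state"

@dataclass(frozen=True)
class Detection:
    distance_m: float   # range to the nearest obstacle reported by this channel
    valid: bool = True  # False if the channel reports a fault or a stale frame

def cross_check(primary, confirmers, tolerance_m=2.0, min_agree=1):
    """Decide whether the primary reading is corroborated.

    The primary channel (vision, in a vision-first setup) must be valid, and
    at least `min_agree` valid confirming channels must report a distance
    within `tolerance_m` of it. Anything else requests the safe state.
    """
    if not primary.valid:
        return Action.SAFE_STATE
    agreeing = sum(
        1 for c in confirmers
        if c.valid and abs(c.distance_m - primary.distance_m) <= tolerance_m
    )
    return Action.CONTINUE if agreeing >= min_agree else Action.SAFE_STATE

vision = Detection(30.0)
lidar, radar = Detection(30.5), Detection(29.0)
print(cross_check(vision, [lidar, radar]))                       # channels agree
print(cross_check(vision, [Detection(12.0), Detection(11.5)]))   # vision is the odd one out
```

The point of keeping the monitor this small is that its behavior can be enumerated: a handful of branches, no learned weights, and no images. Whether the channels are right is still unprovable; whether disagreement leads to the safe state is not.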
A Moving Target
In order to declare a new vehicle safe and suitable for sale, innumerable subcontractors several tiers below the automotive OEM must be able to prove their part and then somehow get that proof all the way up to the OEM through the supply chain. That’s presumably a fun project.
But what about updates? They also need to be proven safe. And now you need a process for funneling updates from those innumerable subcons back up to the OEM. As a driver – er – vehicle owner, you’re not likely to get a fuel-mixture algorithm update; you’re more likely to get a full update from the OEM, part of which will update the fuel-mixture algorithm.
Let’s say some subcon six layers down has the fuel-mixture update. How do they push it to the OEM through the five intervening layers? And what’s the urgency? If the OEM gets it, do they issue an immediate update, or do they collect a few subcon updates before doing their own update? Do updates happen on a schedule or just whenever?
So, even if a nifty update infrastructure is in place (which is still a work in progress), there’s the procedural aspect of making it work and making sure that, after each update, the vehicle is at least as safe as (and hopefully safer than) it was before the update.
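One way to pin down what “at least as safe” could mean in practice is a release gate that refuses any update under which a previously passing validation scenario now fails or has gone missing. The sketch below is only that, a sketch: the scenario names and the pass/fail representation are hypothetical, and it says nothing about how the results travel up the supply chain, which is the hard part.

```python
def regressions(baseline, candidate):
    """List scenarios that passed before the update but fail, or are absent, after it.

    Both arguments map a scenario name to True (passed) or False (failed).
    """
    return sorted(
        name for name, passed in baseline.items()
        if passed and not candidate.get(name, False)
    )

def accept_update(baseline, candidate):
    """Accept only if nothing that used to pass has regressed."""
    return not regressions(baseline, candidate)

# Hypothetical validation results before and after a bundled update.
before = {"child_enters_road": True, "low_sun_glare": False, "wet_braking": True}
after = {"child_enters_road": True, "low_sun_glare": True, "wet_braking": False}

print(regressions(before, after))    # which scenarios got worse
print(accept_update(before, after))  # whether the bundle may ship
```

In this made-up case the update fixes the glare scenario but breaks wet braking, so the gate rejects it even though the number of passing scenarios is unchanged. That’s the design choice worth arguing about: a simple pass count would have let it through.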
Managing this update process up and down the supply chain is an open challenge that AImotive identified in our conversation. Yet another thing we never used to have to think about that will need solving before we can declare autonomy.
4 thoughts on “Will I Arrive Alive?”
What do you think of AImotive’s thoughts on algorithmic validation for self-driving cars?
I made the K.I.S.S. comment out of frustration. Engineering is about setting clear goals and specifications that are verifiable, and designing to that. It’s NOT about picking some technology out of the hat, applying it to a pet solution, and calling it good without any serious attempt to set clearly defined metrics for meeting expected goals. Your title was “Will I arrive alive?” The claim is that the technology is safer than humans, so let’s actually set metrics to prove that. Verify sensor resolution at 200 ft and 300 ft as compared to human perception.
Set goals to definitively detect a small-stature child (or adult) entering the roadway, in any position, with varied choices of clothing/color and varied road/background textures/colors, at the control and braking distance of the system at 45 mph in poor weather and on wet roads. Road test the system in hundreds/thousands of settings (locations, lighting, visual noise, objects, etc.) up against real humans, for both positive and negative outcomes … i.e., real children and objects that might spoof as a child (or lack thereof) to the AI. Include not only basic detection, but also determine path/intent of the object/child.
Repeat for cars, cycles, trucks, rolling balls, trash cans, tumbleweeds, plastic bags, debris, mud/water, tire casings, etc.
When every test concludes the AI does better than humans, have a team construct special cases to fool the AI and humans. Concentrate on normal driving conditions … like directly into the rising/setting sun, blind corners, high-speed side entry, animals entering a winding rural/mountain/forested road, etc.
This generally requires starting with sensor systems that have at least the perception of the human eye, and the ability to actually process those pixels.