feature article

Cars, Coding, and Carelessness

Sloppy Coding Practices Led to a Fatal Crash

You’ve probably heard by now about the lawsuit against Toyota regarding its electronic engine control. The jury found the automaker liable, in effect for writing fatally sloppy code, and, based on what the software forensics experts found, I’d have to agree.

This case is fundamentally different from the “unintended acceleration” fiasco that embroiled a certain German carmaker back in 1986. That scare was entirely bogus, and it was fueled by an ill-considered “60 Minutes” exposé that aired in the days when Americans watched only three TV channels. Sales of the affected cars plummeted, and it took more than two decades for the company to recover. An engineering spokesman for the carmaker told reporters, “I’m not saying that we can’t find the problem with the cars. I’m saying there is no problem with the cars.” He was dead right – there was no problem with the cars – but the remark came across as arrogant, and it just made the situation worse.

In reality, a few drivers had simply been pressing the wrong pedal, which is a surprisingly common mistake. It happens all the time, in all types of cars. Naturally, nobody wants to admit that they just ran over the family cat (or worse, their own child) through momentary stupidity, so they blame the equipment. “I didn’t run over Fluffy. The damn car did it!”

Back then, throttle controls were mechanical. There was a direct mechanical connection (usually a cable) from the gas pedal to the carburetors or fuel-injection system of the car. Unless gremlins got under the hood (no AMC jokes, please), there wasn’t much chance of that system going wrong.

Now cars’ throttles are mostly electronic, not mechanical, and the “drive by wire” system has come under new scrutiny. Unlike with a basic steel cable, a whole lot of things can go wrong between the sensor under the gas pedal and the actuator at the throttle. Any number of microcontrollers get their grubby mitts on that signal, or the connection itself could go bad. It’s just an embedded real-time system, after all, with all the pros and cons that implies.

Reset to today. After a years-long legal battle over an Oklahoma crash in which a car accelerated unexpectedly and the driver’s passenger was killed, the court ruled in favor of the plaintiff. In other words, the car was found to be defective, and its maker, Toyota, was found liable.

There was no smoking gun in this case; no dramatically buggy subroutine that caused the fatal crash. Instead, there’s only supposition: a careful examination of the car’s firmware showed that it could have failed in the way described in the case, not necessarily that it did fail. That was enough to convince the jury and to cost the carmaker at least $3 million.

For embedded programmers, the case was both enlightening and cautionary. For years, experts pored over Toyota’s firmware, and what they found was not comforting. Legal cases often bring out dirty laundry, the things we casually accept every day but would rather leave covered or private. In a liability case, privacy is not an option. Every single bit (literally) of Toyota’s code was scrutinized, along with the team’s programming practices. And the final conclusion was: they got sloppy.

It’s not that Toyota’s code was bad, necessarily. It just wasn’t very good. The software team repeatedly hacked their way around safety standards and ignored their own in-house rules. Yes, there were bugs – there will always be bugs. But is that okay in a safety-critical device? It’s nice for novices to say that there should never be bugs in such an important system; that we should never ship a product like a car or a pacemaker until it’s proven to be 100% bug-free. But, in reality, that means the product will never ship. Is that really what we want? If it’s going to be my car or pacemaker, yes. If it’s the car or pacemaker I’m designing… maybe that’s too high a bar. But there is some minimum level of quality and reliability that we as customers have a right to expect.

Toyota’s developers used MISRA-C and the OSEK operating system, both good choices for a safety-critical real-time system. But then they ignored, sidestepped, or circumvented many of the very safeguards those tools are designed to enforce. For example, MISRA-C has 93 required coding rules and 34 advisory rules; Toyota adopted only 11 of those rules, and still violated five of them. Oh, and they ignored error codes returned by the operating system. You can’t trust a smoke alarm if you remove the battery every time it beeps.
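To make the smoke-alarm point concrete, here is a minimal sketch of what “not ignoring the error code” can look like in C. Everything named here is a hypothetical stand-in: `os_activate_task()`, `os_status_t`, and `enter_safe_state()` are invented for illustration and are neither the OSEK API nor anything from Toyota’s firmware.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status codes, standing in for whatever the RTOS returns. */
typedef enum { OS_OK = 0, OS_ERR_ID } os_status_t;

/* Hypothetical stand-in for an RTOS service: rejects unknown task IDs. */
static os_status_t os_activate_task(int task_id)
{
    if (task_id < 0 || task_id > 7) {
        return OS_ERR_ID;
    }
    return OS_OK;
}

/* File-scope flag so the sketch can show that the fail-safe path ran. */
static bool safe_state_entered = false;

/* Hypothetical fail-safe hook; a real system might close the throttle. */
static void enter_safe_state(os_status_t cause)
{
    safe_state_entered = true;
    fprintf(stderr, "fail-safe entered, OS status %d\n", (int)cause);
}

/* Every returned status is examined; a failure is never silently dropped. */
bool start_throttle_task(int task_id)
{
    os_status_t status = os_activate_task(task_id);
    if (status != OS_OK) {
        enter_safe_state(status);
        return false;
    }
    return true;
}
```

The point is less the specific reaction than the habit: the status is captured, compared, and acted on at the call site, so a failing OS service cannot go unnoticed.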

Stack overflows got close scrutiny, because they’re the cause of many a malfunctioning system. Contrary to the developers’ claims that less than half of the allocated stack space was being used, the code analysis showed it was closer to 94%. That’s not a grievous failure in and of itself, but the developers wrote recursive code in direct violation of MISRA-C rules, and recursion, of course, eats stack space. To make matters worse, the Renesas V850 microcontroller they used has no MMU, and thus no hardware mechanism to trap or contain stack overflows. 
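One common way to measure real stack usage on a part with no MMU is “stack painting”: fill the stack with a known pattern at startup, then later count how much of the pattern has been overwritten. The sketch below illustrates the idea on an ordinary array. The size, the fill pattern, and the function names are assumptions made for illustration, and nothing here is taken from the firmware discussed in the case.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS   256u          /* assumed stack size, in 32-bit words */
#define PAINT_PATTERN 0xA5A5A5A5u   /* assumed fill value */

/* Fill the whole region with a known pattern before the task starts. */
void stack_paint(uint32_t *stack, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack[i] = PAINT_PATTERN;
    }
}

/* Count the words that no longer hold the pattern. Assumes the stack grows
 * downward, so untouched words sit at the low-address end (index 0). */
size_t stack_words_used(const uint32_t *stack, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && stack[untouched] == PAINT_PATTERN) {
        untouched++;
    }
    return words - untouched;
}

/* Peak usage as a whole-number percentage of the allocation. */
unsigned stack_percent_used(const uint32_t *stack, size_t words)
{
    if (words == 0) {
        return 0;
    }
    return (unsigned)((stack_words_used(stack, words) * 100u) / words);
}
```

A measurement like this reports only the deepest point the stack has reached so far, not the worst case. That is one reason recursion is so unwelcome here: its depth depends on run-time data, so no amount of observed headroom proves the stack is big enough.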

OSEK is common in automotive systems, almost a de facto standard. It’s portable, it’s widely available, and it’s designed to work on a variety of processors, including ones without an MMU. But because it’s a safety-critical software component, each OSEK implementation must be certified. How else can you tell a good and compliant OSEK implementation from a bad one? Toyota used a bad one. Or, at least, an uncertified one.

Structured-programming aficionados will cringe to learn that Toyota’s engine-control code had more than 11,000 global variables. Eleven thousand. Code analysis also revealed a rat’s nest of complex, untestable, and unmaintainable functions. On a cyclomatic-complexity scale, a rating of 10 is considered workable code, with 15 being the upper limit for some exceptional cases. Toyota’s code had dozens upon dozens of functions that rated higher than 50. Tellingly, the throttle-angle sensor function scored more than 100, making it completely and utterly untestable.
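For readers who haven’t met the metric, cyclomatic complexity for a single function is, roughly, the number of decision points plus one, which is also the minimum number of test paths needed to cover it. The toy example below shows two versions of the same lookup. The gear names and percentage limits are made up purely for illustration and have nothing to do with any real engine controller.

```c
typedef enum {
    GEAR_PARK, GEAR_REVERSE, GEAR_NEUTRAL, GEAR_DRIVE, GEAR_COUNT
} gear_t;

/* Version A: one `if` per case.
 * Four decision points, so the cyclomatic complexity is 5. */
int throttle_limit_branchy(unsigned gear)
{
    if (gear == (unsigned)GEAR_PARK)    { return 0; }
    if (gear == (unsigned)GEAR_REVERSE) { return 20; }
    if (gear == (unsigned)GEAR_NEUTRAL) { return 0; }
    if (gear == (unsigned)GEAR_DRIVE)   { return 100; }
    return 0; /* unknown gear: fail closed */
}

/* Version B: the same mapping as data.
 * One decision point (the range check), so the complexity is 2,
 * and it stays 2 no matter how many rows the table gains. */
static const int throttle_limit_by_gear[GEAR_COUNT] = { 0, 20, 0, 100 };

int throttle_limit_table(unsigned gear)
{
    if (gear >= (unsigned)GEAR_COUNT) {
        return 0; /* unknown gear: fail closed */
    }
    return throttle_limit_by_gear[gear];
}
```

Version A needs five test cases to exercise every path; Version B needs two. Scale that up to a function that scores over 100 and it becomes clear why nobody can claim to have tested it thoroughly.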

Although the Toyota system technically had watchdog timers, they were trivially simple fail-safes in name only. The list goes on and on, but it’s a familiar litany for anyone working in software development. We know better, we’re embarrassed by it, but we do it anyway. Right up until we get caught, and Toyota’s programmers got caught. And people died.

All the basics were there. As far as the legal and code experts could determine, the engine-control system would have worked if more of the safety, reliability, and code-quality features had been observed. And, obviously, the car does work most of the time. It’s not noticeably faulty code. And that’s the problem: it appears to work, even after millions of hours of real-world testing. But those lurking bugs are always there, allowed to creep in through cavalier attitudes about code hygiene, software rules, standards, and testing. Other, more conscientious developers did the hard work of creating MISRA-C, OSEK, and good coding practices. All we have to do is actually follow the rules. 

7 thoughts on “Cars, Coding, and Carelessness”

  1. “I think you will find it is more complex than that”
    Interestingly, at least two of the accidents with Toyota involved very senior drivers, and it could not be proved that they had not stamped on the wrong pedal.
    The code was sloppy, but the expert witness was not able to prove that it caused the accident, just that it might have done.
    Toyota appears to have taken an economic decision to pay up rather than appeal.
    Are courts the right venue for disentangling these events? I don’t think so, and tomorrow I will be discussing this – so watch this space.
