feature article

Ignore Those Pesky Bugs

Software is Complicated, But How Much of it is Useful?

“We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.” – Donald Rumsfeld

Consider the humble ladder. It’s a hardware device that’s elegant in its simplicity. Two parallel side rails, with evenly spaced rungs in between. Everything you need; nothing you don’t. Ladders can be made of wood, metal, fiberglass, or other materials. There are long ones, short ones, portable ones, and permanent ones. Nobody really needs to be taught how to use a ladder, although there are some standard safety rules that might make your tenure at the top a bit more secure.

Now consider your computer’s word-processing software. Or the email application on your smartphone. Not so simple, are they? In fact, there are a lot of features in most programs that seem to be superfluous. Simple, elegant, just-the-basics apps seem to be the exception. They get good reviews, but people remark on them like they’re some sort of aberration. Maybe they are. “Creeping featurism” is a term that’s almost as old as engineering itself.

But that’s not my point. We don’t need to flog the poor software galley slaves chained to their oars below decks. (That’s what managers are for.) No, the question for today is, “How much of your code does real work, as opposed to just catching errors?”

First-time programmers learn to create “hello, world!” or something similar. They get a feel for what it’s like to use a programming language to describe what they want and turn it into compile-worthy code. At first, they work toward getting their early programs to obey their wishes, pure and simple. Bugs will creep in, sure, but that’s all part of the learning process.

But after the first few successful attempts, we start to learn the other side of programming: the part where you shore up the program to prevent it from failing in the real world. You start to build the safety net, the guard code. You’re no longer creating a program that does what you want. You’re creating additional code that prevents it from doing what you don’t want, and that’s a different process and a different mindset altogether.

What does the program do if the user accidentally types in a bogus date? What happens if it receives a malformed network packet? How does it behave if a pointer is out of bounds? These are all aspects of the “guard code” that we all have to include, even though it doesn’t add anything to the program and it doesn’t (usually) do any useful work. Oftentimes, it’s never called or executed at all. Guard code is there just to keep the program from tipping over in case something stupid happens.
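To make the ratio concrete, here’s a minimal sketch in Python of what guarding against that bogus date might look like; `parse_date` is a hypothetical helper, not from any particular program:

```python
import datetime

def parse_date(text):
    """Guard code: reject a bogus date instead of letting it poison the core logic."""
    try:
        # The "real work" is this one line; everything around it is the safety net.
        return datetime.date.fromisoformat(text)
    except (TypeError, ValueError):
        return None  # signal the caller to recover, re-prompt, or abort
```

Note the ratio: one line does the useful work, and the rest exists only for the day somebody types in February 30th.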

Even though ladders are pretty safe, they don’t have safety features, per se. There’s no airbag at the bottom that deploys if you fall. There are (usually) no outriggers to prevent tip-overs. Ladders don’t have built-in current sensors to prevent you from using an aluminum ladder on power lines. There are no accelerometers or klaxons to alert you to unstable working angles. I once saw a ladder with a built-in bubble level to help you eyeball the slope, but that’s about it.

Programming isn’t like that. We actually spend a lot of our time adding safety features to our software. It’s like training wheels on a bicycle, except that they never come off. The guard code is always there, ready to catch that malformed packet or that bogus date, even though it may never happen.

On top of all that, we also have to guard against malicious intent, not just dumb mistakes. What if someone deliberately tries to break our program by shrewdly exploiting some weakness in the input buffer? You’ve got to guard against attacks, not just bugs. 

And we have to add security features. It’s harder than ever to make software hacker-proof, because the hackers keep getting wilier and craftier. There are accidental bugs, and then there are malicious assaults, and we generally can’t catch them both with the same kind of guard code. You have to consciously look for, and trap, both types: the known unknowns and the unknown unknowns.
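As a sketch of that double duty, assume a toy packet format with a two-byte length header; the same guard code that catches a truncated packet (an accident) also refuses a deliberately inflated length field (a classic attack):

```python
def parse_packet(buf: bytes):
    """Never trust a length field: it may be mangled by accident or crafted on purpose."""
    if len(buf) < 2:
        return None                        # truncated header: probably a bug
    claimed = int.from_bytes(buf[:2], "big")
    payload = buf[2:]
    if claimed != len(payload):
        return None                        # inflated length field: possibly an attack
    return payload
```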

So how much of today’s code falls into that “guard code” category, versus the amount that does the real work implementing the program’s putative purpose? Any guesses?

I would take a SWAG that guard code accounts for 75 percent of most modern programs. It’s got to be at least half. Looking at big chunks of open-source code, I see an awful lot of source code that’s there just to trap errors, mistakes, user flubs, and similar non-malicious bugs. It’s sometimes hard to see what a function is actually doing, buried under all that safety net.

I’ll bet that the guard code is also the source of most bugs, ironically. We put it in there to catch stupid errors, and then the bug-catcher itself malfunctions. I don’t have any objective data to back that up, but that’s been my own experience. The real core of the program works fine; it’s all that other stuff in there to prop it up that’s problematic.

When you’re working on a tall ladder, it’s good practice to have a spotter below you. Someone who will – maybe not catch you, exactly, but at least call 911 when you face-plant on the pavement. They’re your safety net.

If we could do something similar with coding, we might get much faster programs and fewer bugs, besides. Let the “real” program run on one processor (or one CPU core of a multicore processor), while the “guard code” runs alongside on a parallel processor. One does the real work; the other checks that nothing is going off the rails. One parses input while the other checks boundary conditions. One calculates results while the other checks the validity of the input parameters. If the sidekick detects an error, we abort the process or restart the function or ask for new data.
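On today’s hardware you can sketch the idea with an ordinary worker thread standing in for the second core. The names `guarded_compute` and `validate` are hypothetical, and a real implementation would need the spotter to run truly in parallel, but the shape of the rendezvous is this:

```python
from concurrent.futures import ThreadPoolExecutor

def compute(values):
    # The "real" program: does the useful work, no idiot-checking.
    return sum(v * v for v in values)

def validate(values):
    # The spotter: runs alongside, checking boundary conditions.
    return all(isinstance(v, (int, float)) and v >= 0 for v in values)

def guarded_compute(values):
    with ThreadPoolExecutor(max_workers=1) as pool:
        check = pool.submit(validate, values)  # spotter starts on its own "core"
        result = compute(values)               # main line of work proceeds at once
        if not check.result():                 # rendezvous: abort if off the rails
            raise ValueError("spotter flagged bad input")
        return result
```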

Easy to say but hard to do, of course. But imagine how efficient – and fast! – your software would be if you didn’t have to idiot-check every single parameter, input buffer, string, and checksum. Imagine programming the way it used to be, when your only concern was making the program do what you wanted, not second-guessing all the things that could go wrong. That’s what spotters are for. We need code-spotters. That, and multicore processors, can be our ladders. 

One thought on “Ignore Those Pesky Bugs”

  1. I think it boils down to the checks necessary to bring up a new system without data corruption and memory faults, versus what’s necessary for safe operation after deployment.

    A sane programmer still uses lots of asserts at bring-up … to catch the stupid internal problems. And maybe even keeps them running into initial releases, with sane transparent logging and recovery.

    After that, trapping unexpected switch defaults, and adding similar exits for other unexpected state values, is very low overhead even in production with sane transparent logging and recovery.

    In most cases, the anal data checking belongs where data enters the system … data import, and user interfaces.

    Plus, a good program to regularly “lint/fsck” ALL your data for corruption is almost mandatory for any production sanity. Contrary to less clueful views, data does rot, and will become corrupted by a strange mix of both hardware failures and software failures at some point. The best self-defense for this is checksumming/hashing ALL critical data records/elements … once you know the data was correct when written, and the checksum/hash matches on read, it’s not necessary to sanity-check all the data fields again.
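The commenter’s seal-on-write, verify-on-read scheme can be sketched like this; `seal` and `unseal` are hypothetical names, and SHA-256 stands in for whatever hash a real system would use:

```python
import hashlib

_DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes

def seal(record: bytes) -> bytes:
    """On write: append a hash so later corruption is detectable."""
    return record + hashlib.sha256(record).digest()

def unseal(sealed: bytes):
    """On read: verify the hash; if it matches, the fields need no re-checking."""
    record, digest = sealed[:-_DIGEST_LEN], sealed[-_DIGEST_LEN:]
    if hashlib.sha256(record).digest() != digest:
        return None  # data rot detected
    return record
```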

