feature article
Subscribe Now

Ignore Those Pesky Bugs

Software is Complicated, But How Much of it is Useful?

“We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.” – Donald Rumsfeld

Consider the humble ladder. It’s a hardware device that’s elegant in its simplicity. Two parallel side rails, with evenly spaced rungs in between. Everything you need; nothing you don’t. Ladders can be made of wood, metal, Fiberglas, or other materials. There are long ones, short ones, portable ones, and permanent ones. Nobody really needs to be taught how to use a ladder, although there are some standard safety rules that might make your tenure at the top a bit more secure. 

Now consider your computer’s word-processing software. Or the email application on your smartphone. Not so simple, are they? In fact, there are a lot of features in most programs that seem to be superfluous. Simple, elegant, just-the-basics apps seem to be the exception. They get good reviews, but people remark on them like they’re some sort of aberration. Maybe they are. “Creeping featurism” is a term that’s almost as old as engineering itself.

But that’s not my point. We don’t need to flog the poor software galley slaves chained to their oars below decks. (That’s what managers are for.) No, the question for today is, “How much of your code does real work, as opposed to just catching errors?”

First-time programmers just starting out learn to create “hello, world!” or something similar. They get the feel for what it’s like to use a programming language to describe what they want and make it into compile-worthy code. At first, they work toward getting their early programs to obey their wishes, pure and simple. Bugs will creep in, sure, but that’s all part of the learning process.

But after the first few successful attempts, we start to learn the other side of programing: the part where you shore up the program to prevent it from failing in the real world. You start to build the safety net, the guard code. You’re no longer creating a program that does what you want. You’re creating additional code that prevents it from doing what you don’t want, and that’s a different process and a different mindset altogether.

What does the program do if the user accidentally types in a bogus date? What happens if it receives a malformed network packet? How does it behave if a pointer is out of bounds? These are all aspects of the “guard code” that we all have to include, even though it doesn’t add anything to the program and it doesn’t (usually) do any useful work. Oftentimes, it’s never called or executed at all. Guard code is there just to keep the program from tipping over in case something stupid happens.

Even though ladders are pretty safe, they don’t have safety features, per se. There’s no airbag at the bottom that deploys if you fall. There are (usually) no outriggers to prevent tip-overs. Ladders don’t have built-in current sensors to prevent you from using an aluminum ladder on power lines. There are no accelerometers or klaxons to alert you to unstable working angles. I once saw a ladder with a built-in bubble level to help you eyeball the slope, but that’s about it.

Programming isn’t like that. We actually spend a lot of our time adding safety features to our software. It’s like training wheels on a bicycle, except that they never come off. The guard code is always there, ready to catch that malformed packet or that bogus date, even though it may never happen.

On top of all that, we also have to guard against malicious intent, not just dumb mistakes. What if someone deliberately tries to break our program by shrewdly exploiting some weakness in the input buffer? You’ve got to guard against attacks, not just bugs. 

And we have to add security features. It’s harder than ever to make software hacker-proof, because the hackers keep getting wilier and craftier. There are accidental bugs, and then there are malicious assaults, and we generally can’t catch them both with the same kind of guard code. You have to consciously look for, and trap, both types: the known unknowns and the unknown unknowns.

So how much of today’s code falls into that “guard code” category, versus the amount that does the real work implementing the program’s putative purpose? Any guesses?

I would take a SWAG that guard code accounts for 75 percent of most modern programs. It’s got to be at least half. Looking at big chunks of open-source code, I see an awful lot of source code that’s there just to trap errors, mistakes, user flubs, and similar non-malicious bugs. It’s sometimes hard to see what a function is actually doing, buried under all that safety net.

I’ll bet that the guard code is also the source of most bugs, ironically. We put it in there to catch stupid errors, and then the bug-catcher itself malfunctions. I don’t have any objective data to back that up, but that’s been my own experience. The real core of the program works fine; it’s all that other stuff in there to prop it up that’s problematic.

When you’re working on a tall ladder, it’s good practice to have a spotter below you. Someone who will – maybe not catch you, exactly, but at least call 911 when you face-plant on the pavement. They’re your safety net.

If we could do something similar with coding, we might get much faster programs and fewer bugs, besides. Let the “real” program run on one processor (or one CPU core of a multicore processor), while the “guard code” runs alongside on a parallel processor. One does the real work; the other checks that nothing is going off the rails. One parses input while the other checks boundary conditions. One calculates results while the other checks the validity of the input parameters. If the sidekick detects an error, we abort the process or restart the function or ask for new data.

Easy to say but hard to do, of course. But imagine how efficient – and fast! – your software would be if you didn’t have to idiot-check every single parameter, input buffer, string, and checksum. Imagine programming the way it used to be, when your only concern was making the program do what you wanted, not second-guessing all the things that could go wrong. That’s what spotters are for. We need code-spotters. That, and multicore processors, can be our ladders. 

One thought on “Ignore Those Pesky Bugs”

  1. I think it boils down to the necessary checks to bring up a new system without data corruption and memory faulting, vs what’s necessary for safe operation after deployment.

    A sane programmer is still using lot’s of asserts at bring up … to catch the stupid internal problems. And maybe even running those into initial releases with sane transparent logging and recovery.

    After that, trapping unexpected switch defaults with sane recovery, and similar exits for other unexpected state values, even into production with sane transparent logging and recovery is very low overhead.

    In most cases, the anal data checking belongs where data enters the system … data import, and user interfaces.

    Plus a good program to regularly “lint/fsck” ALL your data for corruption is almost mandatory for any production sanity. Contrary to other less clueful view, data does rot, and will become corrupted by a strange mix of both hardware failures and software failures, at some point. The best self defence for this is checksum/hashing ALL critical data records/elements … once you know the data was correct when written, and the checksum/hash match on read, it’s not necessary to sanity check all the data fields again.

Leave a Reply

featured blogs
Apr 14, 2021
You put your design through a multitude of tools for various transformations. Going back to formal verification in between every change to rely on your simulation tools can be a rigorous approach,... [[ Click on the title to access the full blog on the Cadence Community site...
Apr 14, 2021
Hybrid Cloud architecture enables innovation in AI chip design; learn how our partnership with IBM combines the best in EDA & HPC to improve AI performance. The post Synopsys and IBM Research: Driving Real Progress in Large-Scale AI Silicon and Implementing a Hybrid Clou...
Apr 13, 2021
The human brain is very good at understanding the world around us.  An everyday example can be found when driving a car.  An experienced driver will be able to judge how large their car is, and how close they can approach an obstacle.  The driver does not need ...
Apr 12, 2021
The Semiconductor Ecosystem- It is the definition of 'High Tech', but it isn't just about… The post Calibre and the Semiconductor Ecosystem appeared first on Design with Calibre....

featured video

The Verification World We Know is About to be Revolutionized

Sponsored by Cadence Design Systems

Designs and software are growing in complexity. With verification, you need the right tool at the right time. Cadence® Palladium® Z2 emulation and Protium™ X2 prototyping dynamic duo address challenges of advanced applications from mobile to consumer and hyperscale computing. With a seamlessly integrated flow, unified debug, common interfaces, and testbench content across the systems, the dynamic duo offers rapid design migration and testing from emulation to prototyping. See them in action.

Click here for more information

featured paper

From Chips to Ships, Solve Them All With HFSS

Sponsored by Ansys

There are virtually no limits to the design challenges that can be solved with Ansys HFSS and the new HFSS Mesh Fusion technology! Check out this blog to know what the latest innovation in HFSS 2021 can do for you.

Click here to read the blog post

featured chalk talk

UWB: Because Location Matters

Sponsored by Mouser Electronics and Qorvo

While technologies like GPS, WiFi, and Bluetooth all offer various types of location services, none of them are well-suited to providing accurate, indoor/outdoor, low-power, real-time, 3D location data for edge and endpoint devices. In this episode of Chalk Talk, Amelia Dalton chats with Mickael Viot from Qorvo about ultra-wideband (UWB) technology, and how it can revolutionize a wide range of applications.

Click here for more information about Qorvo Ultra-Wideband (UWB) Technology