feature article
Subscribe Now

Do Instruction Sets Matter?

As Assembly-Language Coding Dies, CPUs Become Interchangeable

Nerd trivia: Name the microprocessor with the “Old McDonald Instruction,” EIEIO.

Give up? Or more importantly, did you even understand the question?

There was a time—perhaps lost in the previous millennium—when embedded programmers could name the assembly-language mnemonics of just about any microprocessor. What’s more, they had strong opinions about which ones were best. “My MOV AH, AL is way better than your dumb chip’s MOV R3, R4 because x86 sets CF and ZF automatically.”

If that sounds Greek to you, congratulations: you’re an evolved programmer. But haggling over the finer points of assembly-language instructions and their side effects used to be an intrinsic part of embedded programming. Good embedded programming, anyway. It was important to know, for example, whether your chip modified the carry, overflow, or zero flags during every MOV operation or only during arithmetic instructions.

Those days are pretty much gone, though. Effective programmers now use higher-level languages like C, Java, or LabView and leave the assembly-language stuff to the compiler. In a series of surveys conducted over a number of years, the number of assembly-language programmers steadily declined, while the number of C and C++ programmers increased. And the rate of assembly’s decline was about equal to the rate of C’s increase, suggesting that old assembly-language programmers are switching languages. Either that, or they’re dying off and being replaced by younger C programmers.

Either way, assembly-language programming is a disappearing art, like flipping front-panel DIP switches. Nobody mourns the demise of binary “programming” by toggling switches (at least, I hope not), so it should be no big deal that we’re moving away from mnemonics, green bar printouts, and cycle counting.

And yet…

It seems like good, effective embedded programming should include some measure of assembly-language principles. Understanding the low-level workings of your chip helps you create better, more efficient, and more reliable code. As least, I think so. Even if you don’t write much assembly code, you should be able to read and understand it, so that when your development system pukes and the debugger spits out a trace you can tell what you’re looking at. Maybe that old carry flag got changed by a rogue arithmetic instruction.

But let’s leave programming practice aside for now. Do instruction sets matter after the product is finished? That is, after the engineering is done, the widget is in mass production, and the customers are happy. Does the underlying instruction set matter then?

Probably not, unless your gizmo requires aftermarket software. If you’re never going to update the code or add new applications, it probably doesn’t matter whether your embedded system is based on an 8051, a Core 2 Duo, or a wound-up rubber band. Nobody’s going to see the software and nothing else needs to interface with it. In that case, it’s a truly embedded system: a black box.

But what about systems that do get aftermarket upgrades or updates? Does the underlying CPU instruction set matter then? It does, inasmuch as your software updates need to be distributed in the correct binary format. You obviously can’t run 8051 binaries on an x86 chip, for instance. But that doesn’t necessarily affect your development. Even if the original product was developed using assembly-language programming tools, there’s no reason you can’t write the updates in C and compile them down to 8051 binaries. The instruction set matters, but only for compatibility.

Instruction sets never die; they don’t even fade away. Old chips like the 8051 or 6805 still have good software support. Even the original 4004 has multiple compilers and debuggers, though it’s hard to see why you’d want them. So there’s little chance of your instruction set being orphaned or abandoned over time. Once it’s in, it’s in for life.

Some developers try to hide the instruction set through virtualization. Java, C#, and other “virtual runtime environments” endeavor to mask the underlying hardware, replacing it with an imaginary CPU and an imaginary instruction set. The idea here is to create generic hardware that can run generic software. Or more accurately, to create generic software that can run on any generic hardware without porting or recompiling. There are plenty of problems with this, including breathtakingly poor performance. Virtualization also requires a lot of memory and a lot of CPU horsepower, two things that embedded systems typically do not have in abundance. It can be made to work, but as Mark Twain is reputed to have observed about forcing a pig to sing, “it frustrates you and annoys the pig.”

Do instruction sets affect performance? You betcha. Different processors really do behave differently on the same software workload, so choosing the right processor can make a big difference to your code’s performance. But “right” in this case may not mean the fastest chip, or even the one with the best price/performance tradeoff. The best chip (as opposed to the fastest one) may be the one that has the best software support, compiler, or operating systems. Yes, Ferraris are faster than VWs, but the Vee-Dub is a lot easier to get parts for. 

Do engineers evaluate instruction sets when they choose a CPU for their next project? Hardly ever. For starters, most programmers just don’t care. They’ll most likely be programming in C, so the instruction set is the compiler’s problem, not theirs. Besides, what would they be looking for? How would you evaluate one instruction set versus another? Do you look for an EIEIO instruction? With no clear criteria for judging what’s good or bad, there’s not much point in carrying out the examination. Don’t look a gift horse in the mouth if you don’t know much about horses.

Eventually, gradually, ultimately, we’ll get to the point where CPU instruction sets are a mysterious and irrelevant technical detail, like the source of the rubber in your car’s tires. Somebody somewhere probably knows, but the rest of us won’t care. We’ll develop our code in blissful ignorance of the machinations below the surface. We’ll deal in higher levels of abstraction and we’ll be more productive because of it. That’s as it should be. That’s progress. But I’ll still miss the old assembly.

By the way, the EIEIO instruction appeared on most early PowerPC chips. The mnemonic stood for Enforce In-Order Execution of Input/Output and basically flushed the chip’s load/store queue before performing external I/O reads or writes, with the purpose of avoiding stale data. And if none of that made sense, well, then one of us is in the wrong business. 

14 thoughts on “Do Instruction Sets Matter?”

  1. There are times when Assembly Language is a life saver. An project’s software is running at half the speed you need, some assembler in critical routines can make all the difference. Just hang onto the “C” (or other language) for the next version with a faster CPU.

    Peter.

    PS. Yes the PowerPC, I’m sure the mnemonic designer chose eieio for it’s tunefulness.

  2. The title of this article is almost shocking (at least if you have been in the embedded business for 25 years). Of course it matters. Some asm reads like a book (and are efficient), some read like a riddle (and you don’t know what’s gone happen).
    Compilers are assumed to have made this obsolete. But this assumes that the (C) compiler can easily be mapped to the IS. Then comes HW point optimisations like long pipelines and all kind of weird instructions or strange addressing schemes (remember the 56K DSP?).
    Now, while it makes sense (for portability) to stick to C as much as possible, a few inlines or ASM macros can easily make 20 to 50% difference in code size and performance. But that only works when you know what the compiler and the ASM are doing.
    Just remember that most HW designers have a clue about the software. Why did an ATM packet have 53 bytes? Because for the HW it doesn’t matter. But in SW, you need to align things all the time.

  3. Low level language is what basically any hardware can understand. High level language such as C or C++ ultimately converts the code into Assembly instructions which the hardware can understand. Tomorrow if there is a hardware that is designed to understand and act on high level languages directly, then no need for any more compilers to translate into instructions just to check the language syntax only is needed. Till that time this dependency can not be avoided. It is correct what Jim was trying to mention in this article and in my view the current limitation is totally because of the hardware design philosophy.

  4. Pingback: Music Xray Scam
  5. Pingback: GVK Biosciences
  6. Pingback: trader werden
  7. Pingback: Political Diyala
  8. Pingback: Best DMPK Studies
  9. Pingback: iraqi coehuman
  10. Pingback: satta matka

Leave a Reply

featured blogs
Apr 20, 2018
https://youtu.be/h5Hs6zxcALc Coming from RSA Conference, Moscone West, San Francisco (camera Jessamine McLellan) Monday: TSMC Technology Symposium Preview Tuesday: Linley: Training in the Datacenter, Inference at the Edge Wednesday: RSA Cryptographers' Panel Thursday: Qu...
Apr 19, 2018
COBO, Silicon Photonics Kevin Burt, Application Engineer at the Samtec Optical Group, walks us through a demonstration using silicon photonics and the COBO (Consortium of Onboard Optics) form factor.  This was displayed at OFC 2018 in San Diego, California. Samtec is a fable...