
Bright Copper Kettles and Warm Woolen Mittens

More Favorite Things About Embedded x86 Programming

I’m atoning for my sins.

Following my beat-down on Microsoft, which was gratifyingly therapeutic, I’m making up for it by being nice to Intel and AMD, long-time associates of the Great Beast of Redmond.

A lot of embedded programmers shun x86 chips because they equate them with Windows. Windows and x86 are like Facebook and Zuckerberg, Ferrari and Italy, Wall Street and corruption: they just go together. But they don’t have to. It should be obvious, but x86 chips don’t always run Windows, or any other mainstream desktop operating system. There are plenty of RTOS and real-time kernel choices, too.

When I started programming x86 chips for embedded systems, the first thing I had to wrap my head around was the memory segmentation. After that, I started to get the hang of task management. You see, x86 chips (starting with the ’386) have built-in hardware task management, kind of like a silicon RTOS. It’s not a full operating system by any means, but it’s a remarkably advanced and useful feature for managing your own tasks. I’d never seen anything like this before in a microprocessor, and I wound up using it instead of a commercial kernel – all for free.

Here’s how it works: As a programmer, you get to define different portions of your code as “tasks.” A task can be anything you want it to be – there’s no fixed definition – but a logical approach is to treat independent sections of code as tasks: library functions, big subroutines, control loops, or anything else that seems more or less self-contained.

Once you’ve figured out what and where your tasks are, you tell the chip. That is, you specify what code, data, and stack segments the task can use, plus the starting contents of the general-purpose registers, the instruction pointer, and the flags. This information all goes into a 104-byte structure called a Task State Segment (TSS). The TSS is basically a snapshot of the processor: it’s what you want the machine state to be when it switches to that task.
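To make the 104 bytes concrete, here is a minimal C sketch of the 32-bit TSS layout as the processor documentation describes it. The field names and the `tss_init` helper are my own illustrative choices, not anything the hardware or a particular toolchain mandates, and the helper assumes a simple setup where one data selector serves every data segment register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit Task State Segment. Only the offsets are fixed by the hardware;
   the field names here are illustrative. */
typedef struct {
    uint16_t prev_task_link, reserved0;       /* selector of the task that switched to us */
    uint32_t esp0;  uint16_t ss0, reserved1;  /* stacks used on privilege changes */
    uint32_t esp1;  uint16_t ss1, reserved2;
    uint32_t esp2;  uint16_t ss2, reserved3;
    uint32_t cr3;                             /* page directory base */
    uint32_t eip;                             /* where the task resumes */
    uint32_t eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt_selector, reserved10;
    uint16_t debug_trap;                      /* bit 0: trap on switch to this task */
    uint16_t io_map_base;                     /* offset of the I/O permission bitmap */
} tss32_t;

static_assert(sizeof(tss32_t) == 104, "a 32-bit TSS is 104 bytes");

/* Hypothetical helper: describe a task that starts at `entry` with its
   own stack, using one code selector and one data selector. */
void tss_init(tss32_t *t, uint32_t entry, uint32_t stack_top,
              uint16_t code_sel, uint16_t data_sel)
{
    memset(t, 0, sizeof *t);
    t->eip = entry;
    t->esp = stack_top;
    t->eflags = 0x202;                        /* reserved bit 1 set, interrupts enabled */
    t->cs = code_sel;
    t->ss = t->ds = t->es = t->fs = t->gs = data_sel;
    t->io_map_base = (uint16_t)sizeof *t;     /* offset at the TSS limit: no I/O bitmap */
}
```

Every field is naturally aligned, so the struct needs no packing attributes to land on exactly 104 bytes; the static assertion is there to catch a compiler that disagrees.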

Each task gets its own TSS, and presumably each TSS is a bit different. Different code segments, different data registers, different stack pointer, and so on. In a typical operating system’s task manager, these would all be separate software stacks. On an x86 chip, they’re separate TSS structures in memory.

Once you’ve initialized all your TSS structures, you just point the processor at the first one and say, go! The x86 chip will slurp up all the register and state information and start executing where the TSS tells it to go. And then it keeps going, and going, and going…
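“Pointing the processor” at a TSS goes through a descriptor table: each TSS needs an 8-byte descriptor in the GDT that records where the structure lives and how big it is. The sketch below packs such a descriptor for an available 32-bit TSS at privilege level 0. The function name is hypothetical, and the privileged instructions that actually use the descriptor (loading the task register with `LTR`, then a far jump to the TSS selector) are deliberately left out because they only run in ring 0 on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the GDT descriptor for an available 32-bit TSS.
   `base` is the linear address of the TSS; `limit` is its size minus one
   (103 for a bare 104-byte TSS). */
uint64_t make_tss_descriptor(uint32_t base, uint32_t limit)
{
    assert(limit <= 0xFFFFF);                     /* limit is a 20-bit field */
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* limit bits 15:0  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* base bits 23:0   */
    d |= (uint64_t)0x89 << 40;                    /* present, DPL 0, type 1001b */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* limit bits 19:16 */
    d |= (uint64_t)(base >> 24) << 56;            /* base bits 31:24  */
    return d;
}
```

For example, `make_tss_descriptor(0x12345678, 103)` yields `0x1200893456780067`. The type field changes from 1001b to 1011b while a task is running or suspended mid-call; the processor flips that “busy” bit itself.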

Starting the first task is only the beginning. What causes tasks to change? Ah, that’s up to you. Task switches can be caused by interrupts, faults, time slices, or whatever you want, but you have to make it happen. The processor doesn’t have a hard-and-fast rule about when and how to change tasks. Only a mechanism for allowing you to do so.

Let’s say you want a dead-simple arrangement that just toggles back and forth between two tasks, and you use a timer interrupt as the trigger. As usual, the CPU looks up the handler for that interrupt, but here’s where it gets clever. Instead of pointing the vector at a normal interrupt-service routine (ISR), you point it at a task: the vector’s entry in the interrupt table is a task gate that names a TSS. The processor will recognize the target as a task and start the whole process of task-switching. All of the chip’s current state will be stored into the outgoing task’s TSS, and an all-new state will be loaded from the incoming TSS. Virtually every register in the chip will change, all in one fell swoop. The incoming task will be reawakened in exactly the condition in which it was left the last time it ran. In fact, the incoming task won’t even know that it was ever asleep. If you want to pass information between tasks, you’ll have to work hard at it. Task switching is designed specifically to keep tasks as independent and separate as possible.
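The redirection from interrupt to task is done with a task gate: an 8-byte entry in the interrupt descriptor table that holds a TSS selector instead of a handler address. As a rough sketch, the encoding looks like this; the function name and the example selector are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build an IDT task gate that refers to the TSS
   descriptor at `tss_selector`. `dpl` is the privilege level allowed to
   invoke the gate from software. */
uint64_t make_task_gate(uint16_t tss_selector, unsigned dpl)
{
    assert(dpl <= 3);
    uint64_t g = 0;
    g |= (uint64_t)tss_selector << 16;            /* bytes 2-3: TSS selector */
    g |= (uint64_t)(0x85u | (dpl << 5)) << 40;    /* present, DPL, type 00101b */
    return g;                                     /* all other bits are reserved */
}
```

With a TSS descriptor sitting at selector `0x28`, `make_task_gate(0x28, 0)` produces `0x0000850000280000`. Note that the gate carries no offset at all: the incoming task’s entry point comes from the instruction pointer saved in its TSS.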

On the next timer interrupt, the current task’s state is dumped into its own TSS, the previous task’s state is loaded, and the chip is back to where it started before the initial task switch. Again, the incoming task won’t know that it was ever suspended; it just picks up where it left off.
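The back-and-forth is easier to see in a toy software model. The code below is not how you drive the hardware; it just mimics, in plain C, what the processor does on each switch: save the live registers into the outgoing task’s slot and load them from the incoming one. The three-register `cpu_state_t` is a deliberately tiny stand-in for the real register set, and all names are made up for the example.

```c
#include <stdint.h>

/* A tiny stand-in for the processor's register set. */
typedef struct { uint32_t eip, esp, eax; } cpu_state_t;

typedef struct {
    cpu_state_t cpu;       /* the live registers */
    cpu_state_t tss[2];    /* one saved-state slot per task */
    int current;           /* index of the running task: 0 or 1 */
} machine_t;

/* Model of one timer-driven task switch between two tasks. */
void timer_tick(machine_t *m)
{
    int next = 1 - m->current;
    m->tss[m->current] = m->cpu;   /* outgoing state goes into its own slot */
    m->cpu = m->tss[next];         /* incoming state is restored wholesale */
    m->current = next;
}
```

Whatever task 1 does to the registers while it runs, task 0 sees none of it after the next tick: its `eip`, `esp`, and `eax` come back exactly as they were saved, which is the isolation the article describes.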

In a less trivial system, you’d probably want lots of separate tasks, and some sort of arbiter to decide which tasks get awakened when. But the fundamental process is still the same. An interrupt or hardware event that would normally trigger an ISR instead jumps to a whole new task. This is a slick way to handle system faults, for example, or other important functions. For one, it has the advantage of starting the fault handler in its own separate state, untainted by the code that caused the fault. It also avoids contaminating the evidence when you’re trying to find out why the faulty code crashed in the first place. A swift task switch will leave the guilty code in exactly the state it was in when it faulted, and start the fault handler in its own, squeaky clean, state.

Instead of a round-robin task handler, you can also have one task directly switch to another. That is, one task can deliberately put itself to sleep and awaken another task in its place. This is sort of a “super jump” to another piece of code, with the side effect of swapping the entire processor state instead of just a few registers. As before, it’s hard for the outgoing task to pass any parameters to the incoming task, but sometimes that’s what you want. The only restriction here is that you must “unwind” the call chain, because tasks are not reentrant and task switches cannot be recursive.
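The no-recursion rule comes from a “busy” flag the processor keeps for each task. A simplified model, with hypothetical names and return codes, shows why the call chain has to unwind: a jump-style switch frees the outgoing task, a call-style switch leaves it busy and records a back link, and any attempt to enter a busy task is refused (on real hardware, with a general-protection fault).

```c
#include <stdbool.h>

typedef struct {
    bool busy;        /* set while the task is running or waiting on a nested task */
    int  back_link;   /* index of the task that called us, or -1 */
} task_t;

enum { SWITCH_OK = 0, SWITCH_FAULT = -1 };

/* JMP-style switch: the outgoing task is released, the incoming one is marked busy. */
int task_jmp(task_t *tasks, int *current, int target)
{
    if (tasks[target].busy) return SWITCH_FAULT;   /* includes jumping to yourself */
    tasks[*current].busy = false;
    tasks[target].busy = true;
    *current = target;
    return SWITCH_OK;
}

/* CALL-style switch: the outgoing task stays busy and the incoming one remembers it. */
int task_call(task_t *tasks, int *current, int target)
{
    if (tasks[target].busy) return SWITCH_FAULT;   /* no re-entry into the call chain */
    tasks[target].busy = true;
    tasks[target].back_link = *current;
    *current = target;
    return SWITCH_OK;
}

/* IRET-style return: release the current task and resume the one it came from. */
int task_iret(task_t *tasks, int *current)
{
    int prev = tasks[*current].back_link;
    if (prev < 0) return SWITCH_FAULT;             /* nothing to return to */
    tasks[*current].busy = false;
    tasks[*current].back_link = -1;
    *current = prev;
    return SWITCH_OK;
}
```

If task 0 calls task 1, task 1 cannot call task 0 back, because task 0 is still busy; it has to return first. This model tracks only the busy flag and the back link and ignores everything else the hardware checks during a switch.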

One of the nice things about using tasks is that you never have to worry about saving or restoring registers. The task switch itself handles all of that, so you don’t need to push or pop parameters or preserve anything. The housekeeping has been done for you, and your subroutines are all set up before the code ever starts to run. All the standard preamble and cleanup code that C compilers insert at the beginnings and ends of functions is superfluous.

Once I got the hang of task switching, I became a big fan. My own debugger used this feature heavily, in part because it automatically preserved the state of a program exactly at the point of failure. I didn’t have to worry about preserving all that state information or try to recreate the conditions of the failure. My postmortem analysis could begin just by looking at the outgoing TSS.

The other benefit was that I got a free operating system. Well, sort of. My first round-robin task scheduler took up a whopping 28 bytes of code. All the rest was handled in hardware. Later on, I made the scheduler itself a task, and the code was still smaller than the TSS that defined it. Bonus!

2 thoughts on “Bright Copper Kettles and Warm Woolen Mittens”

  1. I guess this is exactly how Linus started with Linux twenty-odd years ago with his now famous ABABABA task switching test.
    Cannot imagine anyone at Intel could have foreseen the impact of that feature when they came up with it.
