
Bright Copper Kettles and Warm Woolen Mittens

More Favorite Things About Embedded x86 Programming

I’m atoning for my sins.

Following my beat-down on Microsoft, which was gratifyingly therapeutic, I’m making up for it by being nice to Intel and AMD, long-time associates of the Great Beast of Redmond.

A lot of embedded programmers shun x86 chips because they equate them with Windows. “Windows and x86” are like Facebook and Zuckerberg, Ferrari and Italy, Wall Street and corruption. They just go together. But they don’t have to. It should be obvious, but x86 chips don’t always run Windows, or any other mainstream desktop operating system. There are plenty of RTOS and real-time kernel choices, too.

When I first started programming x86 chips for embedded systems, the first thing I had to wrap my head around was memory segmentation. After that, I started to get the hang of task management. You see, x86 chips (starting with the ’386) have built-in hardware task management, kind of like a silicon RTOS. It’s not a full operating system by any means, but it’s a remarkably advanced and useful feature for managing your own tasks. I’d never seen anything like this before in a microprocessor, and I wound up using it instead of a commercial kernel – all for free.

Here’s how it works: As a programmer, you get to define different portions of your code as “tasks.” A task can be anything you want it to be – there’s no fixed definition – but a logical approach is to treat independent sections of code as tasks: library functions, big subroutines, control loops, or anything else that seems more-or-less self-contained.

Once you’ve figured out what and where your tasks are, you tell the chip. That is, you specify what code, data, and stack segments the task can use, plus the contents of all the on-chip registers. This information all goes into a 104-byte structure called a Task State Segment (TSS). The TSS is basically an image of the machine state – the same register snapshot a software kernel would push onto a task’s stack: it’s what you want the machine to look like when the chip switches to that task.
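
For the curious, here’s roughly what that 104-byte structure looks like sketched out in C. The layout is the standard 32-bit TSS; the field names, the selector values, and the little init routine at the bottom are my own inventions for the example.

```c
#include <stdint.h>

/* The 32-bit Task State Segment: 104 bytes. Every field occupies a full
   32-bit slot; segment selectors use only the low 16 bits of theirs. */
struct tss {
    uint32_t prev_task_link;   /* selector of the task that switched to us */
    uint32_t esp0, ss0;        /* ring-0 stack, used on privilege changes  */
    uint32_t esp1, ss1;
    uint32_t esp2, ss2;
    uint32_t cr3;              /* page-directory base for this task        */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx;
    uint32_t esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt_selector;
    uint16_t trap;             /* bit 0: raise a debug exception on switch */
    uint16_t iomap_base;       /* offset of the I/O permission bitmap      */
} __attribute__((packed));

_Static_assert(sizeof(struct tss) == 104, "a 32-bit TSS is 104 bytes");

/* Hypothetical flat-model code and data selectors from the GDT. */
#define KERNEL_CODE_SEL 0x08
#define KERNEL_DATA_SEL 0x10

static uint8_t task_stack[4096];   /* this task's private stack */
struct tss task_tss;

/* Fill in a TSS so that, when the CPU switches to it, execution begins
   at entry() on the task's own stack with interrupts enabled. */
void init_task_tss(void (*entry)(void))
{
    task_tss.eip    = (uint32_t)entry;
    task_tss.esp    = (uint32_t)(task_stack + sizeof task_stack);
    task_tss.eflags = 0x202;                   /* IF=1, reserved bit 1 set */
    task_tss.cs     = KERNEL_CODE_SEL;
    task_tss.ds = task_tss.es = task_tss.ss =
    task_tss.fs = task_tss.gs = KERNEL_DATA_SEL;
    task_tss.iomap_base = sizeof(struct tss);  /* no I/O permission bitmap */
}
```

Each TSS also needs a descriptor in the GDT, because the processor refers to tasks by selector rather than by raw address.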

Each task gets its own TSS, and presumably each TSS is a bit different. Different code segments, different data registers, different stack pointer, and so on. In a typical operating system’s task manager, these would all be separate software stacks. On an x86 chip, they’re separate TSS structures in memory.

Once you’ve initialized all your TSS structures, you just point the processor at the first one and say, go! The x86 chip will slurp up all the register and state information and start executing where the TSS tells it to go. And then it keeps going, and going, and going…
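
What does “point the processor at the first one and say go” look like in practice? Roughly this, assuming a flat 32-bit GCC build and two TSS descriptors already installed in the GDT (the selector values here are invented):

```c
#include <stdint.h>

/* Hypothetical GDT selectors for two TSS descriptors. */
#define BOOT_TSS_SEL  0x28   /* TSS describing the code running right now */
#define TASK1_TSS_SEL 0x30   /* TSS of the first task we want to start    */

void start_first_task(void)
{
    /* Give the current thread of execution a TSS of its own, so the CPU
       has somewhere to dump its state when we switch away. */
    asm volatile("ltr %0" : : "r"((uint16_t)BOOT_TSS_SEL));

    /* A far jump whose selector names a TSS descriptor triggers the
       hardware task switch; the 32-bit offset is simply ignored. */
    struct { uint32_t offset; uint16_t selector; } __attribute__((packed))
        target = { 0, TASK1_TSS_SEL };
    asm volatile("ljmp *%0" : : "m"(target) : "memory");
}
```

The ltr instruction only tells the chip which TSS describes the code that is currently running; it’s the far jump to the other selector that actually pulls the trigger.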

Starting the first task is only the beginning. What causes tasks to change? Ah, that’s up to you. Task switches can be caused by interrupts, faults, time slices, or whatever you want, but you have to make it happen. The processor doesn’t have a hard-and-fast rule about when and how to change tasks. Only a mechanism for allowing you to do so.

Let’s say you want a dead-simple arrangement that just toggles back and forth between two tasks, and you use a timer interrupt as the trigger. As usual, the CPU jumps to the interrupt handler, but here’s where it gets clever. Instead of pointing to a normal interrupt-service routine (ISR), you direct the interrupt at a TSS, through a special interrupt-descriptor-table entry called a task gate. The processor will recognize the target as a task structure and start the whole process of task-switching. All of the chip’s current state will be stored into the current TSS, and an all-new state will be loaded from the incoming TSS. Virtually every register in the chip will change, all in one fell swoop. The incoming task will be reawakened in exactly the condition in which it was left the last time it ran. In fact, the incoming task won’t even know that it was ever asleep. If you want to pass information between tasks, you’ll have to work hard at it. Task switching is designed specifically to keep tasks as independent and separate as possible.
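
A task gate is just an IDT entry that carries a TSS selector instead of a handler address. Here’s a hedged sketch, with an invented timer vector and selector value:

```c
#include <stdint.h>

/* An IDT task gate: no handler address, just the selector of the TSS
   to switch to. The offset fields of the entry go unused. */
struct task_gate {
    uint16_t reserved0;
    uint16_t tss_selector;   /* GDT selector of the destination TSS */
    uint8_t  reserved1;
    uint8_t  type;           /* 0x85 = present, DPL 0, task gate     */
    uint16_t reserved2;
} __attribute__((packed));

/* A full-size IDT of 256 eight-byte entries (real tables mix gate types;
   this sketch pretends they are all task gates for simplicity). */
static struct task_gate idt[256];

/* Hypothetical: the timer interrupt remapped to vector 0x20, and the
   other task's TSS sitting at GDT selector 0x30. */
#define TIMER_VECTOR  0x20
#define TASK2_TSS_SEL 0x30

static void set_task_gate(int vector, uint16_t tss_sel)
{
    idt[vector].tss_selector = tss_sel;
    idt[vector].type = 0x85;   /* P=1, DPL=0, type 0101b (task gate) */
}

void install_timer_task_switch(void)
{
    set_task_gate(TIMER_VECTOR, TASK2_TSS_SEL);
    /* Loading the table with lidt is assumed to happen elsewhere. */
}
```

Exception vectors get the same treatment; pointing the double-fault vector at its own task is the classic example.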

On the next timer interrupt, the current task state is dumped into the TSS, the previous task state is loaded, and the chip is back to where it started before the initial task switch. Again, the incoming task won’t know that it was ever suspended; it just picks up where it left off.

In a less trivial system, you’d probably want lots of separate tasks, and some sort of arbiter to decide which tasks get awakened when. But the fundamental process is still the same. An interrupt or hardware event that would normally trigger an ISR instead jumps to a whole new task. This is a slick way to handle system faults, for example, or other important functions. For one, it has the advantage of starting the fault handler in its own separate state, untainted by the code that caused the fault. It also avoids contaminating the evidence when you’re trying to find out why the faulty code crashed in the first place. A swift task switch will leave the guilty code in exactly the state it was in when it faulted, and start the fault handler in its own squeaky-clean state.

Instead of a round-robin task handler, you can also have one task directly switch to another. That is, one task can deliberately put itself to sleep and awaken another task in its place. This is sort of a “super jump” to another piece of code, with the side effect of swapping the entire processor state instead of just a few registers. As before, it’s hard for the outgoing task to pass any parameters to the incoming task, but sometimes that’s what you want. The only restriction here is that you must “unwind” the call chain, because tasks are not reentrant and task switches cannot be recursive.
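
In code, a voluntary switch is nothing more than a far jump whose selector names the destination TSS. A minimal sketch, again assuming a 32-bit GCC build, with the destination selector supplied by the caller:

```c
#include <stdint.h>

/* Deliberately put the current task to sleep and wake the task whose
   TSS descriptor lives at tss_selector in the GDT. */
void yield_to(uint16_t tss_selector)
{
    struct { uint32_t offset; uint16_t selector; } __attribute__((packed))
        target = { 0, tss_selector };
    asm volatile("ljmp *%0" : : "m"(target) : "memory");
    /* Execution resumes here only when some other task switches back. */
}
```

A far call to a TSS works too, but it nests the tasks: the hardware sets the NT flag, records a back-link to the caller, and expects the called task to return with iret. That back-link, plus the busy bit the processor keeps in each TSS descriptor, is what makes task switches non-reentrant.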

One of the nice things about using tasks is that you never have to worry about saving or restoring registers. The task switch itself handles all of that, so you don’t need to push or pop parameters or preserve anything. The housekeeping has been done for you, and your subroutines are all set up before the code ever starts to run. All the standard preamble and cleanup code that C compilers insert at the beginnings and ends of functions is superfluous.

Once I got the hang of task switching, I became a big fan. My own debugger used this feature heavily, in part because it automatically preserved the state of a program exactly at the point of failure. I didn’t have to worry about preserving all that state information or try to recreate the conditions of the failure. My postmortem analysis could begin just by looking at the outgoing TSS.

The other benefit was that I got a free operating system. Well, sort of. My first round-robin task scheduler took up a whopping 28 bytes of code. All the rest was handled in hardware. Later on, I made the scheduler itself a task, and the code was still smaller than the TSS that defined it. Bonus!
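
The original 28 bytes were assembly, naturally, but the idea fits in a few lines of C: a scheduler task that loops over a table of TSS selectors and far-jumps to each in turn. The selectors below are invented, and the sketch assumes the workers hand control back cooperatively by jumping to the scheduler’s own TSS; a preemptive, timer-driven version needs a little extra care with the busy bit in each TSS descriptor.

```c
#include <stdint.h>

/* Hypothetical GDT selectors for the worker tasks' TSSs. */
static const uint16_t run_queue[] = { 0x30, 0x38, 0x40 };
#define NUM_TASKS (sizeof run_queue / sizeof run_queue[0])

/* The scheduler is itself a task. Each far jump suspends it right here
   and wakes the selected worker; when a worker jumps back to the
   scheduler's TSS, the hardware restores this state and the loop
   simply picks the next entry. */
void scheduler_task(void)
{
    for (unsigned i = 0; ; i = (i + 1) % NUM_TASKS) {
        struct { uint32_t offset; uint16_t selector; } __attribute__((packed))
            next = { 0, run_queue[i] };
        asm volatile("ljmp *%0" : : "m"(next) : "memory");
    }
}
```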

