
Bright Copper Kettles and Warm Woolen Mittens

More Favorite Things About Embedded x86 Programming

I’m atoning for my sins.

Following my beat-down on Microsoft, which was gratifyingly therapeutic, I’m making up for it by being nice to Intel and AMD, long-time associates of the Great Beast of Redmond.

A lot of embedded programmers shun x86 chips because they equate them with Windows. Windows and x86 are like Facebook and Zuckerberg, Ferrari and Italy, Wall Street and corruption. They just go together. But they don’t have to. It should be obvious, but x86 chips don’t always run Windows, or any other mainstream desktop operating system. There are plenty of RTOS and real-time kernel choices, too.

When I first started programming x86 chips for embedded systems, the first thing I had to wrap my head around was the memory segmentation. After that, I started to get the hang of task management. You see, x86 chips (starting with the ’286, and in 32-bit form with the ’386) have built-in hardware task management, kind of like a silicon RTOS. It’s not a full operating system by any means, but it’s a remarkably advanced and useful feature for managing your own tasks. I’d never seen anything like this before in a microprocessor, and I wound up using it instead of a commercial kernel – all for free.

Here’s how it works: As a programmer, you get to define different portions of your code as “tasks.” A task can be anything you want it to be – there’s no fixed definition – but a logical way would be to identify as tasks independent sections of code such as library functions, big subroutines, control loops, or anything else that seems more-or-less self-contained.

Once you’ve figured out what and where your tasks are, you tell the chip. That is, you specify what code, data, and stack segments the task can use, plus the contents of all the on-chip registers. This information all goes into a 104-byte structure called a Task State Segment (TSS). The TSS is basically a snapshot of the processor’s registers: it’s what you want the machine state to be when it switches to that task.
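To make that 104-byte figure concrete, here is a sketch of the 32-bit TSS as a C structure. The field names are made up for illustration (they are not from the article or any particular header file), but the order and sizes follow the processor's documented layout, and the compile-time check confirms the total.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit Task State Segment, in the order the processor expects.
 * Field names are illustrative assumptions; offsets and sizes are what matter. */
struct tss32 {
    uint16_t prev_task_link, reserved0;      /* selector of the task that called us */
    uint32_t esp0; uint16_t ss0, reserved1;  /* stack for privilege level 0 */
    uint32_t esp1; uint16_t ss1, reserved2;  /* stack for privilege level 1 */
    uint32_t esp2; uint16_t ss2, reserved3;  /* stack for privilege level 2 */
    uint32_t cr3;                            /* page-directory base */
    uint32_t eip;                            /* where the task resumes */
    uint32_t eflags;
    uint32_t eax, ecx, edx, ebx;             /* general-purpose registers */
    uint32_t esp, ebp, esi, edi;
    uint16_t es, reserved4;                  /* segment selectors */
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt_selector, reserved10;
    uint16_t debug_trap;                     /* bit 0: trap on task switch */
    uint16_t io_map_base;                    /* offset of the I/O permission map */
};

/* Refuse to compile if padding ever changes the size. */
_Static_assert(sizeof(struct tss32) == 104, "TSS must be exactly 104 bytes");
```

Every member is naturally aligned, so a typical compiler adds no padding and the structure can be filled in with ordinary assignments before the task is ever started.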

Each task gets its own TSS, and presumably each TSS is a bit different. Different code segments, different data registers, different stack pointer, and so on. In a typical operating system’s task manager, these would all be separate software stacks. On an x86 chip, they’re separate TSS structures in memory.

Once you’ve initialized all your TSS structures, you just point the processor at the first one and say, go! The x86 chip will slurp up all the register and state information and start executing where the TSS tells it to go. And then it keeps going, and going, and going…
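"Pointing the processor" at a TSS is done through a descriptor in the global descriptor table: an 8-byte entry holding the structure's address and size. The sketch below packs one in C. The function and macro names are assumptions for illustration, and the privileged step of actually loading the selector (the LTR instruction, followed by a jump) is left out because it cannot run outside ring 0.

```c
#include <stdint.h>

#define DESC_PRESENT          0x80u  /* descriptor is valid */
#define TSS_TYPE_AVAILABLE_32 0x09u  /* 32-bit TSS that is not currently busy */

/* Pack the 8-byte descriptor that tells the CPU where a TSS lives.
 * 'limit' is the size of the TSS minus one (103 for a bare 104-byte TSS). */
uint64_t make_tss_descriptor(uint32_t base, uint32_t limit)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);                             /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;                      /* base 23:0   */
    d |= (uint64_t)(DESC_PRESENT | TSS_TYPE_AVAILABLE_32) << 40;  /* access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;                  /* limit 19:16 */
    d |= (uint64_t)(base >> 24) << 56;                            /* base 31:24  */
    return d;
}
```

For example, a TSS at address 0x12345678 with a limit of 103 packs to 0x1200893456780067.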

Starting the first task is only the beginning. What causes tasks to change? Ah, that’s up to you. Task switches can be caused by interrupts, faults, time slices, or whatever you want, but you have to make it happen. The processor doesn’t have a hard-and-fast rule about when and how to change tasks. Only a mechanism for allowing you to do so.

Let’s say you want a dead-simple arrangement that just toggles back and forth between two tasks, and you use a timer interrupt as the trigger. As usual, the CPU looks up the interrupt vector, but here’s where it gets clever. Instead of pointing the vector at a normal interrupt-service routine (ISR), you direct it to a TSS, by way of a task gate in the interrupt descriptor table. The processor will recognize the target as a task structure and start the whole process of task-switching. All of the chip’s current state will be stored into the current TSS, and an all-new state will be loaded from the incoming TSS. Virtually every register in the chip will change, all in one fell swoop. The incoming task will be reawakened in exactly the condition in which it was left the last time it ran. In fact, the incoming task won’t even know that it was ever asleep. If you want to pass information between tasks, you’ll have to work hard at it. Task switching is designed specifically to keep tasks as independent and separate as possible.
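The interrupt-table entry that redirects a vector to a task is called a task gate. A minimal sketch of building one in C follows; the function name and parameters are assumptions, and installing the entry in a live interrupt table is not shown.

```c
#include <stdint.h>

/* Build an 8-byte task-gate entry for the interrupt descriptor table.
 * 'tss_selector' is the GDT selector of the task to switch to;
 * 'dpl' is the privilege level allowed to invoke the gate (0..3). */
uint64_t make_task_gate(uint16_t tss_selector, unsigned dpl)
{
    /* present bit | privilege level | gate type 5 (task gate) */
    uint64_t access = 0x80u | ((dpl & 3u) << 5) | 0x05u;
    return ((uint64_t)tss_selector << 16) | (access << 40);
}
```

Unlike an interrupt gate, the entry carries no code offset at all. The entry point comes from the EIP saved in the target TSS, which is why the incoming task resumes wherever it last stopped.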

On the next timer interrupt, the current task state is dumped into its own TSS, the previous task state is loaded, and the chip is back to where it started before the initial task switch. Again, the incoming task won’t know that it was ever suspended; it just picks up where it left off.
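The save-then-load sequence can be modeled in a few lines of ordinary C. This is a software simulation for illustration only: the register set is cut down to three fields, and every name here is hypothetical. On real hardware the processor does all of this itself.

```c
#include <stdint.h>

/* Drastically reduced stand-in for the register image held in a TSS. */
struct regs { uint32_t eip, esp, eax; };

struct cpu {
    struct regs  live;     /* registers currently "in the chip" */
    struct regs *current;  /* save area of the running task */
};

/* Model of one task switch: store the outgoing state, load the incoming one. */
void task_switch(struct cpu *cpu, struct regs *incoming)
{
    *cpu->current = cpu->live;   /* outgoing task frozen exactly as it was */
    cpu->live     = *incoming;   /* incoming task resumes where it left off */
    cpu->current  = incoming;
}
```

Calling `task_switch` twice, first toward task B and then back toward task A, leaves A's registers exactly as they were at the first call, which is the "won't know it was ever suspended" behavior described above.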

In a less trivial system, you’d probably want lots of separate tasks, and some sort of arbiter to decide which tasks get awakened when. But the fundamental process is still the same. An interrupt or hardware event that would normally trigger an ISR instead jumps to a whole new task. This is a slick way to handle system faults, for example, or other important functions. For one, it has the advantage of starting the fault handler in its own separate state, untainted by the code that caused the fault. It also avoids contaminating the evidence when you’re trying to find out why the faulty code crashed in the first place. A swift task switch will leave the guilty code in exactly the state it was in when it faulted, and start the fault handler in its own, squeaky clean, state.

Instead of a round-robin task handler, you can also have one task directly switch to another. That is, one task can deliberately put itself to sleep and awaken another task in its place. This is sort of a “super jump” to another piece of code, with the side effect of swapping the entire processor state instead of just a few registers. As before, it’s hard for the outgoing task to pass any parameters to the incoming task, but sometimes that’s what you want. The only restriction here is that you must “unwind” the call chain, because tasks are not reentrant and task switches cannot be recursive.
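The no-recursion rule is enforced by a "busy" bit in each TSS descriptor. The sketch below models that bookkeeping in C, assuming descriptors are held as 64-bit values; the function names are invented for illustration, and a `false` return stands in for the general-protection fault real hardware would raise.

```c
#include <stdbool.h>
#include <stdint.h>

#define TSS_BUSY_BIT (1ULL << 41)  /* bit 1 of the descriptor's type field */

bool tss_is_busy(uint64_t desc) { return (desc & TSS_BUSY_BIT) != 0; }

/* Model of a far JMP to a task: the outgoing task is released. */
bool jump_to_task(uint64_t *outgoing, uint64_t *incoming)
{
    if (tss_is_busy(*incoming)) return false;  /* would fault on hardware */
    *outgoing &= ~TSS_BUSY_BIT;
    *incoming |= TSS_BUSY_BIT;
    return true;
}

/* Model of a far CALL to a task: the caller stays busy until the callee returns. */
bool call_task(uint64_t *incoming)
{
    if (tss_is_busy(*incoming)) return false;  /* recursion refused */
    *incoming |= TSS_BUSY_BIT;
    return true;
}

/* Model of the IRET that ends a called task and resumes its caller. */
void return_from_task(uint64_t *finished)
{
    *finished &= ~TSS_BUSY_BIT;
}
```

Because a calling task stays marked busy, the task it called cannot switch back into it with another call. It has to return, which is the "unwinding" mentioned above.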

One of the nice things about using tasks is that you never have to worry about saving or restoring registers. The task switch itself handles all of that, so you don’t need to push or pop parameters or preserve anything. The housekeeping has been done for you, and your subroutines are all set up before the code ever starts to run. All the standard preamble and cleanup code that C compilers insert at the beginnings and ends of functions is superfluous.

Once I got the hang of task switching, I became a big fan. My own debugger used this feature heavily, in part because it automatically preserved the state of a program exactly at the point of failure. I didn’t have to worry about preserving all that state information or try to recreate the conditions of the failure. My postmortem analysis could begin just by looking at the outgoing TSS.

The other benefit was that I got a free operating system. Well, sort of. My first round-robin task scheduler took up a whopping 28 bytes of code. All the rest was handled in hardware. Later on, I made the scheduler itself a task, and the code was still smaller than the TSS that defined it. Bonus!
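A round-robin scheduler can be that small because its only job is picking the next TSS. The sketch below shows the decision in C. It is not a reconstruction of the 28-byte original, whose code the article doesn't show, and the selector table and function name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Advance to the next task in a fixed table of TSS selectors,
 * wrapping around at the end. 'count' must be nonzero. */
uint16_t next_task(const uint16_t *selectors, size_t count, size_t *current)
{
    *current = (*current + 1) % count;
    return selectors[*current];
}
```

On real hardware the returned selector would become the target of a far jump, and the processor would do the rest.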

2 thoughts on “Bright Copper Kettles and Warm Woolen Mittens”

  1. I guess this is exactly how Linus started with Linux twenty-odd years ago with his now famous ABABABA task switching test.
    Cannot imagine anyone at Intel could have foreseen the impact of that feature when they came up with it.
