“The fault… is not in our stars, but in ourselves…” Julius Caesar, I.ii
Very little code is required to make an x86 processor handle task switching for you, but there is a certain amount of pregame setup you’ll have to do. This is mostly a matter of plugging values into memory so that everything is just right before you throw the big switch and start the machine running.
Sadly, there are a lot of things that can go wrong when setting up your tasks, and most of them are hard to diagnose. If the hardware is unhappy about the way you’ve configured your task state segments (TSS), task gates, or other structures, you’ll usually get a General Protection Fault. The hint is right there in the name: it’s a general fault, with no further details. Here’s how to avoid those.
Setting Up the TSS
Your first step is to decide where each task’s frozen state (that is, its task state segment, or TSS) will be stored. Each TSS requires exactly 104 bytes, but you can make it larger if you want to. You don’t even have to decide now what to use the extra space for; the processor simply reads the first 104 bytes and ignores the rest. For now, let’s keep things simple and use the standard default size.
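One way to keep the offsets straight is to describe the 104-byte layout as a C structure and let the compiler confirm it. The sketch below is only illustrative: the type name tss32_t and the field names are made up for this example, while the offsets are the ones used throughout this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of a 32-bit TSS. Field names are hypothetical;
 * the byte offsets in the comments are the ones the hardware uses. */
typedef struct {
    uint16_t prev_task_link, reserved0;  /*   0: Previous Task Link (Back Link) */
    uint32_t esp0;                       /*   4: stack pointer for PL0 */
    uint16_t ss0, reserved1;             /*   8: stack segment for PL0 */
    uint32_t esp1;                       /*  12 */
    uint16_t ss1, reserved2;             /*  16 */
    uint32_t esp2;                       /*  20 */
    uint16_t ss2, reserved3;             /*  24 */
    uint32_t cr3;                        /*  28: page directory base (PDBR) */
    uint32_t eip;                        /*  32 */
    uint32_t eflags;                     /*  36 */
    uint32_t eax, ecx, edx, ebx;         /*  40, 44, 48, 52 */
    uint32_t esp, ebp, esi, edi;         /*  56, 60, 64, 68 */
    uint16_t es, reserved4;              /*  72 */
    uint16_t cs, reserved5;              /*  76 */
    uint16_t ss, reserved6;              /*  80 */
    uint16_t ds, reserved7;              /*  84 */
    uint16_t fs, reserved8;              /*  88 */
    uint16_t gs, reserved9;              /*  92 */
    uint16_t ldt_selector, reserved10;   /*  96 */
    uint16_t trap;                       /* 100: bit 0 is the T bit */
    uint16_t iomap_base;                 /* 102: I/O Map Base Address */
} tss32_t;

/* Compile-time checks: a mismatch here stops the build instead of
 * producing a mysterious fault at run time. */
static_assert(sizeof(tss32_t) == 104, "TSS must be exactly 104 bytes");
static_assert(offsetof(tss32_t, cr3) == 28, "CR3 at offset 28");
static_assert(offsetof(tss32_t, eax) == 40, "EAX at offset 40");
static_assert(offsetof(tss32_t, esp) == 56, "ESP at offset 56");
static_assert(offsetof(tss32_t, ss) == 80, "SS at offset 80");
static_assert(offsetof(tss32_t, ldt_selector) == 96, "LDT selector at offset 96");
static_assert(offsetof(tss32_t, iomap_base) == 102, "I/O map base at offset 102");
```

Every field is naturally aligned, so no packing attribute should be needed, but the static_assert lines are there to catch a compiler that disagrees.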
You’ll need to create at least two TSSs. You can’t switch between tasks if there’s only one task! Two is the absolute minimum and a good place to start. Later, you can try making a third task to manage the other two, but no hurry. Each task needs its own TSS descriptor in the GDT, which holds up to 8,192 descriptors, so the processor can manage thousands of tasks and you’re unlikely to run into the upper limit.
It’s important to pre-stuff the contents of each TSS with reasonable values. Some values are mandatory to make things work, some are important but discretionary, and some don’t matter at all.
Preloading a shiny new TSS means you need to be able to write to it as data. If one of your data segments (DS, ES, FS, or GS) is already set up to encompass the memory area where a TSS resides, this is no problem. If not, you’ll need to create a data segment that covers the TSS. A good practice is to define the data segment’s base address and length to exactly match those of the TSS so there’s little danger of accidentally writing something beyond its boundaries. It’s also good practice to demolish the writable data segment (that is, erase its segment descriptor in the GDT) immediately after you’re done initializing the TSS.
The values you put into the eight general purpose registers (EAX, EBX, ECX, etc.) starting at TSS offset 40 (decimal) are entirely up to you. Remember that these values are what the processor will hold when your task first wakes up, so put in something reasonable – where “reasonable” depends on how you’ve written the code for your task. Good programming practice is to never assume the CPU registers hold any particular value and treat them as undefined. Storing zeroes in these registers might be a good idea.
The exception is your stack pointer (ESP, at TSS offset 56), which needs to hold a rational value for your stack. Stuffing a zero here is a bad idea, since stacks push downwards.
The processor flags register (EFLAGS), instruction pointer (EIP), and the six segment registers (CS, DS, etc.) all need to hold sensible values before you start. Be sure to pre-stuff the TSS with a workable value for the processor flags, and preload EIP with the address of the first instruction you want your task to execute when it’s awakened. This can be anywhere within the task’s code segment; it doesn’t need to be the first actual instruction.
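As a rough sketch of the register setup just described, the helper below fills in the general-purpose registers, ESP, EIP, and EFLAGS of a TSS treated as a raw 104-byte array. The function and constant names are hypothetical, and the EFLAGS value of 0x00000002 (only the always-set reserved bit 1, so interrupts start out disabled) is an assumption about what counts as “workable” for a first experiment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets inside the TSS, as given in the text. */
enum { TSS_SIZE = 104, TSS_EIP = 32, TSS_EFLAGS = 36, TSS_EAX = 40, TSS_ESP = 56 };

/* Store a 32-bit value in little-endian (x86) byte order. */
static inline void tss_put32(uint8_t *tss, unsigned off, uint32_t v)
{
    assert(off + 4 <= TSS_SIZE);
    for (unsigned i = 0; i < 4; i++)
        tss[off + i] = (uint8_t)(v >> (8 * i));
}

/* Read a 32-bit little-endian value back out. */
static inline uint32_t tss_get32(const uint8_t *tss, unsigned off)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < 4; i++)
        v |= (uint32_t)tss[off + i] << (8 * i);
    return v;
}

/* Hypothetical helper: preload the register image of a new task.
 * stack_base and stack_size describe the task's stack as offsets
 * within its stack segment. */
void tss_init_registers(uint8_t *tss, uint32_t entry,
                        uint32_t stack_base, uint32_t stack_size)
{
    assert(stack_size > 0);

    /* Zero all eight general-purpose registers (offsets 40 to 71). */
    memset(tss + TSS_EAX, 0, 8 * 4);

    /* Stacks grow downward, so ESP starts at the top of the stack area. */
    tss_put32(tss, TSS_ESP, stack_base + stack_size);

    /* First instruction the task will execute when it wakes up. */
    tss_put32(tss, TSS_EIP, entry);

    /* Assumed starting flags: reserved bit 1 set, everything else clear. */
    tss_put32(tss, TSS_EFLAGS, 0x00000002u);
}
```

For example, tss_init_registers(tss, 0x1000, 0x8000, 0x400) would leave ESP at 0x8400 and EIP at 0x1000.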
Pay attention to the privilege level of your six segment selectors, especially those for the code segment (CS) and your stack segment (SS). These last two privilege levels must match, and the others should probably all be the same, although it’s possible to experiment with other combinations if you want to get tricky.
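A segment selector packs a descriptor index, a table indicator, and a requested privilege level (RPL) into 16 bits, so a mismatch between CS and SS is easy to catch before the hardware does. The helper names below are invented for illustration, and the check only compares the RPL bits of the selectors; the privilege levels stored in the descriptors themselves have to agree as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a selector: bits 15..3 = descriptor index,
 * bit 2 = table indicator (0 = GDT, 1 = LDT), bits 1..0 = RPL. */
static inline uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192);   /* a descriptor table holds at most 8,192 entries */
    assert(rpl <= 3);
    return (uint16_t)((index << 3) | ((use_ldt & 1u) << 2) | rpl);
}

/* Requested privilege level of a selector. */
static inline unsigned selector_rpl(uint16_t sel)
{
    return sel & 3u;
}

/* Sanity check before storing CS and SS in a TSS:
 * both selectors should request the same privilege level. */
static inline bool cs_ss_levels_match(uint16_t cs, uint16_t ss)
{
    return selector_rpl(cs) == selector_rpl(ss);
}
```

With this encoding, GDT entry 3 at PL0 becomes selector 0x18, and GDT entry 4 at PL3 becomes 0x23.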
Register CR3 (the Page Directory Base Register, or PDBR, at TSS offset 28) is optional. It’s necessary only when you have memory paging enabled, which is entirely separate from the x86 architecture’s memory segmentation. It’s best to save paging for another project, so the value here is irrelevant.
Same goes for the LDT Segment Selector (TSS offset 96). If this task will have its own Local Descriptor Table, you’ll need to preload this value with its selector. Let’s assume we’re not using an LDT and stuff this with zeroes. Note that you can’t put any random value in here, nor should you leave it undefined. The processor will assume this is a valid selector and try to figure out which LDT it points to, so be sure it’s either zero (the null selector) or a valid LDT value. Don’t leave it to chance.
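A small check along these lines can catch a stray LDT value before the processor trips over it. This is a sketch with hypothetical names: it looks only at the shape of the selector (null, or an in-range GDT entry) and does not verify that the descriptor it points to really is an LDT descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A null selector has index 0 and refers to the GDT; the RPL bits don't matter. */
static inline bool selector_is_null(uint16_t sel)
{
    return (sel & 0xFFFCu) == 0;
}

/* Hypothetical sanity check for the LDT Segment Selector field.
 * gdt_entries is the number of descriptors currently in the GDT. */
bool ldt_field_is_plausible(uint16_t sel, unsigned gdt_entries)
{
    assert(gdt_entries <= 8192);

    if (selector_is_null(sel))
        return true;                 /* "no LDT" is always acceptable */
    if (sel & 0x4u)
        return false;                /* LDT descriptors live in the GDT, so TI must be 0 */
    return (unsigned)(sel >> 3) < gdt_entries;   /* index must fall inside the GDT */
}
```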
The Previous Task Link field (also called the Back Link, at TSS offset 0) is the processor’s way of maintaining a linked list of call/return tasks. If you don’t plan to nest tasks, you can preload this with zeros. Otherwise, it should hold the selector to another TSS, the “parent” of this task.
Nesting tasks is trickier than it sounds because interrupts, faults, and exceptions all work the same as a CALL instruction. They all create nested tasks. So, even though you might not have executed an explicit CALL to another task, a random interrupt might switch tasks anyway, making this task the “child” of the one that was interrupted. The good news is, this won’t happen unless you set up interrupts and exceptions to switch tasks – which we haven’t done here – so it’s not like you’ll be caught by surprise. Just a heads-up for future development.
That leaves just three pairs of fields to define, starting at TSS offset 4. These are a little weird and define the stack pointers and stack segments for privilege levels PL0, PL1, and PL2 (ESP0 at offset 4, SS0 at offset 8, and so on). The stack your task actually starts on was already defined by SS (TSS offset 80) and ESP (TSS offset 56). The other three are there in case your task moves to a more privileged level while it’s running. Whenever an x86 processor changes privilege levels, it also changes stacks to avoid cross-contamination between levels. Where does it get the new stack from? From these three pairs of stack pointers in the TSS.
If you know for certain that no privilege changes will ever occur within this task, you could leave these six values blank, but that’s risky. Better to define legitimate stack pointers for all three privilege levels, just in case. All three stacks can be very small.
Finally, unlike most of the other registers in the TSS, these six registers will never be updated by the processor when your task is switched out. They’re purely initial values, not current values. The current stack segment and pointer are always at SS:ESP.
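Filling in the three privilege-level stacks is mostly arithmetic on offsets: each pair takes eight bytes, with the stack pointer first and the stack segment selector right after it. The helper below is a sketch with made-up names; it also asserts that the selector’s RPL equals the level it is stored for, which is the combination the processor expects when it switches to that stack.

```c
#include <assert.h>
#include <stdint.h>

/* ESP0 sits at offset 4 and SS0 at offset 8; the PL1 and PL2 pairs
 * follow at 8-byte intervals (12/16 and 20/24). */
enum { TSS_ESP0 = 4, TSS_LEVEL_STRIDE = 8 };

/* Hypothetical helper: store the stack for privilege level 0, 1, or 2. */
void tss_set_level_stack(uint8_t *tss, unsigned level, uint16_t ss, uint32_t esp)
{
    assert(level <= 2);
    assert((ss & 3u) == level);   /* selector RPL should match the level */

    uint8_t *p = tss + TSS_ESP0 + level * TSS_LEVEL_STRIDE;

    /* 32-bit stack pointer, little-endian. */
    for (unsigned i = 0; i < 4; i++)
        p[i] = (uint8_t)(esp >> (8 * i));

    /* 16-bit stack segment selector immediately after it. */
    p[4] = (uint8_t)(ss & 0xFFu);
    p[5] = (uint8_t)(ss >> 8);
}
```

Calling it three times, once per level, with three small stack areas covers the “just in case” advice above.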
Anything else? Clear the T bit (bit 0 of the word at TSS offset 100) to zero unless you want to single-step your task. Right next to it at TSS offset 102 is the I/O Map Base Address, which you probably don’t care about. This allows you fine-grained control of individual I/O addresses, but if you’re not using it, how do you turn it off? Preloading this pointer with zeros just tells the processor that the permission table starts at TSS offset 0, which isn’t what you want. Instead, preload the pointer with a 16-bit offset that’s beyond the end of the TSS segment. For example, if your TSS is the standard 104 bytes in length (according to the TSS descriptor that you’re about to create, below), its last valid offset is 103, so set this to 104, or 110, or 256, or some other number past the end. This impossible offset tells the processor that there is no I/O permission table at all.
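In code, switching off both features comes down to two 16-bit stores. The sketch below uses hypothetical names and picks the TSS size itself as the I/O Map Base Address, since that is the first offset past the end of the segment; any larger 16-bit value would serve the same purpose.

```c
#include <assert.h>
#include <stdint.h>

enum { TSS_MIN_SIZE = 104, TSS_TRAP = 100, TSS_IOMAP_BASE = 102 };

/* Hypothetical helper: no debug trap on task switch, no I/O permission map.
 * tss_size is the length in bytes that the TSS descriptor will declare. */
void tss_no_trap_no_iomap(uint8_t *tss, uint16_t tss_size)
{
    assert(tss_size >= TSS_MIN_SIZE);

    /* Word at offset 100: bit 0 is the T bit, the rest is reserved. */
    tss[TSS_TRAP]     = 0;
    tss[TSS_TRAP + 1] = 0;

    /* Point the I/O map base past the last valid byte of the segment. */
    tss[TSS_IOMAP_BASE]     = (uint8_t)(tss_size & 0xFFu);
    tss[TSS_IOMAP_BASE + 1] = (uint8_t)(tss_size >> 8);
}
```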
That about covers it! Now do it all over again for the second task. The values you preload into the second TSS might be completely different, or they might be mostly the same. It’s perfectly okay for two tasks to share some data space, for instance, so they might have the same DS, ES, FS, and/or GS values. They could even share the same code segment (CS) but have different instruction pointers (EIP). In essence, the two tasks would be running different parts of the same code. You’ll definitely want different stack pointers (SS:ESP) for your two tasks, but the remaining TSS values could be the same.
Setting Up the TSS Descriptors
Each TSS needs its own TSS descriptor, and those two descriptors both need to be in the GDT (Global Descriptor Table). TSS descriptors were described in the previous installment and basically point to the 32-bit base address of the TSS, define its length, and include the usual assortment of control/status bits.
TSS descriptors are not data segments, and they’re not writable, so you’ll need to define a data segment that covers the same address range as each TSS, as described above. Once you’ve initialized each TSS, you can eliminate its corresponding data segment, which is the safer choice for finished code. (While you’re still debugging, however, it helps to leave these data-segment “aliases” in place so you can examine and manipulate the TSS contents.)
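Both kinds of descriptor share the same 8-byte format and differ only in the access byte, so one small encoder can build the TSS descriptor and its writable alias. This is a sketch under a few assumptions: the function and constant names are invented, both descriptors use privilege level 0, and the granularity and size flags are left at zero, so limits are counted in bytes.

```c
#include <assert.h>
#include <stdint.h>

enum {
    ACCESS_TSS32_AVAILABLE = 0x89,  /* present, DPL 0, system type 1001b: available 32-bit TSS */
    ACCESS_DATA_WRITABLE   = 0x92   /* present, DPL 0, data segment, read/write */
};

/* Hypothetical encoder for a byte-granular segment descriptor.
 * Returns the descriptor as a 64-bit value, low dword first in memory. */
uint64_t make_descriptor(uint32_t base, uint32_t size_bytes, uint8_t access)
{
    assert(size_bytes >= 1 && size_bytes <= 0x100000u);  /* limit is 20 bits */
    uint32_t limit = size_bytes - 1;                     /* limit = last valid offset */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);                /* bits  0..15: limit 15..0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;         /* bits 16..39: base 23..0   */
    d |= (uint64_t)access << 40;                     /* bits 40..47: access byte  */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;     /* bits 48..51: limit 19..16 */
    d |= (uint64_t)(base >> 24) << 56;               /* bits 56..63: base 31..24  */
    return d;
}
```

For a 104-byte TSS at address 0x00012340, make_descriptor(0x00012340, 104, ACCESS_TSS32_AVAILABLE) yields 0x0000890123400067, and passing ACCESS_DATA_WRITABLE with the same base and size yields the matching data alias.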
There can be only one TSS descriptor per TSS. That’s different from other types of segments, where it’s okay to have multiple descriptors in the GDT that point to the same data, code, or stack segment. Don’t ever make two TSS descriptors that point to the same TSS. Why? Because it confuses the hardware. Among other reasons, the processor maintains a Busy Flag at bit 41 (decimal) of the 64-bit TSS descriptor. It sets and resets this bit automatically every time a task is switched in or out, and it won’t switch to a task that’s already busy. If there’s more than one TSS descriptor for that TSS, there’d be more than one Busy Flag. Not good.
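Since the hardware won’t warn you about a duplicate, a quick scan of the GDT before starting can. The sketch below treats each descriptor as a 64-bit value, reads the Busy Flag at bit 41, and reports whether any two 32-bit TSS descriptors share a base address. All names are hypothetical, and writable data aliases are deliberately ignored because they are allowed to overlap a TSS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TSS_DESC_BUSY (1ULL << 41)   /* Busy Flag: bit 41 of the descriptor */

/* System descriptor (S = 0) of type 1001b or 1011b: a 32-bit TSS,
 * available or busy. The mask 0x1D ignores the busy bit. */
static inline bool desc_is_tss32(uint64_t d)
{
    return ((d >> 40) & 0x1Du) == 0x09u;
}

static inline bool tss_desc_is_busy(uint64_t d)
{
    return (d & TSS_DESC_BUSY) != 0;
}

/* Reassemble the 32-bit base address from its two pieces. */
static inline uint32_t desc_base(uint64_t d)
{
    return (uint32_t)(((d >> 16) & 0xFFFFFFu) | ((d >> 56) << 24));
}

/* True if no two TSS descriptors in the table point at the same TSS. */
bool gdt_tss_bases_unique(const uint64_t *gdt, size_t n)
{
    assert(gdt != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!desc_is_tss32(gdt[i]))
            continue;
        for (size_t j = i + 1; j < n; j++) {
            if (desc_is_tss32(gdt[j]) && desc_base(gdt[j]) == desc_base(gdt[i]))
                return false;
        }
    }
    return true;
}
```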
One More Thing…
Now that you’ve set up two task state segments and two TSS descriptors, you’re almost ready to go. But we have a chicken-and-egg problem. The magic of x86 task switching is that the hardware automatically manages everything for you, saving the state of the outgoing task and loading the state of the incoming task. This is always done in pairs: one state goes from the processor into the TSS, and one set comes from the TSS into the processor.
But what about the very first task switch? We know where the incoming state is coming from, but where does the outgoing state go to? We probably don’t want the processor’s current state to overwrite the second TSS we just created. But the CPU has to dump its state somewhere.
So, here’s the trick. Make one “dummy” TSS to kick things off, for a total of three task state segments. This one can be completely uninitialized because we’re never going to load anything from it, just dump the current processor state into it and never come back. It’s just 104 bytes, so no big deal. Be sure to make a TSS descriptor for this TSS, too.
Finally, load the 16-bit selector for that dummy TSS descriptor into the processor’s Task Register (TR) using the LTR instruction. This tells the processor what “task” you’re currently running, even though it’s not a bona fide task at all. It just tells the processor which TSS it can use to dump the current state into. Without a legitimate TSS selector in TR, you’ll get a fault on the first task switch. Oh, and make sure the Nested Task flag (NT, bit 14) in the processor’s EFLAGS register is cleared, or the CPU will think you’re returning from a nested task.
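The final steps are privileged instructions, so they can’t be demonstrated in an ordinary program, but the values they consume can. The sketch below computes an EFLAGS image with NT cleared and builds the 6-byte far pointer that an indirect far JMP to a task would use. The names are hypothetical, and the packed attribute is a GCC/Clang extension assumed here to get the 6-byte layout.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_NT (1u << 14)   /* Nested Task flag */

/* Return the given EFLAGS value with NT cleared. */
static inline uint32_t eflags_clear_nt(uint32_t eflags)
{
    return eflags & ~EFLAGS_NT;
}

/* Memory operand for an indirect far JMP in 32-bit code:
 * a 32-bit offset followed by a 16-bit selector. */
typedef struct __attribute__((packed)) {
    uint32_t offset;     /* ignored when the selector names a TSS */
    uint16_t selector;   /* selector of the target TSS descriptor */
} far_ptr_t;

static_assert(sizeof(far_ptr_t) == 6, "far pointer must be 6 bytes");

/* Build the jump target for a task switch to the given TSS selector. */
static inline far_ptr_t task_jump_target(uint16_t tss_selector)
{
    far_ptr_t p = { 0u, tss_selector };
    return p;
}

/* In privileged startup code the sequence would be, in order:
 *   1. LTR with the dummy TSS selector,
 *   2. reload EFLAGS with eflags_clear_nt(current flags),
 *   3. far JMP through task_jump_target(first_task_selector).
 * Those instructions need inline assembly at PL0 and are not shown. */
```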
And… JMP!