At some point, the house of cards begins to topple over. It’s no secret that the x86 processor architecture is almost aged enough to collect a pension. It is, not to put too fine a point on it, a creaking old bit of wheezing ironmongery that, had the gods of microprocessor architecture been more generous, would have been smote into oblivion long ago.
Intel and AMD have done an amazing job keeping the old girl running for all these decades, but the strain is beginning to tell. Programming x86 chips is hard and it’s frustrating, because new features keep piling on while the old ones never go away. It’s like operating a rocket ship using tiller, sails, and hawser. It’s a miracle it works at all.
So, it’s no surprise that Intel and AMD – to say nothing of the thousands of x86 programmers around the world – might be eager to cast off some of that grating machinery and replace it with something less… quaint.
Case in point: interrupt handling. Ever since the days of the ’286 (circa MCMLXXXII), the world’s x86 processors have all used a Byzantine system to manage interrupts, faults, and exceptions – one that combined lookup tables, privilege checks, bounds juggling, context preservation, and memory management in ways that confounded many a programmer, and yet somehow worked. Most of the time.
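For anyone who has never had the pleasure, a rough C sketch of a single 64-bit interrupt gate gives a flavor of the machinery: every vector gets one of these 16-byte descriptors, and the hardware consults its type, privilege, and address fields on every delivery. (The field names here are mine, not Intel’s.)

#include <stdint.h>

struct idt_gate64 {
    uint16_t offset_low;   /* handler address, bits 15:0                     */
    uint16_t selector;     /* code-segment selector the handler runs under   */
    uint8_t  ist;          /* bits 2:0 pick an Interrupt Stack Table stack   */
    uint8_t  type_attr;    /* gate type, descriptor privilege level, present */
    uint16_t offset_mid;   /* handler address, bits 31:16                    */
    uint32_t offset_high;  /* handler address, bits 63:32                    */
    uint32_t reserved;     /* must be zero                                   */
} __attribute__((packed)); /* 16 bytes per vector, up to 256 vectors per IDT */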
Not surprisingly, there were some bugs in the interrupt descriptor table (IDT) system, and they tended to be fiendishly subtle. Race conditions, lockups, infinite loops, and privilege violations were generally rare but unavoidable. Not every operating system has the Blue Screen of Death, but most have something similar. OS programmers have spent a lot of time sussing out and patching the arcane and abstruse details of x86 interrupt handling.
Time for a reboot? Maybe. Intel and AMD have independently come up with schemes to simplify the way x86 processors handle faults, traps, and interrupts. Naturally, the two approaches are different and incompatible with each other. Intel’s is more ambitious, but it also represents more work for programmers hoping to take advantage of the new mechanism. AMD’s approach is simpler; it’s more of a patch than an overhaul.
Intel’s FRED Does Away with the IDT
For the first time in a long time, Intel actually wants to jettison, not just tweak, one of its major features. The proposed FRED (Flexible Return and Event Delivery) system would entirely replace the IDT, along with its interrupt descriptors. It also effectively reduces the number of privilege levels from four to just two. Finally, call gates become a thing of the past. In return (so to speak), interrupt handling would be faster, simpler, more complete, and less prone to corner-case bugs. For many of us, it will be the first rework of x86 interrupt handling in our lifetimes.
FRED will be optional, so the nostalgic can still use the current IDT approach, with or without call gates. You’ll be able to switch FRED on and off at boot time, so older and newer operating systems can coexist on the same silicon. With FRED enabled, the processor completely ignores and bypasses the IDT. Or, more accurately, you won’t have to create an IDT or interrupt descriptors to begin with.
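Whether you get to make that boot-time choice at all depends on the silicon, of course. A minimal sketch of the feature check, assuming the CPUID enumeration in Intel’s FRED specification (leaf 7, sub-leaf 1, EAX bit 17 – verify the bit position against the current revision), might look like the code below; actually flipping FRED on is the kernel’s job, via a CR4 bit and a handful of new IA32_FRED_* MSRs programmed early at boot.

#include <stdio.h>
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */

int main(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7, sub-leaf 1, EAX bit 17 is FRED per my reading of the spec. */
    if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) {
        puts("CPUID leaf 7 not supported");
        return 1;
    }
    puts((eax & (1u << 17)) ? "FRED enumerated" : "no FRED on this CPU");
    return 0;
}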
Interrupts, traps, and exceptions are collectively called “events,” and routing those events to their appropriate handler becomes “event delivery.” All event handlers (that is, interrupt handlers, fault handlers, exception handlers, etc.) run at privilege level 0, the most privileged and innermost of the four privilege “rings.” Rather than use the IDT to locate the entry point of each handler, processor hardware will simply calculate an offset from a fixed base address. It’s less flexible than the IDT system, but it’s simpler to set up and quicker for the hardware to implement.
Each handler will automatically have not one, but two, entry points exactly 64 bytes apart. The first (lowest offset address) is for when the handler was called from unprivileged code. The other is for when the handler was called from privileged CPL0 code. The split allows you to write two exit routines, depending on whether you’re returning to privileged or unprivileged code. Simple, but effective.
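In other words, finding a handler is arithmetic, not a table walk. A minimal sketch, assuming the 64-byte spacing described above (the real base address lives in a FRED configuration MSR that the OS programs; consult the spec for the exact offsets):

#include <stdint.h>

#define FRED_USER_ENTRY_OFFSET        0   /* event arrived from ring 3 */
#define FRED_SUPERVISOR_ENTRY_OFFSET 64   /* event arrived from ring 0 */

/* No descriptor fetch, no table lookup: just base plus a fixed offset. */
static inline uint64_t fred_entry_point(uint64_t base, int from_user)
{
    return base + (from_user ? FRED_USER_ENTRY_OFFSET
                             : FRED_SUPERVISOR_ENTRY_OFFSET);
}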
Why only two privilege levels? FRED collapses the x86 family’s four privilege rings into just two: user and supervisor. That’s a big step back from when the four-level scheme was introduced in the 1980s, but the intervening decades have shown that few programmers ever used all four levels. Most just separated privileged from nonprivileged, so Intel took the opportunity with FRED to sanctify what everyone was doing anyway.
The user/supervisor distinction is important because it determines whether you change stacks or not. This was always a weak point of x86 interrupt handling, and it got even weirder with 64-bit extensions and operating systems. The processor nominally maintained four separate stacks (one for each privilege level), plus a possible “shadow stack” for the operating system or hypervisor. It got tricky to know which stack to use, which to preserve, and which to never, ever touch.
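For reference, the stacks in question live in the 64-bit task-state segment, roughly as sketched below: one stack pointer per privilege ring (ring 3 keeps whatever stack it was already using) plus seven optional Interrupt Stack Table slots, with the IDT entry deciding which one any given event lands on.

#include <stdint.h>

struct tss64 {
    uint32_t reserved0;
    uint64_t rsp0, rsp1, rsp2;  /* stacks for rings 0, 1, and 2            */
    uint64_t reserved1;
    uint64_t ist[7];            /* IST1..IST7: dedicated per-vector stacks */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iopb_offset;       /* I/O permission bitmap offset            */
} __attribute__((packed));      /* 104 bytes, per the architecture manuals */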
By convention, 64-bit operating systems also used the GS segment register to maintain thread integrity, but this wasn’t managed by the hardware. Interrupt handlers had to juggle GS selectors and memory segments, hoping to preserve the incoming context, and there are no atomic operations to do this.
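Roughly what every IDT-era 64-bit entry stub has to do by hand, shown here as C for readability (in real kernels this lives in assembly): nothing makes the “which ring did we come from?” check and the SWAPGS atomic, which is exactly the fragility in question.

#include <stdint.h>

struct legacy_frame {            /* what the CPU pushes on an interrupt */
    uint64_t rip, cs, rflags, rsp, ss;
};

static void legacy_entry_prologue(const struct legacy_frame *f)
{
    if ((f->cs & 3) != 0) {      /* low two bits of CS = incoming privilege level */
        /* Arrived from user code: swap in the kernel's GS base. An NMI
         * that lands between the check and the swap sees a half-switched
         * world and has to sort it out all over again.                   */
        __asm__ volatile("swapgs");
    }
    /* ... save registers, find per-CPU data through GS, and so on ...    */
}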
FRED will do all of that automatically. It pushes 40 bytes onto the privileged stack of the event handler, along with another 64 bytes of event information (essentially a mini-core dump) elsewhere. By the time it’s called, the event handler should have all the context it needs to get directly to work.
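Going by that description, the saved context looks something like the sketch below – the classic 40-byte return frame plus extra event information. The field names and exact layout here are placeholders of mine; the authoritative frame format is in Intel’s FRED specification.

#include <stdint.h>

struct fred_saved_context {
    /* The traditional 40-byte return frame (5 x 8 bytes)...              */
    uint64_t rip;         /* where the interrupted code left off          */
    uint64_t cs;          /* with some FRED status bits packed alongside  */
    uint64_t rflags;
    uint64_t rsp;         /* the interrupted stack pointer                */
    uint64_t ss;
    /* ...plus the additional event information (the "mini-core dump"):   */
    uint64_t error_code;  /* e.g., page-fault error bits                  */
    uint64_t event_data;  /* e.g., faulting address, event type, vector   */
};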
Intel’s documentation says, “The principal functionality of FRED event delivery is to establish a new context, that of the event handler in ring 0, while saving the old context for a subsequent return. Some parts of the new context have fixed values, while others depend on the old context, the nature of the event being delivered, and software configuration.”
Returning from the event handler will use two new instructions, ERETU and ERETS (Event Return to User/Supervisor). The user-mode version restores all the nonprivileged context, including stack, instruction pointer, registers, and so on. The supervisor version is simpler and therefore quicker because it’s not crossing privilege-level boundaries.
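A handler’s exit might then look like the sketch below. ERETS and ERETU are the new mnemonics; whether your assembler already knows them depends on how fresh your toolchain is, so treat this as a sketch rather than copy-and-paste kernel code.

__attribute__((noreturn))
static void fred_return(int came_from_user)
{
    if (came_from_user)
        __asm__ volatile("eretu");   /* restore the full user-mode context     */
    else
        __asm__ volatile("erets");   /* cheaper: no privilege boundary crossed */
    __builtin_unreachable();         /* neither instruction falls through      */
}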
Similarly, the SYSCALL and SYSENTER instructions change behavior. They still enable system-level function calls but no longer vector through the (nonexistent) IDT.
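From the application’s side of the fence, nothing changes: a system call still looks like a system call. For illustration, the usual raw Linux x86-64 calling convention is shown below (syscall number in RAX, arguments in RDI/RSI/RDX, RCX and R11 clobbered); whether the kernel fields it through the old path or a FRED one is invisible from up here.

#include <stdint.h>

/* Raw write(2) on x86-64 Linux, for illustration only. */
static long raw_write(int fd, const void *buf, unsigned long len)
{
    long ret;
    register uint64_t r_fd  __asm__("rdi") = (uint64_t)fd;
    register uint64_t r_buf __asm__("rsi") = (uint64_t)buf;
    register uint64_t r_len __asm__("rdx") = len;

    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(1 /* __NR_write */), "r"(r_fd), "r"(r_buf), "r"(r_len)
                     : "rcx", "r11", "memory");
    return ret;
}

int main(void)
{
    raw_write(1, "hello from ring 3\n", 18);
    return 0;
}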
Another big change is to call gates. Forget ’em. The IDT could be populated with gate descriptors (interrupt and trap gates, strictly speaking, rather than call gates), which defined both the interrupt handler’s entry point and its privilege level. Now, all event handlers have predefined entry points, and they all run at the highest privilege level. Ergo, no need for gates in the IDT.
What about call gates elsewhere, like in the GDT or LDT? Still no. Says Intel, “When FRED transitions are enabled, any execution of far CALL or far JMP that references a call gate causes a general-protection exception (#GP).” In other words, you can still call/jump to code in other segments, but it must be at the same privilege level.
Call gates used to be a common – and officially accepted – way to elevate privilege, but no more. FRED doesn’t allow privilege changes (“ring crossings”) except through its own mechanism, which means a deliberate interrupt or exception.
This raises an interesting side effect, which I’m sure was deliberate. If all privilege changes have to go through FRED events, and if FRED recognizes only two privilege levels, there’s no longer any way to reach code at privilege levels 1 or 2. Those two intermediate privilege levels have effectively been banished and rendered unusable. Even if you somehow manage to start out running at CPL1 or CPL2, the very first interrupt or exception will kick you up to a CPL0 event handler, which will then return you to CPL3 when it exits. Yup, I’d say those two extra privilege levels are processora non grata.
AMD’s SEE
In contrast to Intel’s interrupt overhaul, AMD’s approach is just a light spit and polish. The company’s Supervisor Entry Extensions (SEE) proposal retains the familiar IDT, call gates, and four privilege rings. It tweaks the existing SYSCALL instruction, adds a status bit that marks interrupt handlers as non-reentrant so the hardware won’t nest them, and makes nonmaskable interrupts maskable(!).
SEE seeks merely to patch up some of the loopholes that have been uncovered by years of software development “to properly deal with circumstances that in practice are rare but cannot be ignored.” One is the problem of reentrant fault handlers: they aren’t. It’s all too easy to get a fault or exception, jump to the relevant fault handler (through the IDT) and quickly get a second identical fault before you’ve properly saved enough context from the first one. With just a bit of bad luck, it’s possible to lose context, or to create an infinite loop if you’re in the midst of setting up stack pointers.
SEE fixes this by allocating a bit in each interrupt descriptor to say, “this handler is not reentrant.” When set, the processor hardware will set another “busy” bit any time this handler is called and it won’t call the handler a second time until it’s cleared. The busy bit gets cleared automatically upon exit from the handler or, if you’re brave, you can clear it earlier in software.
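In pseudo-hardware terms, the gating behaves something like the model below. The structure and field names are hypothetical stand-ins for whatever bits AMD ultimately defines, but the logic follows the description above.

#include <stdbool.h>

struct see_descriptor {
    bool not_reentrant;   /* set by the OS: never nest this handler        */
    bool busy;            /* set by the hardware while the handler is live */
};

/* Returns true if the event may be delivered now, false if it must be
 * held pending until the busy bit clears.                                 */
static bool see_try_deliver(struct see_descriptor *d)
{
    if (d->not_reentrant && d->busy)
        return false;          /* postpone: the handler is still active    */
    if (d->not_reentrant)
        d->busy = true;        /* stays set until exit, or until software
                                  clears it early                          */
    return true;
}

static void see_handler_exit(struct see_descriptor *d)
{
    d->busy = false;           /* cleared automatically on return          */
}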
This also works for nonmaskable interrupts (NMI), which seems recklessly dangerous, not to mention oxymoronic. If the NMI handler is active, the processor will postpone, but not ignore, further NMI interrupts until such time as the handler is prepared to, well, handle them.
Linus Offers an Opinion
Both proposals are brand new, and both went relatively unnoticed until a certain Linus Torvalds weighed in with a characteristically colorful opinion. His view? They’re both good ideas. Why not implement them both? Although Intel and AMD took completely separate approaches, the two are not mutually exclusive. In the worst case, you could imagine implementing them both and then enabling one or the other (or neither) at boot time. What’s one more configuration option?
It’s not often that we see a major change to the x86 processor family. Maybe it’s long overdue. But changing the wings on a jetliner in midflight (to borrow an old analogy) is tricky business. The x86 family has survived in large part due to its backward compatibility. Both companies realize this; these proposals are options, not mandatory replacements. And they should affect only OS kernel code, not applications or drivers, so they’ll be invisible to almost everyone. It’s possible that we’ll never see either one. Or we might look forward to ever-so-slightly less buggy interrupt handling.