feature article
Subscribe Now

Spectre Bug Rears Its Head Again

Academic Paper Outlines Another Variation of the CPU Security Flaw

Boo! A scary new variation of the Spectre CPU bug has surfaced, and it may be resistant to the fixes and countermeasures already deployed. Or maybe not. 

A band of CS/EE students has published a paper, provocatively dubbed “I See Dead µops: Leaking Secrets via Intel/AMD Micro-Op Caches,” claiming to reveal another new way to siphon sensitive information out of x86 processors. It looks like yet another variation of the well-known Spectre bug, but in addition to detailing the discovery itself, the paper also reports that this version isn’t fixable. 

There are now so many variations of Spectre that it’s become its own brand. Several fixes and countermeasures have been deployed since it was first discovered three years ago, but this new strain is fix-resistant – at least, according to its discoverers. 

Intel disagrees, and says that its currently published guidelines will squash it very nicely, thank you very much. AMD agrees. Both companies are saying, in essence, go back to your homes, programmers, there’s nothing to see here. 

First, some background. As with all the previous versions of Spectre, this one depends on extremely subtle side effects in the way that microprocessors fetch and execute software instructions. Note that Spectre is not an Intel bug or an AMD bug. It’s not even an x86 bug. It’s a complexity bug. Spectre has affected nearly all modern high-end microprocessors, including those designed by ARM. Only this latest version is x86-specific. 

Such chips all perform speculative execution, meaning they sometimes guess what they’re supposed to do next. When they guess right – and most of the time, they do – they save a bunch of time. When they occasionally guess wrong, they discard the incorrect results and start over with the right instructions. All of this is invisible to software, even to those of us who dabble in assembly-language programming. 

In the initial versions of Spectre, attackers leveraged the effect that speculative execution had on the chip’s data caches. This new one instead exploits the chip’s micro-operations cache, an even more deeply hidden feature that helps speed up instruction decoding. Today’s x86 processors don’t actually execute x86 instructions natively. Instead, each x86 instruction is decomposed into simpler micro-operations (µops) that look like generic RISC processor instructions. This translation is deliberately hidden from us, but some details of its operation can be inferred. 

Some x86 instructions are pretty simple, while some are insanely complex. That means some x86 instructions translate into just one µop, some translate into a few µops, and some translate into a whole bunch of µops. Not surprisingly, cracking apart the complex instructions takes longer than translating the simple ones. To speed things up, those translations are kept in a µop cache, which can hold a few hundred to a few thousand µops, depending on the processor. Chipmakers tend to be pretty secretive about the whole translation process and the details of the µop cache. 

But we do know that the µop cache is probabilistic: sometimes it holds the information you want, and sometimes it doesn’t. You can’t ever be sure. If the correct x86-to-µop translation data is in the cache, great! If not, no big deal, the chip will look it up. At most, you’ve lost a few clock cycles. But it is precisely this timing difference that the researchers’ new variation exploits. 

We also know that, like all caches, the µop cache is organized into cache lines, sets, and ways, and that it uses an associative mapping mechanism. All of this means that you could – with a lot of effort – organize instructions in memory in a way that perfectly matches the µop cache’s internal organization, and all of that data would be cached. You could also do the opposite and create a perfect cache-thrasher. After a lot of experimentation, the university researchers created code streams that are ideal examples of both. 

Using fine-grained performance monitors, they were able to measure the time difference between code that was in the µop cache and code that wasn’t. “This allows us to obtain a clearly distinguishable binary signal (i.e., hit vs. miss) with a mean timing difference of 218.4 cycles and a standard deviation of 27.8 cycles, allowing us to reliably transmit a bit (i.e., one-bit vs. zero-bit) over the micro-op cache.” 

That’s swell, but how is it useful? And how does it constitute a security threat? You can use those timing differences to leak information from one program to another, even when the two are supposedly independent of each other. Like previous Spectre exploits, this requires two programs working in parallel. One runs the deliberate cache-thrasher loop (while timing it), while the other runs the same loop. The two programs will interfere with each other to keep the µop cache spilling and refilling, which the first program can detect by timing. Alternatively, the second program can run a deliberately benign loop that won’t thrash the µop cache, which the first program can also detect. In this way, the two programs can communicate one bit at a time. Thrashing (slower execution) is a 1, and not thrashing (faster execution) is a 0. 

Tedious, but effective. Even leaking just one bit at a time, that’s still a lot of data at GHz clock speeds. The team also found that they don’t have to manipulate the entire µop cache to make this work. A subset works, too. “We reach our best bandwidth (965.59 Kbps) and error rates (0.22%) when six ways of eight sets are probed, while limiting ourselves to just five samples. We further report an error-corrected bandwidth by encoding our transmitted data with Reed-Solomon encoding that inflates file size by roughly 20%, providing a bandwidth of 785.56 Kbps with no errors.” That’s close to a Mbit/sec of uncompressed data. Yikes. 

As with previous Spectre variations, this one doesn’t tell you how to get the exploit into a target computer, only how to exfiltrate data out once it’s there. 

The research paper concludes with some suggested countermeasures, including flushing the µop cache at frequent intervals. That works, but it also harms performance. It’s also not something most operating systems or hypervisors are designed to do. And it leaves the security up to software, which might itself be compromised. Finally, there’s no “right answer” as to how frequently you should flush the µop cache. More is better, but how much is enough? 

Intel and AMD both say that you don’t have to do any of this – that it’s a solved problem. In similar official statements, both companies said that their existing guidelines for mitigating side-channel attacks will work in this case, too. Those guidelines rely on constant-time coding, which is pretty much what it sounds like. Instead of writing functions to operate as quickly as possible, you write them to run for a fixed amount of time. That’s harder than it sounds, because it’s counterintuitive for most programmers and because it takes practice to do properly. Experts in cryptography – and that’s essentially what this is, cryptography – often rely on constant-time code because it hides internal shortcuts that can reveal secrets. Since Spectre is a timing-based attack, those countermeasures should work here, too. 

I applaud the university team for discovering this weakness, and for putting in the hours it must have taken to characterize and document. The paper comes across as a little hysterical for an academic paper, however, almost as if it were calculated to capture headlines (which it did). Yes, we have a bug in a large number of high-performance processors. No, we’re not all in danger of immediate computer meltdown. 

At this level of complexity, there will always be bugs. (Heck, I’ve even wired flip-flops wrong.) The important thing is to always keep looking for them, always apply reasonable countermeasures, and always assume there’s another one over the horizon.

One thought on “Spectre Bug Rears Its Head Again”

  1. think about all the past actions predicated on the supposition that no such thing existed….always better to know

    as the wise man said, ” the truth hurts, but the lies’ll kill you “

Leave a Reply

featured blogs
Jun 18, 2021
It's a short week here at Cadence CFD as we celebrate the Juneteenth holiday today. But CFD doesn't take time off as evidenced by the latest round-up of CFD news. There are several really... [[ Click on the title to access the full blog on the Cadence Community sit...
Jun 17, 2021
Learn how cloud-based SoC design and functional verification systems such as ZeBu Cloud accelerate networking SoC readiness across both hardware & software. The post The Quest for the Most Advanced Networking SoC: Achieving Breakthrough Verification Efficiency with Clou...
Jun 17, 2021
In today’s blog episode, we would like to introduce our newest White Paper: “System and Component qualifications of VPX solutions, Create a novel, low-cost, easy to build, high reliability test platform for VPX modules“. Over the past year, Samtec has worked...
Jun 14, 2021
By John Ferguson, Omar ElSewefy, Nermeen Hossam, Basma Serry We're all fascinated by light. Light… The post Shining a light on silicon photonics verification appeared first on Design with Calibre....

featured video

Reduce Analog and Mixed-Signal Design Risk with a Unified Design and Simulation Solution

Sponsored by Cadence Design Systems

Learn how you can reduce your cost and risk with the Virtuoso and Spectre unified analog and mixed-signal design and simulation solution, offering accuracy, capacity, and high performance.

Click here for more information about Spectre FX Simulator

featured paper

Make wearable and IOT audio effortless with a plug'n'play class D amplifier

Sponsored by Maxim Integrated

A power-hungry display is not the most efficient medium for interfacing with battery-powered portable, wearable, and IoT devices. For this reason, low-power audio is fast becoming a more popular alternative. In this design solution, Maxim Integrated reviews the class D digital audio amplifier and discusses the constraints of some current solutions before presenting a cleverly packaged IC that requires minimal configuration to quickly bring high-quality audio to these applications.

Click to read more

featured chalk talk

Minitek Microspace

Sponsored by Mouser Electronics and Amphenol ICC

With the incredible pace of automotive innovation these days, it’s important to choose the right connectors for the job. With everything from high-speed data to lighting, connectors have a huge impact on reliability, cost, and design. In this episode of Chalk Talk, Amelia Dalton chats with Glenn Heath from Amphenol ICC about the Minitek MicroSpace line of automotive- and industrial-grade connectors.

Click here for more information about Amphenol FCI Minitek MicroSpace™ Connector System