
Spectre Bug Rears Its Head Again

Academic Paper Outlines Another Variation of the CPU Security Flaw

Boo! A scary new variation of the Spectre CPU bug has surfaced, and it may be resistant to the fixes and countermeasures already deployed. Or maybe not. 

A band of CS/EE students has published a paper, provocatively dubbed “I See Dead µops: Leaking Secrets via Intel/AMD Micro-Op Caches,” claiming to reveal yet another way to siphon sensitive information out of x86 processors. It looks like another variation of the well-known Spectre bug, but in addition to detailing the discovery itself, the paper reports that this version isn’t fixable.

There are now so many variations of Spectre that it’s become its own brand. Several fixes and countermeasures have been deployed since it was first discovered three years ago, but this new strain is fix-resistant – at least, according to its discoverers. 

Intel disagrees, and says that its currently published guidelines will squash it very nicely, thank you very much. AMD agrees. Both companies are saying, in essence, go back to your homes, programmers, there’s nothing to see here. 

First, some background. As with all the previous versions of Spectre, this one depends on extremely subtle side effects in the way that microprocessors fetch and execute software instructions. Note that Spectre is not an Intel bug or an AMD bug. It’s not even an x86 bug. It’s a complexity bug. Spectre has affected nearly all modern high-end microprocessors, including those designed by ARM. Only this latest version is x86-specific. 

Such chips all perform speculative execution, meaning they sometimes guess what they’re supposed to do next. When they guess right – and most of the time, they do – they save a bunch of time. When they occasionally guess wrong, they discard the incorrect results and start over with the right instructions. All of this is invisible to software, even to those of us who dabble in assembly-language programming. 
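
For readers who want to see what “guessing ahead” looks like at the code level, here is a minimal sketch of the classic Spectre-style bounds-check pattern. This is the textbook illustration of speculation side effects, not code from this particular paper, and the array names are my own invention:

    #include <stddef.h>
    #include <stdint.h>

    uint8_t secret_array[16];
    size_t  secret_array_size = 16;
    uint8_t probe[256 * 4096];            /* side-channel "probe" buffer */

    /* If the branch predictor guesses "in bounds," the two loads below can
     * execute speculatively even when x is out of range. The architectural
     * results are discarded once the misprediction is detected, but the
     * microarchitectural footprint (in the data caches, and in this new
     * variant the uop cache) is not rolled back -- and that footprint is
     * what an attacker measures. */
    void victim(size_t x)
    {
        if (x < secret_array_size) {          /* predicted, verified later */
            uint8_t value = secret_array[x];  /* speculative load */
            (void)probe[value * 4096];        /* leaves a measurable trace */
        }
    }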

In the initial versions of Spectre, attackers leveraged the effect that speculative execution had on the chip’s data caches. This new one instead exploits the chip’s micro-operations cache, an even more deeply hidden feature that helps speed up instruction decoding. Today’s x86 processors don’t actually execute x86 instructions natively. Instead, each x86 instruction is decomposed into simpler micro-operations (µops) that look like generic RISC processor instructions. This translation is deliberately hidden from us, but some details of its operation can be inferred. 

Some x86 instructions are pretty simple, while some are insanely complex. That means some x86 instructions translate into just one µop, some translate into a few µops, and some translate into a whole bunch of µops. Not surprisingly, cracking apart the complex instructions takes longer than translating the simple ones. To speed things up, those translations are kept in a µop cache, which can hold a few hundred to a few thousand µops, depending on the processor. Chipmakers tend to be pretty secretive about the whole translation process and the details of the µop cache. 

But we do know that the µop cache is probabilistic: sometimes it holds the information you want, and sometimes it doesn’t. You can’t ever be sure. If the correct x86-to-µop translation data is in the cache, great! If not, no big deal, the chip will look it up. At most, you’ve lost a few clock cycles. But it is precisely this timing difference that the researchers’ new variation exploits. 
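
To give a feel for how such a timing difference is observed, here is a minimal measurement sketch of my own, assuming x86-64 with GCC or Clang; region() is a hypothetical stand-in for a code sequence that either fits in the µop cache or deliberately overflows it:

    #include <stdint.h>
    #include <x86intrin.h>               /* __rdtscp(), _mm_lfence() */

    extern void region(void);            /* hypothetical sequence under test */

    /* Time one execution of region() in cycles. A sequence that streams out
     * of the uop cache finishes measurably faster than one that has to be
     * decoded from scratch; that gap is the binary signal being exploited. */
    static uint64_t time_region(void)
    {
        unsigned int aux;
        _mm_lfence();                    /* keep earlier work out of the window */
        uint64_t start = __rdtscp(&aux);
        region();
        uint64_t end = __rdtscp(&aux);
        _mm_lfence();
        return end - start;
    }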

We also know that, like all caches, the µop cache is organized into cache lines, sets, and ways, and that it uses an associative mapping mechanism. All of this means that you could – with a lot of effort – organize instructions in memory in a way that perfectly matches the µop cache’s internal organization, and all of that data would be cached. You could also do the opposite and create a perfect cache-thrasher. After a lot of experimentation, the university researchers created code streams that are ideal examples of both. 
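
The set-mapping arithmetic behind such a thrasher is simple enough to sketch, as long as you treat the geometry as an assumption rather than a documented fact; the 32-byte window size and eight sets below are illustrative guesses, not the parameters of any particular chip:

    #include <stdint.h>

    #define WINDOW_BYTES 32u   /* assumed instruction-window granularity */
    #define NUM_SETS      8u   /* assumed number of uop-cache sets */

    /* Map an instruction address to a uop-cache set. Code regions placed
     * exactly NUM_SETS windows apart land in the same set, so enough of
     * them will evict each other -- the basis of a deliberate thrasher.
     * Regions spread evenly across sets do the opposite and all fit. */
    static unsigned uop_cache_set(uintptr_t instr_addr)
    {
        return (unsigned)((instr_addr / WINDOW_BYTES) % NUM_SETS);
    }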

Using fine-grained performance monitors, they were able to measure the time difference between code that was in the µop cache and code that wasn’t. “This allows us to obtain a clearly distinguishable binary signal (i.e., hit vs. miss) with a mean timing difference of 218.4 cycles and a standard deviation of 27.8 cycles, allowing us to reliably transmit a bit (i.e., one-bit vs. zero-bit) over the micro-op cache.” 

That’s swell, but how is it useful? And how does it constitute a security threat? You can use those timing differences to leak information from one program to another, even when the two are supposedly independent of each other. Like previous Spectre exploits, this requires two programs working in parallel. One runs the deliberate cache-thrasher loop (while timing it), while the other runs the same loop. The two programs will interfere with each other to keep the µop cache spilling and refilling, which the first program can detect by timing. Alternatively, the second program can run a deliberately benign loop that won’t thrash the µop cache, which the first program can also detect. In this way, the two programs can communicate one bit at a time. Thrashing (slower execution) is a 1, and not thrashing (faster execution) is a 0. 
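
On the receiving side, the protocol boils down to a single threshold test. Another sketch: time_region() is the timing helper from the earlier snippet, and the cutoff value is an illustrative placeholder, not a number from the paper:

    #include <stdint.h>

    extern uint64_t time_region(void);   /* timing helper from the sketch above */

    #define THRESHOLD_CYCLES 100         /* illustrative cutoff between fast and slow */

    /* Slow execution means the sender thrashed the uop cache: read a 1.
     * Fast execution means the sender ran its benign loop: read a 0. */
    static int receive_bit(void)
    {
        return time_region() > THRESHOLD_CYCLES ? 1 : 0;
    }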

Tedious, but effective. Even leaking just one bit at a time, that’s still a lot of data at GHz clock speeds. The team also found that they don’t have to manipulate the entire µop cache to make this work. A subset works, too. “We reach our best bandwidth (965.59 Kbps) and error rates (0.22%) when six ways of eight sets are probed, while limiting ourselves to just five samples. We further report an error-corrected bandwidth by encoding our transmitted data with Reed-Solomon encoding that inflates file size by roughly 20%, providing a bandwidth of 785.56 Kbps with no errors.” That’s close to a Mbit/sec of uncompressed data. Yikes. 

As with previous Spectre variations, this one doesn’t tell you how to get the exploit into a target computer, only how to exfiltrate data out once it’s there. 

The research paper concludes with some suggested countermeasures, including flushing the µop cache at frequent intervals. That works, but it also harms performance. It’s also not something most operating systems or hypervisors are designed to do. And it leaves the security up to software, which might itself be compromised. Finally, there’s no “right answer” as to how frequently you should flush the µop cache. More is better, but how much is enough? 
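
To make that trade-off concrete, here is what a flush-on-domain-crossing policy might look like in outline. Note that x86 exposes no architectural instruction for this, so flush_uop_cache(), switch_to(), and the task structure below are purely hypothetical names for illustration:

    /* Purely illustrative: the real point is the policy question. Flush on
     * every security-domain crossing and you close the channel but pay the
     * decode penalty again each time; flush less often and you leak more. */
    struct task { int security_domain; };

    extern void flush_uop_cache(void);        /* hypothetical primitive */
    extern void switch_to(struct task *next); /* hypothetical context switch */

    void on_context_switch(struct task *prev, struct task *next)
    {
        if (prev->security_domain != next->security_domain)
            flush_uop_cache();
        switch_to(next);
    }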

Intel and AMD both say that you don’t have to do any of this – that it’s a solved problem. In similar official statements, both companies said that their existing guidelines for mitigating side-channel attacks will work in this case, too. Those guidelines rely on constant-time coding, which is pretty much what it sounds like. Instead of writing functions to operate as quickly as possible, you write them to run for a fixed amount of time. That’s harder than it sounds, because it’s counterintuitive for most programmers and because it takes practice to do properly. Experts in cryptography – and that’s essentially what this is, cryptography – often rely on constant-time code because it hides internal shortcuts that can reveal secrets. Since Spectre is a timing-based attack, those countermeasures should work here, too. 
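
A standard illustration of the idea (my example, not code from either vendor’s guidance): a naive byte-by-byte comparison returns as soon as it finds a mismatch, so its run time reveals how many leading bytes matched; the constant-time version touches every byte and takes the same path no matter what the data is.

    #include <stddef.h>
    #include <stdint.h>

    /* Leaky: the early exit makes run time depend on the secret data. */
    int leaky_compare(const uint8_t *a, const uint8_t *b, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            if (a[i] != b[i])
                return 0;
        return 1;
    }

    /* Constant-time: no data-dependent branches, same work for every input. */
    int constant_time_compare(const uint8_t *a, const uint8_t *b, size_t n)
    {
        uint8_t diff = 0;
        for (size_t i = 0; i < n; i++)
            diff |= a[i] ^ b[i];
        return diff == 0;
    }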

I applaud the university team for discovering this weakness, and for putting in the hours it must have taken to characterize and document it. The paper comes across as a little hysterical for academic work, however, almost as if it were calculated to capture headlines (which it did). Yes, we have a bug in a large number of high-performance processors. No, we’re not all in danger of immediate computer meltdown.

At this level of complexity, there will always be bugs. (Heck, I’ve even wired flip-flops wrong.) The important thing is to always keep looking for them, always apply reasonable countermeasures, and always assume there’s another one over the horizon.

