feature article
Subscribe Now

Security Flaw Afflicts Intel x86 Boot ROMs

CVE-2020-8705 Undermines Intel’s Boot Guard Security

Another year, another Intel bug. 

What makes this bug interesting is that it’s so crafty. It’s fun to see what loopholes hardware and software can discover all by themselves. If you’re a homeowner, you know that water is amazingly devious. It can intrude anywhere, including moving uphill, through tiny holes, and along porous surfaces. Stopping leaks – or finding bugs – feels like a never-ending chore. 

In this case, an Intel bug fix introduced a new bug. Ironic? Sure, but, to be fair, Intel’s x86 processors are insanely complex while also being wildly popular. That’s a combination guaranteed to uncover flaws.

The bug in question is officially being called CVE-2020-8705 and it involves corrupted boot ROMs. We’ve all become familiar with the concept of a root of trust as a way to ensure that everything in your software stack, from the boot ROM all the way up to your drivers and applications, are all legitimate. The process usually involves each layer of software vouching for the security of the next layer. Your hardware checks that the firmware is okay before executing it, then the firmware checks the operating system, which then checks its drivers, and so on. 

We’re also all familiar with the concept of race conditions. Unwanted race conditions usually happen in hardware, but there can be software races, too. Maybe a software semaphore is set by one process at the very moment that another process is checking it, so the two wind up disagreeing on the status. 

CVE-2020-8705 combines the root of trust with race conditions to uncover a situation where evildoers can corrupt your boot ROM and unlock secrets you thought were safely hidden away. The good news is, exploiting the bug requires physical access to the hardware. It can’t be attacked remotely. The bad news is, it affects nearly all Intel x86-based machines, regardless of operating system. It’s not specific to Windows or Linux or MacOS. It’s baked into the silicon, and it’s hard – perhaps even impossible – to fix. 

Here’s what happens. All x86 processors have a sleep mode. In fact, they have several. A commonly used sleep mode is S3, and it’s usually activated when you close the lid on your laptop or when a software timer triggers sleep after a period of inactivity. In S3 sleep, the processor is completely dead but the system DRAM stays alive. All CPU state is lost, but all memory contents are preserved. The idea is to save power while also enabling a quick restart when you open your laptop, wiggle your mouse, or tap your screen. Rather than doing a complete cold reboot, the machine simply wakes the processor and resumes where it left off. 

But because the processor was shut down, it needs to boot up and be reinitialized. The process is about 90% the same as for a cold boot, but without disturbing the contents of RAM. Most, but not all, of the code path is the same. Resuming from sleep is faster – that’s the whole point – because it skips a few steps. In particular, a cold boot will verify that the boot ROM is valid, but a wake from sleep won’t, on the assumption that the boot ROM hasn’t changed since the computer went to sleep. 

But what if it has changed? 

Security researcher Trammell Hudson discovered that UEFI boot ROMs don’t perform a root-of-trust check when they awaken from sleep mode. After all, the computer was never turned off, so how could the ROM have changed? Even though it’s possible (common, in fact) to update the flash BIOS while the computer is powered-up, the update process checks the validity of the new firmware and re-establishes the root of trust. But if there hasn’t been a reflash, and there hasn’t been a cold shutdown, the firmware must still be valid, right? 

Not if you have a chip clip and a malicious mind. Trammell showed that it’s possible to reflash the boot ROM while the system is in sleep mode. When the machine reawakens, it bypasses all the usual security checks and runs whatever new firmware you’ve loaded. And, since system RAM is undisturbed, the first thing your malicious code might want to do is to scan memory for interesting information, such as decryption keys. Windows and Linux both store their disk-encryption keys in RAM, which means they’re preserved during sleep mode and therefore accessible to malicious firmware. 

Hacking a machine in this way does require physical access to the boot ROM, but that’s not difficult with most laptops, and it’s even easier with desktop or server enclosures. If you can reach the boot ROM’s SPI interface pins, you’re in. Microsoft’s Surface Pro tablets and other tightly packed consumer systems might be immune simply because they’re impossible to open. Or, at least, they can’t be opened and then closed back up again, so don’t plan on being sneaky. 

Trammell describes this as “a classic time-of-check / time-of-use (TOCTOU) error” It’s a slow-motion race condition between when the boot ROM is verified (at cold boot) and when that supposedly secure code is executed (while exiting sleep mode). There’s no way to guarantee that the code you’re executing is the same code you just checked. There’s always a finite time delay between the two events. 

Sadly, there would have been a trivially easy fix for this, but it seems to have fallen victim to corporate politics and marketing. Systems that use Intel’s platform controller hub (PCH) chip have one-time programmable fuses specifically intended to enable/disable speedy resume-from-sleep restarts. Leaving the fuses intact forces the processor to do a complete reboot, including all security checks, when it resumes from S3 sleep mode. Blowing the fuses enables the quicker restart but bypasses the security checks. Guess which mode nearly every hardware OEM uses? 

In part that’s because Microsoft encourages it. The company designed Windows 10 to restart quickly, in part to better compete with new mobile operating systems like iOS and Android that don’t suffer from Windows’s lengthy and convoluted boot-up procedure. Windows OEMs are strongly encouraged to keep restart times under 2.0 seconds, and that pretty much requires warm-start shortcuts. Which means blowing the PCH fuses, which means the option is irreversible. 

What’s the fix? If you’re not using Intel’s PCH support chip, you should be able to enable full security checks when exiting S3 sleep mode. That way, your boot ROM is checked every time it’s used, not just on cold starts. Warm starts will take longer, but they’ll be more secure. 

If you are using Intel’s PCH, the next best option is to deploy Intel’s firmware workaround. The company isn’t saying exactly what the newer version does, except to suggest that it improves security. Trammell theorizes that it ignores the one-time programmable fuses in the PCH and uses its own switch instead. 

Ominously, he also points out that “This workaround also points to a weakness in the OTP fuses.” So, the fix for the fix may need a fix. The root of trust starts a very long chain, indeed. 

One thought on “Security Flaw Afflicts Intel x86 Boot ROMs”

Leave a Reply

featured blogs
Apr 11, 2021
https://youtu.be/D29rGqkkf80 Made in "Hawaii" (camera Ziyue Zhang) Monday: Dynamic Duo 2: The Sequel Tuesday: Gall's Law and Big Ball of Mud Wednesday: Benedict Evans on Tech in 2021... [[ Click on the title to access the full blog on the Cadence Community sit...
Apr 8, 2021
We all know the widespread havoc that Covid-19 wreaked in 2020. While the electronics industry in general, and connectors in particular, took an initial hit, the industry rebounded in the second half of 2020 and is rolling into 2021. Travel came to an almost stand-still in 20...
Apr 7, 2021
We explore how EDA tools enable hyper-convergent IC designs, supporting the PPA and yield targets required by advanced 3DICs and SoCs used in AI and HPC. The post Why Hyper-Convergent Chip Designs Call for a New Approach to Circuit Simulation appeared first on From Silicon T...
Apr 5, 2021
Back in November 2019, just a few short months before we all began an enforced… The post Collaboration and innovation thrive on diversity appeared first on Design with Calibre....

featured video

Meeting Cloud Data Bandwidth Requirements with HPC IP

Sponsored by Synopsys

As people continue to work remotely, demands on cloud data centers have never been higher. Chip designers for high-performance computing (HPC) SoCs are looking to new and innovative IP to meet their bandwidth, capacity, and security needs.

Click here for more information

featured paper

Understanding Functional Safety FIT Base Failure Rate Estimates per IEC 62380 and SN 29500

Sponsored by Texas Instruments

Functional safety standards such as IEC 61508 and ISO 26262 require semiconductor device manufacturers to address both systematic and random hardware failures. Base failure rates (BFR) quantify the intrinsic reliability of the semiconductor component while operating under normal environmental conditions. Download our white paper which focuses on two widely accepted techniques to estimate the BFR for semiconductor components; estimates per IEC Technical Report 62380 and SN 29500 respectively.

Click here to download the whitepaper

Featured Chalk Talk

Electronic Fuses (eFuses)

Sponsored by Mouser Electronics and ON Semiconductor

Today’s advanced designs demand advanced circuit protection. The days of replacing old-school fuses are long gone, and we need solutions that provide more robust protection and improved failure modes. In this episode of Chalk Talk, Amelia Dalton chats with Pramit Nandy of ON Semiconductor about the latest advances in electronic fuses, and how they can protect against overcurrent, thermal, and overvoltage.

More information about ON Semiconductor Electronic Fuses