Let me sail, let me sail
Let the Orinoco flow
Let me reach, let me beach
On the shores of Tripoli
Let me sail, let me sail
Let me crash upon your shore
Let me reach, let me beach
Far beyond the Yellow Sea
– Enya, Orinoco Flow
The disaggregation of the semiconductor industry and the rise of outsourced chip manufacturing at IC foundries have now stretched the silicon supply chain around the planet. Like any chain, the semiconductor supply chain is only as strong as its weakest link. Lately, the Covid-19 pandemic has exposed some weak links in the supply chain that have broken and produced severe chip shortages in the face of demand that spiked to five times the normal rate of increase. As a result, we’re seeing a building boom in IC fabs unlike anything seen since the 1960s. Researchers at Tallinn University of Technology in Tallinn, Estonia have now demonstrated another weak link in this chain: the ability to introduce hardware-based Trojan spyware IP into the mask sets of finished IC designs using established methods for fixing post-layout bugs – specifically, ECO (engineering change order) flows. Although hardware Trojan IP has been known to be theoretically possible for a while, it has now been demonstrated in the most expedient way: by inserting operational Trojan IP into some actual chips using an ECO flow and testing that IP to show that it can leak data to the outside world.
The research assumes that someone in a foundry can be bribed or coerced into inserting a hardware Trojan into a design. The specific method used for this insertion is a standard ECO flow that’s needed to make small post-layout bug fixes or enhancements to an IC’s mask set. The researchers’ primary goal was to develop and demonstrate ways to extract bits from inside of the chip and to leak those bits to the outside world. The communication path to the outside world cannot add pins to the finished device because that sort of tampering would be quite obvious. Instead, the researchers’ assumption is that communications with the world outside of the IC will take place over a side channel. This type of spying is called a side-channel attack (SCA).
One heavily researched SCA method employs power modulation to communicate with the outside world via the IC’s power pins. One method for modulating an IC’s power consumption employs on-chip circuitry that consumes various amounts of additional power when activated. The researchers refer to this sort of attack scheme as a “Malicious Off-chip Leakage Enabled by Side channels” (MOLES) attack. The researchers focused on developing hardware Trojans to perform MOLES attacks, and the targets for these attacks are on-chip cryptographic keys.
For the purposes of this project, the researchers assumed that a rogue element inside the foundry will be able to insert malicious logic in a finalized layout. Because the attacker works inside the foundry, the researchers assume they will have access to the technology, tools, and cell libraries used to create the original IC layout. This assumption may or may not be true.
Further, the researchers assume the attacker can identify a cryptographic core in the IC’s layout. Although the researchers state their belief that this is a reasonable assumption, especially for well-known AES core implementations, I remain skeptical. Nevertheless, it may be possible for automated tools to identify such structures, and it may be quite easy to identify on-chip key-storage locations, depending on how the keys are stored. Attaching probes that link the Trojan IP to a stored, on-chip cryptographic key might not be that hard, although I am aware of certain types of key storage that are specifically designed to be virtually undetectable.
The researchers do not assume that the rogue element has any knowledge of the IC’s clock domains or clock distribution networks. However, the hardware Trojan IP employed by the researchers does require a clock.
Researchers assume the attack takes place after the foundry receives the victim IC’s GDSII layout. Commercial EDA tools have the capability to modify finalized designs through ECOs, which can be used to modify or insert additional logic in a finalized layout. All foundries have this capability.
An ECO flow requires four inputs:
- A technology library
- A cell library
- The gate-level netlist
- Timing constraints
Researchers assume that a properly placed rogue element working in a foundry will already possess the first two of these inputs. They must generate the third input and estimate the fourth. It’s possible to extract a gate-level netlist from an IC’s layout using appropriate EDA tools. Timing constraints can be estimated with sufficient precision to make this sort of exploit work.
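As a way of keeping the bookkeeping straight, the four ECO inputs and where an insider would get each one can be written down as a small table in code. This is only an illustrative sketch: the `Source` enum, the `ECO_INPUTS` table, and the `work_remaining` helper are hypothetical names invented here, not part of any EDA tool or of the researchers' flow.

```python
from enum import Enum


class Source(Enum):
    """How a foundry insider would obtain an ECO input (labels are illustrative)."""
    ON_HAND = "already available inside the foundry"
    EXTRACTED = "must be extracted from the layout"
    ESTIMATED = "must be estimated"


# The four ECO inputs listed above, mapped to how the article says
# the rogue element would come by each one.
ECO_INPUTS = {
    "technology library": Source.ON_HAND,
    "cell library": Source.ON_HAND,
    "gate-level netlist": Source.EXTRACTED,
    "timing constraints": Source.ESTIMATED,
}


def work_remaining(inputs=ECO_INPUTS):
    """Return the inputs the attacker still has to produce, in listed order."""
    return [name for name, src in inputs.items() if src is not Source.ON_HAND]


if __name__ == "__main__":
    for name in work_remaining():
        print(f"{name}: {ECO_INPUTS[name].value}")
```

The point of the table is simply that half of the prerequisites come for free with the insider's position; only the netlist and the timing constraints require any effort.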
The researchers’ side-channel Trojan (SCT) IP leaks information from the target chip by creating artificial, switchable power consumption. One such SCT IP design employed a controlled power sink operating at a controlled frequency. A ring oscillator with dynamically controlled delay stages drives the power sink. The bits of the cryptographic key then serve as the control bits that turn the delay stages on and off to modulate the IC’s dynamic power consumption. Consequently, the bits leak out through the IC’s power pins as small but detectable variations in the IC’s supply current. Researchers found that they could detect and decode bits when leaked using only a few microwatts of power modulation and that this method was not sensitive to IC process variation.
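The basic idea of leaking bits through supply current can be illustrated with a toy simulation. The sketch below is a deliberate simplification and not the researchers' ring-oscillator design: it assumes each key bit simply switches a fixed amount of extra current on or off for one bit period, and every number in it (baseline current, extra current, noise level, samples per bit) is an assumption chosen for illustration, not a measured value.

```python
import random


def leak_trace(key_bits, samples_per_bit=100, base_ma=50.0,
               extra_ma=0.5, noise_ma=0.2, seed=1):
    """Simulate supply-current samples while a Trojan leaks key_bits.

    A 1 bit adds extra_ma of current for one bit period; a 0 bit adds none.
    Gaussian noise stands in for the chip's normal switching activity.
    """
    rng = random.Random(seed)
    trace = []
    for bit in key_bits:
        level = base_ma + (extra_ma if bit else 0.0)
        trace.extend(level + rng.gauss(0.0, noise_ma)
                     for _ in range(samples_per_bit))
    return trace


def decode_trace(trace, n_bits, samples_per_bit=100):
    """Recover bits by averaging each bit period and thresholding.

    The threshold sits halfway between the lowest and highest bit-period
    averages, so this assumes the key holds at least one 0 and one 1.
    """
    means = [
        sum(trace[i * samples_per_bit:(i + 1) * samples_per_bit]) / samples_per_bit
        for i in range(n_bits)
    ]
    threshold = (min(means) + max(means)) / 2
    return [1 if m > threshold else 0 for m in means]


if __name__ == "__main__":
    key = [1, 0, 1, 1, 0, 0, 1, 0]
    recovered = decode_trace(leak_trace(key), len(key))
    print("key      :", key)
    print("recovered:", recovered)
```

Averaging over many samples per bit is what lets a small modulation stand out from the noise: the noise in each average shrinks with the square root of the sample count, while the leaked offset does not.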
The SCT IP used in this research project switches on only when the on-chip cryptographic block is not performing its function. Consequently, this sort of SCT design requires that the rogue element locate a status bit that signals when cryptographic encoding or decoding is occurring. Again, the researchers assume that the rogue element will be skilled enough to recognize the cryptographic block and will be able to locate an appropriate status bit. Again, I remain skeptical. Nevertheless, the danger is quite real. I am not skeptical of that.
Two very important points jump out at me from this research. First, the researchers found ample room to place their SCT IP within finished layouts, even on crowded designs. Apparently, place-and-route tools routinely insert empty filler cells into IC layouts to improve overall chip routability. These filler regions are simply used for metal routing in the original layout, so they are ideal locations for installing hardware Trojan IP, which can then usurp unused metal layers in the region for its own interconnect.
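To make the filler-cell point concrete, here is a toy sketch of how one might scan a single placement row for runs of adjacent filler cells wide enough to hold extra logic. Everything in it is hypothetical: the row is modeled as a list of `(cell_name, width_in_sites)` pairs, filler cells are assumed to have names starting with `FILL`, and the functions `filler_gaps` and `gaps_that_fit` are invented for this example. A real flow works on actual layout data and must also account for routing resources, which this sketch ignores.

```python
from typing import List, Tuple

Row = List[Tuple[str, int]]  # (cell name, width in placement sites)


def filler_gaps(row: Row, filler_prefix: str = "FILL") -> List[Tuple[int, int]]:
    """Return (start_site, width_in_sites) for each run of adjacent filler cells."""
    gaps = []
    site = 0          # running position along the row, in sites
    run_start = None  # start site of the filler run in progress, if any
    for name, width in row:
        if name.startswith(filler_prefix):
            if run_start is None:
                run_start = site
        elif run_start is not None:
            # A functional cell ends the current filler run.
            gaps.append((run_start, site - run_start))
            run_start = None
        site += width
    if run_start is not None:
        # The row ended while still inside a filler run.
        gaps.append((run_start, site - run_start))
    return gaps


def gaps_that_fit(row: Row, needed_sites: int) -> List[Tuple[int, int]]:
    """Keep only the filler runs at least needed_sites wide."""
    return [gap for gap in filler_gaps(row) if gap[1] >= needed_sites]


if __name__ == "__main__":
    row = [("NAND2", 3), ("FILL2", 2), ("FILL4", 4), ("DFF", 8),
           ("FILL1", 1), ("INV", 2), ("FILL8", 8)]
    print(filler_gaps(row))
    print(gaps_that_fit(row, needed_sites=5))
```

Even in this simplified model, the search is a single linear pass over each row, which gives some feel for why finding room for a small Trojan need not take long.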
The second significant point is that the researchers needed a bit more than an hour to insert an SCT IP core into a target IC layout, supposedly knowing nothing about the original layout. A rogue element could potentially insert a piece of spyware into an IC design during a colleague’s lunch break using these techniques.
I think it’s critically important to step back from this technology discussion to consider the essential element needed to make this scheme work: the rogue element. The real weak point this research identifies is a human being. Humans can be bribed or coerced to perform all sorts of espionage. There’s no use in denying or ignoring this fact. It’s been true for thousands of years. Phishers on the Web routinely exploit people’s foibles to trick them, bribe them, or otherwise coerce them into giving away secrets. This sort of an attack even has a name: social engineering.
No doubt, IC foundries take major precautions to guard their customers’ designs. Everyone in the IC business is aware that espionage occurs. It has routinely occurred since the IC was invented. Espionage is a cat-and-mouse game. For example, FPGA configurations were not encrypted at first, and they were easily reverse engineered or stolen. FPGA vendors then added encryption and authentication to their bit streams. Then side-channel attacks appeared on the scene and FPGA vendors added circuits to deflect those attacks. The espionage game of spy versus spy continues to this day.
ASIC manufacturing is no different. As long as valuable IP is locked up in silicon, there will be people who will try to steal it. In the past, they’ve stolen chip secrets by delidding ICs (heck, I’ve done this myself), stripping them layer by layer with various etching fluids, X-raying them, deconstructing them layer by layer with focused ion-beam micromachining, and using differential power analysis. There’s really no end to the ingenuity of determined thieves and other malicious actors. These people and their organizations are not to be underestimated.
Meanwhile, the big message I’ve gleaned from this research is that you must have a conversation with your foundry, and whomever else you’re trusting with your IP, to fully understand how your crown jewels are being guarded. Worst case, they aren’t.
And finally, please be aware that it’s just as easy, or even easier, to insert a hardware Trojan into a design while the chip is being designed. That’s right. People working for your company, doing the design work, possibly around the globe, are equally susceptible to bribery, coercion, and social engineering. This sort of thing can even happen accidentally, when someone incorporates a piece of third-party IP that’s not been properly vetted. It’s probably very painful to consider this possibility, but think about it anyway. What policies does your company have in place to safeguard against these sorts of exploits?
- “The threat of Hardware Trojan Horses is bigger than we have thought, new academic research claims,” Tallinn University of Technology, December 20, 2021.
- T. Perez and S. Pagliarini, “Hardware Trojan Insertion in Finalized Layouts: a Silicon Demonstration,” arXiv:2112.02972 [cs], Dec. 2021. https://arxiv.org/abs/2112.02972