The Cat-and-Mouse World of Logic Camouflage

We all know that security (or at least talking about it) is all the rage, although most of that attention relates to software. But hardware too? Yes, hardware too. However, today, we’re not going to talk about hardware that’s providing security to something else; we want to talk about protecting the hardware itself as intellectual property (IP).

Reverse engineering by companies like Chipworks has been both boon and bane to the semiconductor world. Some companies are notoriously stingy with information about how they did this chip or that; reverse engineering can fill in some of the holes. Information like that, judiciously used, can help the industry to move forward (with the understanding that, just because you now know how they did something, doesn’t mean you can go copy it – there’s that little thing of patents…).

Then again, the flipside of that is that, particularly when other companies – or even nations – aren’t concerned about being sued for patent infringement, they can simply figure out what someone else did and copy it. They might feel invulnerable because they’re big, or maybe they’re selling somewhere where there is no patent, or perhaps they’re a government (or are sponsored by one).

So, this year, we look at logic camouflage, brought to you by a session from last fall’s ICCAD conference. It’s about ways to hide and reveal how a particular design was implemented. It’s based partly on the notion that you can hide true circuits amongst chaff circuits, also referred to as dummy or decoy circuits. Those latter ones don’t do anything, but, in theory, if you do it right, you won’t be able to tell which circuits are functional and which are fake.

The problem is, improvements in so-called satisfiability solvers, or SAT salvers, have made it increasingly easy to go in and figure out the details of a circuit based on how it works rather than how it looks. And so the Tom and Jerry games continue.

There’s a huge constraint, however, on solutions to this issue: cost. Yes, if you can hide a tricky circuit amongst 500 dummy circuits and then somehow thwart the SAT solvers, you may protect your IP. But that’s an enormous amount of expensive silicon being dedicated not to doing something useful that has monetary value to a customer, but to cloaking the true circuit – something that has value mostly to the chip producer.

So, doing large-scale camouflage is expensive, creating the typical activity that practically defines an engineer (as opposed to a scientist): making tradeoffs. Our four stories today are about hiding and revealing circuits in ways that might at first focus on technical viability, but must eventually demonstrate economic viability as well. Details are available in the ICCAD proceedings.

Sequential Sleuthing

Our first story comes from a team from New York University and the University of Waterloo. It focuses directly on this issue of whether SAT solvers can reveal everything about a circuit.

Strictly speaking, SAT solvers illuminate combinatorial circuits. They can, in theory, be used for sequential circuits as well, but that has traditionally required access to the internal states. A major constraint for this reverse-engineering scheme is that you have access only to the so-called primary inputs and outputs: the actual pins on the device. You don’t get to prove the internal state. And that has been a barrier to decoding sequential circuits with SAT solvers.

The classic solution to the sequential problem has been the scan chain. The JTAG (or any other) scan chain allows you to preload a state and then… postunload?… the resulting state after the clock ticks. It breaks the sequential problem into a series of combinatorial problems. The usual defense against scan-chain snooping is to make the scan chain unavailable unless you murmur the appropriate incantations. Now how are you going to do your reverse engineering?

Without the scan chain, you have no access to the internal state, since that state typically won’t be available at a primary output. So you must infer the state based on the primary outputs you do have. For this, they used a bounded model checker. Armed with this, they were able to reveal a 5000-gate portion of a Viper processor netlist. Some other blocks achieved near, if not total, success – such as a 32-camouflaged-gate block, 30 gates of which they managed to uncloak.

Magnesium Magic

The next paper came from a team out of New York University. They started by positing that camouflage really works only if more than 50% of the logic is cloaked. That’s an expensive proposition. They took a completely different tack, camouflaging vias instead of gates.

The idea here is, in my humble opinion, pretty dang clever. Every valid via gets magnesium in the stack. Fake vias are also created, but, instead of pure Mg, they get magnesium’s oxide – MgO. The point is that the good vias conduct, since Mg is a metal, and the fake vias don’t conduct, since MgO is an insulator. So the fake vias in no way affect the function of the circuit.

Here’s where the fun part starts: what happens if you de-layer the circuit to see which vias use metal and which use oxide? Well, it turns out that magnesium oxidizes readily in air. So within a few seconds of exposing the true via to oxygen, its Mg turns to MgO, and now it looks like a fake via.

It hasn’t escaped my attention that using a pure nitrogen environment or even a vacuum might thwart this little trick. But my guess is that the equipment used for de-layering hasn’t been designed for use in a rarified environment. In all likelihood, they’re built to be used in the ambient.

Remember also that you need to preserve the pure Mg until you’ve completed the inspection. So it’s not enough simply to de-layer away from oxygen; the inspection must also avoid oxygen, and there can be no oxygen when moving from de-layering to viewing. So it’s theoretically possible to bypass this protection, but it would take a lot of effort and expense. (Of course… depending on the value of the chip, someone might think it worth that expense, since it would be a one-time thing that could then be used on other lower-value chips.)

The other helpful bit here is that both Mg and MgO readily dissolve in the presence of liquids, like the etchants used for de-layering. Perhaps not insurmountable, but also a hassle to circumvent.

Fake Feedback

Our next story comes from Northwestern University. It challenges yet another camouflage attempt: that of adding strategic combinatorial cycles to an otherwise acyclic combinatorial circuit. This is also referred to as cyclic logic encryption.

It seems like encryption because there will be a particular internal state that disables the cycle; that state is effectively a key. For instance, a cycle can be fed back through a two-input AND gate where the other input is one of the bits of the key. If that signal is low, then the cycle cannot pass through the AND gate and is blocked.

As they note, cycles can contribute to combinatorial logic, but they can also result in oscillation (if feedback is negative) or in storage of state (if positive). SAT-based approaches can deal with combinatorial cycles; they can’t handle oscillation; and, because more than one value might satisfy a stateful cycle, they may not finish in that case.

So the team came up with two different algorithms to help deal with cycles. The first is called Structural CycSAT, and it works by identifying the key value that eliminates the cycles. Once that key is known, it can be delivered back to the SAT solver as a constraint, and the SAT solver can then finish its job.

But eliminating that cycle may not be enough: the unlocked circuit may still have more cycles in it. So they have a different version of the algorithm, called the Logic CycSAT algorithm. It takes on the extra work of removing additional cycles. Because structural cycles are more common, and, because the two algorithms are similar, they implemented only the Structural CycSAT algorithm. The second algorithm may be related in spirit to the first, but it’s more complex.

Think Threshold

Our last report comes from the University of Massachusetts. It builds on an idea that uses threshold voltage choices as a way of obfuscating a key. Exactly how this works wasn’t quite clear to me from the report itself, so I had to check a few earlier papers, which helped only partially. But I think I have the gist.

The basic idea is that you can implement a circuit using high- or low-threshold transistors strategically. If you take a NAND gate designed to work with nominal-threshold transistors and modify the threshold of one or both transistors, you may end up with a “gate” that’s always high or always low at the output. This can be used as an always-on or always-off switch.

The current paper talks of using arbitrary threshold voltages, while acknowledging that, often, a real-world designer will have a choice of pre-determined thresholds only. So what does using arbitrary values offer?

The earlier papers talk about having most transistors with nominal threshold, while certain select ones use a high threshold and others a low threshold. You can, in fact, place several logic functions in parallel, using these switches with various combinations of thresholds, such that only one of the functions will be selected. You just can’t tell which one from inspection, or from any means other than figuring out the thresholds of the different transistors.

What might confuse one’s thinking is perhaps believing that this is a problem of matching the live signal voltage to the appropriate threshold. In other words, if a transistor has a high threshold, then it must be driven by a higher voltage to get it to work properly. This, of course, would create the problem of having to create multiple voltages on-chip to feed the correct signals. That would be expensive, and it would also be straightforward to tell which lines were driving which transistors.

So that can’t be what’s up. Instead, this seems to be a simpler idea – as demonstrated in the earlier papers – that you have three thresholds: nominal, high, and low. And that’s it. So, either the driving signal meets threshold or not, and the circuit either works or doesn’t.

If that’s the case, then what’s this thing about using arbitrary voltages? That gets to what’s different about the current work. It is possible to measure the threshold voltage of a transistor invasively. But one can do so with only so much precision. So the arbitrary thing comes to this: what delta should be added to or subtracted from the nominal threshold to move the threshold up or down? The closer the high and low values are to nominal, the harder it will be to read out the transistors accurately.

That said, if the high and low values are placed close to nominal, then there is less certainty that the circuit will behave as expected, due to process and other variations. So the smaller the delta, the more secure – but the less reliable – the solution.

What the UMass work does is to add an error-correcting circuit that can address the reliability concern. That allows smaller deltas without loss of reliability.

The whole arbitrary thing comes from the need to optimize the threshold delta and error-correction parameters. If you have perfect fab flexibility when it comes to threshold, you have an infinite range for adjusting threshold. Any particular value comes with a reliability level and, therefore, the level of error correction needed. Error correction consumes silicon area, so it’s not free. Hence the need for these tradeoffs.

This particular approach invokes Kerchhoff’s Principle (which is not to be confused with the completely unrelated Kirchhoff’s Laws). That principle says that security is best if it doesn’t come through obscurity, but rather from the strength of, say, encryption itself.

So, in this particular case, if the protected cells store an important key, then the attacker can know everything about the approach – process node, exactly how the error correction works, and everything else – except specifically which transistors have which thresholds. An attack relies solely on the ability to determine those thresholds.

Which is why, if you can afford more error correction, then using small threshold deltas will make the thresholds harder to reverse engineer.