Formal: It’s Still Hard

Formal verification has had a rough go over the years. It came onto the scene with tons of promise, and that promise remains, to some extent, promise. It’s hard to argue against the possibility of taking a spec and a circuit, pushing a button, and watching while some tool provides proof that the circuit is correct.

And watching… and watching… and, well, yeah, that’s where the problems cropped up. Waiting for an answer to some of these problems, if not done carefully, can require suspended animation so that you can be reawakened in some future year or century when the calculation is complete.

Which meant paring down the size of circuits that can be handled, and knowing how to do that properly was hard. So the various companies that make use of formal technology – names like Jasper and Real Intent and OneSpin and even Mentor – had to go back and simplify how the tools are used.

So what you see now are lots of specific applications built over formal technology. Whereas the technology used to be the focus, now the specific problem/solution becomes the focus. The directing of the formal technology to solve that specific problem has already been done, so many of these apps are push-button, to a large extent.

Problem solved, then… right? Well, yes – for those specific problems addressed by those specific applications. So does that mean that the initial promise of formal verification has now been reduced only to the things that can be made push-button? Not necessarily. But the initial challenges remain, making this not for the faint of heart.

All of the formal applications are for verifying specific things as a complement to simulation and other verification techniques. But some folks are trying to leverage formal technology in a manner that makes simulation unnecessary – at least at the block level.

I learned about this in a conversation with Jin Zhang of Oski Technology*. They were introduced to me as a company specializing in formal verification consulting. That jogged me out of my “formal is easy these days” bliss, because if you need consulting for a specific technology like formal, that means it’s hard. And she confirmed this. And the things that are hard are the things that have always been hard.

She describes three levels of formal tool as layers on a pyramid, where the full pyramid represents all the things that could theoretically be done using formal technology. Along the bottom is a stripe for fully-automated applications like clock domain crossings. You can think of these as super-lint or lint-on-steroids tools: they check for very specific problems with your RTL code, and mostly they run with a simple push of the button.

Next is a layer of semi-automatic tools – ones that require some input. An example of this is equivalence checking to ensure that clock gating has been done correctly. And above this layer? Well, here there be dragons. The rest is the wild, wooly space of things that are hard. And that’s where Oski plays. They address the classic issue: verification that would take far too long without hard, specialized work – work that most companies don’t have the expertise to do.

Here’s the deal: full-on equivalence checking means comparing an implementation (your circuit) to a known-good model (from a spec). But doing an exhaustive check means exploring a design space that explodes exponentially as the circuit grows. It rapidly becomes untenable.

Formal vendors have worked hard at taking large circuit blocks and pruning away things that don’t matter, focusing on the cone of influence to narrow the search space. But even so, a subcircuit with only 30 flip-flops will have over a billion states. (OK, slightly over a billion: 1,073,741,824). Cones of influence are extremely likely to have far more flops that this, and so you’re still stuck with an unwieldy solution.

These problems can be made tractable, but that takes some specialized work. Each project that Oski takes on has two major aspects: generating the reference model and abstracting parts of the circuit.

The reference model seems like an obvious thing, but it’s not. The whole point of formal equivalence checking is to prove that a circuit is identical to a known-good model, so you first need that known-good model – and they don’t grow on trees. Typically, this model must be generated from a paper spec. And with the complexity of some of these blocks, this is not trivial. For instance, Oski modeled ARM’s new cache coherency protocol – a project that took about six months.

Of course, this assumes that a paper spec exists. Some companies have a “just do it” philosophy exhorting folks not to write a spec; just write code. So then it becomes an even harder task to assemble a reference model that isn’t itself flawed. The good news is that this process can often uncover problems and ambiguities in the spec.

Once a golden reference is in place, then the circuit can be tested against it. But in order to get a solution in our lifetime, they have to work on the circuit to add abstractions. You might think that this means simplifying various parts of the logic, but, if anything, it means making them larger or more general.

Jin illustrated the counterintuitive nature of this with an example of a resettable counter. Such a counter can take on any value – by resetting and then counting up to the desired state. The effort it takes to acquire any arbitrary state can be reduced by replacing the specific counter with a preloadable version – one that does everything the original counter does, but also allows any arbitrary state to be preloaded directly.

It’s extremely important that the abstracted circuit be a strict superset of the original circuit. You have to add the preload feature, for example, to the specific counter circuit being tested; you can’t replace the whole thing with some other counter. Otherwise, if there were a bug in the original counter, you’d miss it by replacing it with the other counter. So you have to augment the existing circuits, bugs and all, to ensure that you’ll be able to find those bugs.

Of course, this means proving that the abstracted circuit is a strict superset, and that becomes part of the methodology. These abstractions take problems that would take years to converge (or might even have trouble converging) and makes them doable in seconds to hours. Of course, there’s a balancing act – the added logic can result in false negatives, so you want to abstract, but not too much.

Ultimately, given a good reference model and a well-abstracted circuit, you can prove in a reasonable timeframe that a circuit is good (or not) with 100% coverage. And this is what they call an end-to-end proof – once it’s done, you need no simulation of the block. The fact that they can attain 100% coverage – including bugs that would likely not have been found through simulation – makes this what they call a sign-off-quality process.

Not that simulation is going anywhere: the full-up SoC, created by assembling these already-verified blocks into a whole, still requires simulation and other techniques to pass muster. The scale of such a circuit is still beyond what formal can reasonably accomplish by itself; “end-to-end” refers to the ends of the block, not of the IC.

So why can’t any company simply do this themselves? In theory, they can. But, according to Jin, doing this effectively requires people that live and breathe formal; you need dedicated engineers. An engineer who jumps back and forth between formal and simulation paradigms is likely to have a much harder time. So larger companies may form dedicated formal analysis teams; smaller companies, which rely on much more hat-swapping and employee versatility, are less likely to be able to afford that luxury.

Or so Oski believes. They’re still proving their value to companies, with the most effective demonstrations happening when they start finding surprising bugs. That gets people’s attention.

You might also wonder which formal tools they prefer, and the official answer is “none; we’ll use whatever our customer wants us to use.” Oski was founded by the founder of Jasper, so there’s probably a bit of proud papa there, but they say that all of the tools out there have improved their run times and capacity so that they’re all viable. Yes, there are subtle differences in timing, but the big win comes not from using one tool that runs slightly faster, but from the abstractions that slash orders of magnitude off the run time. Everything else becomes less important.

All in all, the lesson from this is that, yes, formal is still hard. Applications have made some very specific uses of formal technology more accessible to the everyday engineer, but fully exploiting the promise of formal verification still takes a cadre of experts to transform a ridiculously tough problem into something doable.

* For you Bear fans, yes… it’s that Oski… with a gratuitous shout out to my son, working at Camp Oski – as soon as the smoke clears enough for folks to go back

More info:

Oski Technology