Clocks, Xs, and Resets

Does it ever feel to you like, no matter how many new tool features appear, it’s never quite enough? In some cases, you solve one problem – but it doesn’t stay solved. Complexity, scaling, performance and power requirements – they may all catch up to the solution, necessitating further tool refinement. In other cases, new problems arise that were never dealt with before.

We’ve got some examples of each, distilled from a conversation with Real Intent at DAC. Let’s start with problems that have already been solved.

Danger! Clock Crossing Ahead!

I remember, in my days back at Monolithic Memories and AMD, doing a seminar tour where our technical topic (because no one likes a day-long sales pitch) had to do with metastability. And this was a new topic to many. During this era, there were even hopeful architectural proposals that aimed to eliminate metastability risk entirely. (Spoiler: you can’t; you just push the problem from one parameter – like a setup time – to another – like some pulse width.)

As an industry, we gradually came to grips both with the notion of metastability and the need to design circuits that are robust in the face of either events whose timing is truly random or events originating from different clocks that are mutually asynchronous – that is, they aren’t derived from some common ur-clock way back upstream. (Or they have a common source, but funky clock gating makes for funky timing.)

Back in the day, this mostly related to signals coming onto and off of what would seem today to be ridiculously simple circuits. As integration grew, however, it became feasible to host more than one clock domain on a circuit. FPGAs wrestled with this in the early days; now it affects pretty much any large ASIC or SoC.

And so we got tools to test the robustness of the so-called clock-domain crossings (CDC), where a signal moves from synchrony with one clock into a new zone where it needs to be resynchronized with another clock – without causing any metastability or other glitch. Real Intent has had such a tool (Meridian CDC), based on formal analysis techniques, for a long time.

And yet, it’s still not good enough. Now, not only do we have potentially large numbers of clock domains – we also have the notion of modes. At a high level, modes are obvious: I’m a phone and now I’m going to do GPS work; in a minute I’m going to stream a video. But deep inside, this gets implemented by multiplexing different clocks into registers. Now what do we do for analysis?

Real Intent describes a couple of typical approaches:

Just worry about the worst-case modes. The risk here is obvious: you may miss a mode that matters.
Separate out all the modes and run them one at a time. This is doable, but laborious – both in setup and in having to pore through all the results. It also obscures possible links between modes, with a fix to one mode screwing up another one. Call it Whack-A-Mode.

And it gets even harder than that. Depending on how clocks are muxed, there may be a mode where one clock drives both source and destination registers – in other words, not a domain crossing. But in another mode, those two registers might have different clocks – now it is a domain crossing. And that might change dynamically.

In other cases, tracing out the logic at the clock input to a set of registers might reveal that, despite the fact that you can choose from multiple clocks, that choice becomes exclusive to the entire set of registers. Choosing clock A in one place means clock A everywhere in that register set, and similarly with every other choice. So there’s never a domain crossing here, despite the proliferation of clocks.

And, of course, you can combine them all. You could have one mode that keeps things synchronous by selection; another that makes for asynchronous crossings; and another that guarantees synchronous behavior using logic that guarantees exclusivity. And, again, they could change dynamically.
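The three situations above can be illustrated with a toy model. This is only a sketch under assumed names: the `classify_pair` function, the mux representation, and the signal names (`mode_sel`, `clk_a`, and so on) are hypothetical and say nothing about how any real CDC tool represents a design. Each register's clock comes from a mux, and the question is which source/destination clock combinations are actually reachable.

```python
from itertools import product

def classify_pair(src_mux, dst_mux):
    """Return the reachable (src_clock, dst_clock) pairs that are real crossings.

    Each mux is a tuple (select_name, [clock for select value 0, 1, ...]).
    Hypothetical model: two muxes with the same select name always switch together.
    """
    src_sel, src_clocks = src_mux
    dst_sel, dst_clocks = dst_mux
    if src_sel == dst_sel:
        # Shared select: only matching select values can occur together.
        reachable = zip(src_clocks, dst_clocks)
    else:
        # Independent selects: any combination of clocks might occur.
        reachable = product(src_clocks, dst_clocks)
    # A pair is a domain crossing only when the two clocks differ.
    return {(s, d) for s, d in reachable if s != d}

# Exclusive selection: choosing clk_a in one place means clk_a everywhere.
shared = classify_pair(("mode_sel", ["clk_a", "clk_b"]),
                       ("mode_sel", ["clk_a", "clk_b"]))

# Independent selection: some modes are synchronous, others are crossings.
independent = classify_pair(("sel_x", ["clk_a", "clk_b"]),
                            ("sel_y", ["clk_a", "clk_b"]))

print(shared)       # no crossings despite two clocks
print(independent)  # crossings only in the modes where the clocks differ
```

In the shared-select case the set is empty, so there is nothing to analyze. In the independent case, only the two mixed combinations need attention, and the two same-clock modes drop out.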

Simplistically speaking, you could dream up an uber-tool that could simply crush all of this together into a solution. But, without direction, it would start to resemble a cross-product of clocks, generating an uber-ton of work and a further uber-ton of results that you have to examine, much of which will be meaningless or repetitive.

So Real Intent has done this thing they call Static Intent Verification (SIV). It infers the clocking intent at potential domain crossings, eliminating pseudo-modes that couldn’t exist and pruning the list of things to check to only those that involve actual domain crossings. If there’s a mode where a particular register pair operates synchronously, then no analysis will be wasted on that situation.

The result is their newly announced Verix CDC product, which they claim to be the first true complete multimode solution. Not just the worst modes, not modes one-by-one, but all modes all at once. The only reason this is tractable is the SIV thing. They say it collapses the analysis time for a single mode from a week to a day. Going multimode adds about 20% to the analysis time. They also compact the results they report so you’re not spending time on redundant or phantom issues.

While what we’ve been discussing happens at the RTL level, it’s also possible for synthesis to create new crossings – and you have to analyze that at the gate level. Real Intent calls this “physical CDC,” and they plan to have it available by the end of the year.

X Static

Other familiar ground is the case of so-called X-propagation – something that happens in the verification world, but not in the real world. In the real world, every digital signal has a solid state – 1 or 0. But, when simulating, we don’t always know which until some logic path has determined the state. So those unknown states are “X.”

If you really want to be thorough, then you want to be sure that, no matter the actual physical value that this X eventually takes on in real silicon, the circuit will work. And that verification can be a lot of work. And, like so many other verification issues, you may end up working on things that don’t matter. If an X propagates for a few cycles and then peters out because some other path dominated, then does it matter? Very zen…
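To make the "peters out" case concrete, here is a minimal sketch of three-valued logic as a simulator might model it. The function names (`and3`, `or3`) and the use of the string "X" for the unknown value are assumptions for illustration, not any particular simulator's implementation.

```python
X = "X"  # hypothetical marker for an unknown simulation value

def and3(a, b):
    """Three-valued AND: a known 0 on either input dominates an X."""
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return X

def or3(a, b):
    """Three-valued OR: a known 1 on either input dominates an X."""
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return X

# The X is masked: the other input decides the result on its own.
masked_and = and3(X, 0)
masked_or = or3(X, 1)

# The X propagates: the result really depends on the unknown input.
live_and = and3(X, 1)
live_or = or3(X, 0)

print(masked_and, masked_or, live_and, live_or)
```

The masked cases are the ones where another path dominates and the X dies out. The live cases are the ones where the eventual silicon value of that X could change the outcome.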

Real Intent says that many designs are overly pessimistic about their Xs (not to be confused with being overly pessimistic about one’s exes…). This makes for very long simulation times – but the real problem is that, with each run, you have to examine results and prune away meaningless stuff and then rerun the verification. Over and over – it can take months.

In an upcoming tool (in early access now, but due out before long), they claim to do a better job of isolating those X-propagation scenarios that really matter. Unintuitively, this actually adds 10-20% to the run time. But the payoff is that you have far fewer runs to do, reducing full-chip analysis from around two months to one month.

And Now We Have R Crossings

Now that we’ve looked at the refinements needed for two existing verification applications, we come to one that we haven’t worried about so much before: reset. Yeah, reset. That annoying signal that you always forgot to connect to something on that 7400-series chip from the TI databook. Cuz you just want to worry about the signal, not that extra stuff. (With apologies to anyone whom I just lost with the 7400 reference…)

And we’ve been grumblingly dutiful about making sure to connect our resets ever since. But we couldn’t leave it at that, could we? Of course not. Now we’re faced with multiple resets – giving rise to the concept of a reset domain. And a single clock domain may have multiple reset domains.

Checking resets for risk of metastability is harder than checking clocks. For CDC, you compare one clock to another. With reset, you have to compare different resets – and then you also have to check each one relative to any clocks. Because, yes, if you mess up the reset and clock timing, you can go metastable.
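A rough way to see why the reset problem is bigger is simply to count the comparisons. The sketch below is only an illustration: the `reset_checks` function and the signal names are hypothetical, and a real tool would prune this list by structure rather than enumerate it blindly.

```python
from itertools import combinations

def reset_checks(resets, clocks):
    """List the comparisons described above (hypothetical, unpruned model)."""
    # Every reset compared against every other reset.
    reset_vs_reset = list(combinations(resets, 2))
    # Every reset also checked relative to every clock.
    reset_vs_clock = [(r, c) for r in resets for c in clocks]
    return reset_vs_reset, reset_vs_clock

resets = ["rst_a", "rst_b", "rst_c"]
clocks = ["clk_1", "clk_2"]

rr, rc = reset_checks(resets, clocks)
clock_pairs = list(combinations(clocks, 2))

print(len(clock_pairs))     # clock-to-clock comparisons for CDC
print(len(rr) + len(rc))    # reset-to-reset plus reset-to-clock comparisons
```

With just three resets and two clocks, this naive count gives nine reset-related comparisons against a single clock-to-clock pair.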

Real Intent has a set of risk mitigation strategies, which provide reset circuits that are aware of safe and unsafe paths. Using these, you should end up with robust circuits that won’t create any in-field surprises.

In case you thought things couldn’t get any squirrelier, they can: superimpose this reset situation on a multimode circuit. And… that’s a bridge too far at the moment. For now, their reset-domain crossing (RDC) solution is single-mode.

This solution was actually announced last November, so it’s on the shelves today.

More info:

Real Intent