A Happier Muse

Many of the arts and skills developed throughout the ages have credited much of their inspiration to muses and patron saints and gods and spirits. These other-worldly beings provided both inspiration and guidance as the artisans built up a vast cultural legacy going back millennia.

So… if there’s a muse for engineering, who would it be? The thing about engineering is that free flights of fancy are often not permitted. We’re constrained by the possible, in stark contrast to experimental artists and even pure scientists. In fact, it’s worse than that: we have to make money. Usually for someone else. Which puts a further damper on things. Stuff has to work under a wide range of conditions, some predictable, some not.

In this regard, if I had to name a muse or patron saint for engineering, perhaps it should be Eeyore: “Assume the worst.”

“Yeah, my circuit goes really fast if the voltage is high enough. But it probably won’t be.”

“I have a really cool idea, but it will work only if things stay cool enough. Which I probably can’t count on…”

“I think I’ve designed this to where power will be stable, but I can’t be absolutely sure… so yeah, I better assume it won’t be. Damn… there goes my performance.”

Obviously, designing for worst case has been instilled deep into our DNA. It has to be. It’s what makes the difference between something that can work once in a lab and something that works reliably every day in our homes or cars or pockets or factories.

And what that used to mean is simple: take all the worst-case conditions and make sure your circuit can work within spec. If not, you have three choices: fix the circuit, fix the spec, or lie about it and hope that no one notices until you’ve made your millions, sold the pieces of the company off so that there are no deep pockets to go after, and moved your own personal deep pockets safely onto some untouchable Caribbean island.

But these days, design windows are huge. Process variation is killing us. Statistics are the archenemy. If we were simply to design everything to absolute worst-case, nothing would work. I mean, yeah, maybe you could get something out, but it’s like trying to target content at a website. If you have no idea specifically who will be accessing your content, then you have to make sure that it’s fit for five-year-olds, which eliminates a lot of legitimate content. If you know more about the details of who is going to be there, then you can target more specifically, and you don’t have to assume worst-case anymore. (With apologies to any five-year-olds reading this who are offended by being characterized as worst-case.)

Which is why EDA tools have gotten much better about refining the questions we’ve been asking. Instead of, “What’s the worst that can theoretically happen,” we’re moving to, “What’s the worst that actually could happen in real life?” The challenge is that this second question, while more precise and practical, requires much more information to answer.

Specifically with respect to power, it used to be a simple calculation of min/max static V_CC: do your simulations, and you’re done. But with dynamic power and with far more aggressive power grids, it’s not that simple anymore. Ground bounce has long been a known phenomenon, although largely related to simultaneously-switching outputs. But these days, ground bounce and V_CC droop matter. They matter at every node.

So we’ve had power integrity analysis to check out how the power grid responds during circuit operation. That can accomplish two things: it lets you fix weak parts of the power infrastructure and it gives you some information about how much droop or bounce you might have to design for.

In other words, you can establish a worst case. And then assume worst case for the entire design.

But the thing is, worst case won’t happen everywhere. There may be specific pinch-points in your design that are as delicate as five-year-old sensibilities, but they’re not everywhere. It would be much better if you could design each node for exactly the conditions it could truly see rather than treating every node as if it targeted a five-year-old.
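To make the difference concrete, here is a minimal sketch of checking timing against a single global worst-case droop versus the droop each node actually sees. Everything in it is an assumption made up for illustration: the node names, the delays, the droop values, the linear delay-versus-droop model, and the timing budget. It is not how any particular tool does the analysis.

```python
# Hypothetical per-node data: (nominal delay in ps, local V_CC droop in mV).
NODES = {
    "alu_out": (900.0, 20.0),
    "cache_tag": (900.0, 80.0),
    "io_pad": (700.0, 10.0),
}

DELAY_PER_MV = 1.5  # assumed linear sensitivity: extra ps of delay per mV of droop
CLOCK_PS = 1000.0   # assumed timing budget for every node


def delay_with_droop(nominal_ps, droop_mv):
    """Delay of a node once the (assumed linear) droop penalty is added."""
    return nominal_ps + DELAY_PER_MV * droop_mv


def failing_nodes(droop_for):
    """Names of nodes that miss the budget, given a function name -> droop in mV."""
    return sorted(
        name
        for name, (nominal_ps, _) in NODES.items()
        if delay_with_droop(nominal_ps, droop_for(name)) > CLOCK_PS
    )


# Eeyore's way: apply the single worst droop seen anywhere to every node.
worst_droop = max(droop for _, droop in NODES.values())
global_fail = failing_nodes(lambda name: worst_droop)

# The happier way: each node is checked against its own droop.
local_fail = failing_nodes(lambda name: NODES[name][1])

print("global worst case flags:", global_fail)
print("per-node analysis flags:", local_fail)
```

With these made-up numbers, the global assumption flags two nodes for rework while the per-node check flags only the one genuine pinch-point.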

But that takes some very laborious calculations to do. Plus it takes multi-tool interaction so that the results of the power analysis can be worked into the results of the timing (or whatever) analysis.

Which is specifically part of what Cadence is trying to enable with its newly announced Voltus tool. (And I think they’ve addressed the five-year-old market simply by naming this tool as if it were a superhero toy. Perhaps when launching the tool, you type on the command line, “C:/Voltus Rage: ENGAGE!!”)

Now, this is one of several recent Cadence announcements featuring a 10X improvement. Part of me says, “Wait, how can so many different development projects on different tools all end up exactly at 10X better?” And then the marketing part of me re-awakens and realizes that, (a) you’ve got to be 10X better to move most needles and get someone to switch to you, and (b) round numbers are really convenient. So here we are again with a 10X performance improvement, this time on power integrity analysis.

How do they do that? Well, their basic engine has improved by 2-3X, but most of the leverage comes from enabling parallel computation. Which isn’t easy because you can’t parallelize this using data partitioning (where each thread or process does the same thing to its own subset of data); it’s more of a functional partitioning thing, they say. And it’s a proprietary approach that they’re not keen to talk about. But, critically, it does allow for machines with moderate amounts of memory to work on the problem together without a horrible amount of data rumbling all over the network between machines.
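For anyone fuzzy on the distinction, here is a toy sketch of the two partitioning styles. It says nothing about what Cadence actually does, since that approach is proprietary; the current values and the `total` and `peak` functions are invented purely to show the shape of each style.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up per-cell currents; not real design data.
currents = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]


def total(values):
    """One kind of work: sum the currents."""
    return sum(values)


def peak(values):
    """A different kind of work: find the largest current."""
    return max(values)


def data_partitioned(values, workers=2):
    """Data partitioning: every worker runs the SAME function on its own chunk."""
    size = (len(values) + workers - 1) // workers
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(total, chunks))


def functionally_partitioned(values):
    """Functional partitioning: each worker runs a DIFFERENT function on the data."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        total_future = pool.submit(total, values)
        peak_future = pool.submit(peak, values)
        return total_future.result(), peak_future.result()


print(data_partitioned(currents))
print(functionally_partitioned(currents))
```

The first style only works when the data can be carved into independent pieces, which is exactly what a heavily interconnected power grid resists.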

So faster is good. But they also enable this to integrate with Tempus for digital timing analysis, Virtuoso for custom circuits, and Sigrity for package and PCB design. The point here is to get design closure more quickly, and if you have to assume absolute worst case for everything, then it takes much longer to get closure – if you even can.

They approach this by allowing early grid evaluation in the floor-planning stage so that the grid and circuit can be co-designed. Power optimization is physically aware, again with a focus on what is really likely to happen at a particular node rather than some theoretical bounding assumption. And they claim SPICE-level accuracy, with the Voltus/Tempus combination forming a “unified” signoff package. So they’re presumably not taking computational shortcuts to do this.

Of course, finding out where circuit weaknesses might truly be is only half the battle. It won’t be right until you fix the problem. And, at present, that means taking a node that’s operating too slowly due to a drooping V_CC, for instance, and making it faster.

But what if that node isn’t really the issue? What if the real problem is that, for example, too many signals are switching and causing V_CC to droop, de-powering the node in question? Should you then over-design that node to handle the droop or fix the real problem – too many switching signals?

This latter approach isn’t explicitly enabled so far: Tempus and Voltus don’t yet work together to establish root cause, although the Cadence folks are discussing it. We’ve seen other approaches to this, moving edges around to solve exactly this kind of problem, but it’s not something currently integrated with Voltus (or even made by Cadence). Presumably there’s more to come here.
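As a back-of-the-envelope illustration of why moving edges around can attack the root cause, here is a deliberately crude sketch. It assumes droop in a time slot scales linearly with the number of signals switching in that slot; the per-signal droop figure and the switching schedules are invented, and a real grid is nowhere near this simple.

```python
from collections import Counter

# Assumed droop contribution (mV) per signal switching in the same time slot.
DROOP_PER_SWITCH_MV = 15.0


def peak_droop_mv(switch_slots):
    """Peak droop, assuming droop in a slot scales with the signals switching in it."""
    if not switch_slots:
        return 0.0
    busiest = max(Counter(switch_slots).values())
    return DROOP_PER_SWITCH_MV * busiest


# Hypothetical schedules: the time slot in which each of six signals switches.
aligned = [0, 0, 0, 0, 1, 2]    # four edges pile up in slot 0
staggered = [0, 0, 1, 1, 2, 2]  # same six edges, spread out

print("aligned peak droop (mV):", peak_droop_mv(aligned))
print("staggered peak droop (mV):", peak_droop_mv(staggered))
```

In this toy model, spreading the same six edges across the slots halves the peak droop, and the victim node never needs to be over-designed at all.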

For now, at the very least, perhaps we can stop being so dang pessimistic and start looking for a new muse. One that’s more willing to say, “OK, this actually might work!”

More info:

Cadence Voltus