feature article

Spicing Up Simulation

Infinisim and Gemini Enter the Analog Fray

In yet another example of the ascendance of analog considerations, simulation of analog behavior – whether in outright analog circuits or in the secret analog life of digital circuits – has risen to the level of problem that needs solving. SPICE is the mother’s milk of analog simulation, but, in the spirit of actually getting things done in a finite amount of time, SPICE has been divided into the “Fast SPICE” side of things, where you trade off some accuracy for the ability to see results sooner, and full SPICE, which takes longer to run but provides more accurate results.

In case you’re wondering about how long some of these full simulations can take, apparently a PLL simulation for a spread-spectrum circuit can require as much as a month. The need to get fast, accurate results is motivating a couple of different approaches intended to break beyond today’s performance limitations and start approaching Fast-SPICE-like speeds with full-SPICE accuracy. They are two different ways of trying to shake up the stolid, venerable world of Berkeley SPICE and move it forward several steps.

Part of both solutions reflects the need to deal with the growing amount of matrix math required as circuits get bigger. There are two primary components that chew up time during simulation: model evaluation and matrix calculation. During model evaluation, the various models of the components in the circuit are consulted to return key values; matrix calculation provides the ultimate simultaneous solution of all the nodes in the circuit. For small circuits, the model evaluation delay dominates and the matrix calculation is manageable. And for large circuits, it’s the matrix calculation that becomes problematic. Since today’s problem is ever larger circuits, the matrix gets some attention in both cases.

The first company to announce a new solution, Infinisim, addresses this by first splitting the full circuit into blocks; each block gets its own matrix. The theory here is that multiple smaller matrices plus boundary calculations are faster than one honkin’ matrix calculation. The partitioning is done by starting with a grid and then accreting blocks from the grid based on how strongly connected they are. The idea is to minimize connections between the blocks, since these form the boundaries that will eventually have to be resolved.
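To make the partitioning idea concrete, here is a minimal sketch of grouping nodes by connection strength. It is not Infinisim's algorithm, which isn't public in this level of detail; it's a generic greedy merge, and the names `partition`, `edges`, and `max_block_size` are hypothetical. Each node starts as its own block, and the two most strongly coupled blocks are merged repeatedly until no merge fits under the size cap.

```python
def partition(nodes, edges, max_block_size):
    """Greedy merge: repeatedly join the two blocks with the strongest total coupling.

    edges maps (node_a, node_b) -> coupling weight (hypothetical units).
    """
    block_of = {n: i for i, n in enumerate(nodes)}
    blocks = {i: {n} for i, n in enumerate(nodes)}
    while True:
        # Sum the coupling between every pair of distinct blocks.
        coupling = {}
        for (a, b), w in edges.items():
            ia, ib = block_of[a], block_of[b]
            if ia != ib:
                key = (min(ia, ib), max(ia, ib))
                coupling[key] = coupling.get(key, 0.0) + w
        # Only consider merges that respect the size cap.
        candidates = [
            (w, key) for key, w in coupling.items()
            if len(blocks[key[0]]) + len(blocks[key[1]]) <= max_block_size
        ]
        if not candidates:
            break
        _, (ia, ib) = max(candidates)
        for n in blocks[ib]:
            block_of[n] = ia
        blocks[ia] |= blocks.pop(ib)
    return list(blocks.values())


# Two tightly coupled triangles joined by one weak link (made-up weights).
nodes = ["a", "b", "c", "d", "e", "f"]
edges = {
    ("a", "b"): 5.0, ("b", "c"): 4.0, ("a", "c"): 3.0,
    ("d", "e"): 5.0, ("e", "f"): 4.0, ("d", "f"): 3.0,
    ("c", "d"): 0.1,
}
result = partition(nodes, edges, max_block_size=3)
print(result)
```

In this toy case the only connection left crossing a block boundary is the weak `c`-`d` link, which is the kind of boundary you'd want to be stuck resolving later.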

Now any of us who had woodshop in our childhood has had to solve the problem of which saw to use. Doing the rough cuts with a coarse handsaw gets the job done quickly, although it lacks elegance. It’s much more fun to bring out the coping saw and lovingly hone those fine, intricate curves. But what do you do when you have to cut something that has a variety of contours on it? If you use the basic handsaw, you’ll never make it around the tight corners. But if you use the coping saw, you’ll be forever on the long boring stretches. If you can, you really want to be able to swap saws along the way: recruit the rough-toothed saw for the long straight stretches so that you can remove wood as fast as possible there, while pressing a fine-toothed saw into service when entering the filigree zone.

While this may not seem like rocket science, this concept has in fact been recruited for use by Infinisim. Traditionally, when doing SPICE simulations, you pick a time increment, and the circuit is recalculated at each time increment. The question is, which time increment? Let’s say you’re simulating a circuit that culminates in a digital output. While numerous interesting things may be going on inside the circuit, the output will hold onto its stable value until conditions determine that it’s time to change value. It doesn’t really make sense to recalculate the output stage if its inputs haven’t changed.

This forms the basis for Infinisim’s primary secret sauce: their simulator contains a number of different solvers, each of which has different characteristics, along with a controller that decides which solver to use in which situation. First off, one choice available to it is to use no solver at all if the inputs to the block haven’t changed. This allows a rough-cut approach for steady signals – the timing granularity is effectively extended when nothing’s happening so that you don’t end up performing multiple useless calculations. When a recalc is needed, the controller decides which solver to apply. And, assuming iteration is needed at a given point to get convergence, the controller can select the solver on a per-iteration basis, changing as it deems appropriate during the converging process. Bottom line is, when a signal is changing slowly, a coarse timescale is used; when it’s changing quickly, a finer timescale is used.
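The coarse-versus-fine timescale idea can be sketched with a toy example. What follows is not Infinisim's controller; it's a textbook backward-Euler integration of a single RC stage with a crude step-size rule, and every name and threshold in it (`simulate_rc`, `h_min`, `h_max`, `tol`) is an assumption made for illustration. The step shrinks when the node voltage moves a lot in one step and grows when it barely moves.

```python
def simulate_rc(v_in, t_end, rc=1.0, h_min=1e-3, h_max=0.5, tol=0.01):
    """Integrate dv/dt = (v_in(t) - v) / rc with backward Euler and a crude step controller."""
    t, v, h = 0.0, 0.0, h_min
    trace = [(t, v)]
    while t_end - t > 1e-12:
        h = min(h, t_end - t)  # do not step past the end of the run
        # Backward Euler for this linear equation has a closed-form update.
        v_next = (v + (h / rc) * v_in(t + h)) / (1.0 + h / rc)
        dv = abs(v_next - v)
        if dv > tol and h > h_min:
            h = max(h_min, h / 2)  # signal moving fast: retry with a finer step
            continue
        t, v = t + h, v_next
        trace.append((t, v))
        if dv < tol / 4:
            h = min(h_max, h * 2)  # signal nearly flat: coarsen the step
    return trace


# Unit step input: fast change at first, then a long flat tail.
trace = simulate_rc(lambda t: 1.0, t_end=10.0)
steps = [b[0] - a[0] for a, b in zip(trace, trace[1:])]
print(len(trace), "points; largest step", max(steps))
```

A fixed step of `h_min` would need 10,000 points for the same run; the adaptive version spends its points on the initial edge and coasts through the settled tail.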

One of the key effects they’ve found with this approach is better scalability. While the time required for full SPICE has historically grown much faster than linearly as the circuit being simulated grows, Infinisim’s approach grows linearly, more consistent with Fast-SPICE scaling but with the accuracy of full-on SPICE. They claim an average 50x improvement in performance using this algorithm.

Note that this approach doesn’t take advantage of parallelism to get speed. In fact, they believe that synchronization requirements would keep parallelization from really being effective. However, that’s exactly the approach that another newcomer, Gemini Design Technology, is taking in their completely different solution to the problem.

While they have made some fundamental algorithm improvements, Gemini has focused primarily on the matrix. Dr. Baolin Yang, one of the founders, developed a way of parallelizing the matrix math – something that has apparently eluded mathematicians before now. Diagonally concentrated matrices – that is, ones whose non-zero values are clustered along the diagonal of the matrix, being sparse elsewhere – are relatively easy to solve. Relatively. But with the growing complexity of circuits and the number of parasitics being modeled, non-diagonal portions of the matrix have become less sparse. Dr. Yang found a way to break the matrix into a number of “sub-diagonal” matrices (i.e., pulling smaller matrices out from the diagonal region) plus one more matrix consisting of the non-diagonal elements. These calculations could then be parallelized and reintegrated into a single result. Note that this way of generating multiple smaller matrices appears to be completely unrelated to the partitioning that Infinisim does.
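For a feel of how "diagonal blocks plus everything else" can be worked on in parallel, here is a sketch using block Jacobi iteration, a textbook method swapped in purely for illustration. Gemini's actual technique isn't described here and is presumably different. The function names and the example matrix are made up. Each diagonal block is solved independently, with the off-block terms moved to the right-hand side using the previous iterate.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_dense(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def block_jacobi(a, b, blocks, iterations=50):
    """blocks is a list of index lists; each one picks out a diagonal sub-matrix."""
    n = len(b)
    x = [0.0] * n

    def solve_block(idx):
        inside = set(idx)
        sub = [[a[i][j] for j in idx] for i in idx]
        # Off-block coupling uses the previous iterate, so blocks are independent.
        rhs = [b[i] - sum(a[i][j] * x[j] for j in range(n) if j not in inside)
               for i in idx]
        return idx, solve_dense(sub, rhs)

    with ThreadPoolExecutor() as pool:
        for _ in range(iterations):
            results = list(pool.map(solve_block, blocks))  # all blocks finish first
            for idx, xs in results:
                for i, value in zip(idx, xs):
                    x[i] = value
    return x


# Two strong 2x2 diagonal blocks with weak coupling between them.
a = [[4.0, 1.0, 0.0, 0.1],
     [1.0, 3.0, 0.1, 0.0],
     [0.0, 0.1, 5.0, 2.0],
     [0.1, 0.0, 2.0, 4.0]]
x_true = [1.0, 2.0, 3.0, 4.0]
b = [sum(aij * xj for aij, xj in zip(row, x_true)) for row in a]
x = block_jacobi(a, b, blocks=[[0, 1], [2, 3]])
print(x)
```

Two caveats: Python threads won't deliver a real speed-up for this arithmetic (they're here only to show that the block solves don't depend on each other), and block Jacobi converges quickly only when the off-block coupling is weak, which is exactly the condition the article says is eroding as parasitics pile up.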

Using this approach, they have not found synchronization to be an issue. No special tricks were called upon to avoid shared data races; standard locking techniques are used. The breaking up of the matrix provides enough data independence to avoid getting bogged down in the writing of shared data. They achieve more or less a linear improvement in speed as more cores are added to the computing platform.

The speed-up number Gemini claims as a result of this is up to 30x faster than old-school simulators. Compared to what they call “first-generation” multi-threaded simulators, they see a 2-10x improvement. Whether this means that Infinisim’s 50x trumps Gemini’s 30x isn’t obvious, as anyone who has tried to make sense of benchmarks will know. Both clearly make a leap beyond what can be done with traditional approaches. It’s not even clear whether they knew about each other, since they’re targeting the incumbents in this area. Gemini in particular has been very specific about targeting Cadence’s Spectre as the “gold standard.” In fact, they think that in some cases they may be even more accurate than Spectre, but they are opting for good correlation rather than best accuracy, since for now they believe that the known Spectre result will get the edge over the still-proving-itself Gemini result, regardless of which is theoretically more accurate.

So the battle to usurp the SPICE crown is on. It’s particularly interesting that two totally different solutions are being applied to the problem, portions of which could actually be considered orthogonal to each other. I mean… if they were to be combined, they could… aw geez, the patent lawyers would have a field day. Never mind.
