
Faster Simulation on GPUs

At last week’s SNUG, I had a chat with Uri Tal, CEO of startup Rocketick, about their simulation acceleration technology. What they do bears some resemblance to the semi-automated parallelization done by Vector Fabrics or the exploration done by CriticalBlue, except that here it works with Verilog instead of C, and it’s fully automated and transparent to the user. He claims they can accelerate simulation by over 10X.

They use a GPU to achieve this kind of parallelization. This has promise both for in-house simulation farms and cloud-based simulation, where GPUs are available (although the cloud hasn’t been their focus).

They create a directed flow graph (DFG) from the Verilog code and then work out which parts they can accelerate. Each such part becomes its own thread for the GPU. The acceleratable parts tend to be the synthesizable portions of the code (as hardware logic tends to be highly parallel). They partition on a statement-by-statement basis while keeping an eye on the dependencies: if too many dependencies cross a partition boundary, they may change the partition to reduce the size of the dependency cutset. What is left unaccelerated either couldn’t be accelerated or simply didn’t make sense to accelerate.
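To make the partitioning idea concrete, here is a minimal Python sketch. It is not Rocketick's algorithm; the `Stmt` record, the function names and the four-statement example are all hypothetical, and the rule "synthesizable goes to the GPU" is a simplifying assumption taken from the description above. It builds dependency edges from the signals each statement reads and writes, assigns each statement to a side, and lists the edges that cross the boundary (the cutset).

```python
from collections import namedtuple

# Hypothetical statement record: name, whether it is synthesizable,
# and the signals it reads and writes. Not taken from any real tool.
Stmt = namedtuple("Stmt", "name synthesizable reads writes")

def build_edges(stmts):
    """Return dependency edges (producer, consumer) between statements."""
    writers = {}
    for s in stmts:
        for sig in s.writes:
            writers[sig] = s.name
    edges = set()
    for s in stmts:
        for sig in s.reads:
            producer = writers.get(sig)
            if producer is not None and producer != s.name:
                edges.add((producer, s.name))
    return edges

def partition(stmts):
    """Assumed rule: synthesizable statements go to the GPU, the rest stay on the host."""
    return {s.name: ("gpu" if s.synthesizable else "host") for s in stmts}

def cutset(edges, side):
    """Edges whose two endpoints land on different sides of the partition."""
    return {(a, b) for (a, b) in edges if side[a] != side[b]}

stmts = [
    Stmt("and_gate", True, ("a", "b"), ("n1",)),
    Stmt("xor_gate", True, ("n1", "c"), ("n2",)),
    Stmt("display", False, ("n2",), ()),
    Stmt("stimulus", False, (), ("a", "b", "c")),
]
edges = build_edges(stmts)
side = partition(stmts)
cut = cutset(edges, side)
print(sorted(cut))
```

A real partitioner would presumably evaluate alternative assignments and keep the one with the smaller cutset; this sketch only measures the cut for one fixed assignment.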

So, based on this, the tool splits a completely unaccelerated simulation into portions that are set aside for the GPU and remaining portions that are re-generated for standard simulation. The accelerated portion is attached to the simulation using Verilog’s Programming Language Interface (PLI).

The accelerated threads are turned into a byte code that is executed by a run-time engine. This makes the accelerated “code” portable onto any platform; only the run-time engine must be ported. They also manage memory carefully. The GPU uses very wide-word memory, so random byte accesses can be very inefficient; they manage the memory on a per-thread basis to get as much as possible out of each memory read (or write).
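The byte-code idea can be illustrated with a toy interpreter in Python. This is only a sketch under stated assumptions: the instruction format `(opcode, dest, src_a, src_b)`, the `OPS` table and the `run` function are invented for illustration and say nothing about Rocketick's actual byte code or engine. The point it shows is that the "program" is plain data, so only the small engine that walks it would need porting to a new platform.

```python
# Hypothetical opcode table for 1-bit signals; not a real instruction set.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a, _b: a ^ 1,
}

def run(program, signals):
    """Evaluate one pass of a byte-code program over a dict of 1-bit signals."""
    values = dict(signals)  # copy so the caller's inputs are left untouched
    for opcode, dest, src_a, src_b in program:
        a = values[src_a]
        b = values[src_b] if src_b is not None else 0
        values[dest] = OPS[opcode](a, b)
    return values

# A full adder expressed as data: sum = a ^ b ^ cin, cout = (a & b) | (cin & (a ^ b)).
full_adder = [
    ("XOR", "t1", "a", "b"),
    ("XOR", "sum", "t1", "cin"),
    ("AND", "t2", "a", "b"),
    ("AND", "t3", "t1", "cin"),
    ("OR", "cout", "t2", "t3"),
]

result = run(full_adder, {"a": 1, "b": 1, "cin": 0})
print(result["sum"], result["cout"])
```

The instructions are listed in dependency order, so a single pass from top to bottom is enough here; a real engine would also have to handle scheduling, clocks and state.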

The accelerated threads dump all the usual files for later analysis by viewers and debuggers. They interface directly with SpringSoft’s Siloti to identify “essential” signals.

You can find more on their website.
