
Faster Simulation on GPUs

At last week’s SNUG, I had a chat with Uri Tal, CEO of startup Rocketick, about their simulation acceleration technology. What they do bears some resemblance to the semi-automated parallelization done by Vector Fabrics or the exploration done by CriticalBlue, except that here it works on Verilog instead of C, and it’s fully automated and transparent to the user. He claims they can accelerate simulation by over 10X.

They use a GPU to achieve this kind of parallelization. This has promise both for in-house simulation farms and cloud-based simulation, where GPUs are available (although the cloud hasn’t been their focus).

What they do is create a directed flow graph (DFG) from the Verilog code and then figure out which parts they can accelerate. Each such part becomes its own thread for the GPU. The parts that can be accelerated tend to be the synthesizable portions of the code, since hardware logic tends to be highly parallel. They partition on a statement-by-statement basis while keeping an eye on the dependencies; if too many dependencies cross partition boundaries, they may change the partition to reduce the size of the dependency cutset. What is left unaccelerated either couldn’t be accelerated or simply didn’t make sense to accelerate.
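
To make the cutset idea a bit more concrete, here is a toy Python sketch. It is not Rocketick's algorithm, which wasn't described in that much detail; the statement format, the function names, and the five-statement example are all invented for illustration. Each statement is a node, a dependency edge runs from the statement that writes a signal to each statement that reads it, and the "cut size" of a candidate partition is the number of edges that cross between parts.

```python
# Hypothetical statements: (name, signal written, signals read).
STATEMENTS = [
    ("s0", "a", ["in0", "in1"]),
    ("s1", "b", ["in2", "in3"]),
    ("s2", "c", ["a", "b"]),
    ("s3", "d", ["a"]),
    ("s4", "e", ["c", "d"]),
]


def build_edges(statements):
    """Return the set of (producer, consumer) statement pairs."""
    writer = {out: name for name, out, _ in statements}
    edges = set()
    for name, _, reads in statements:
        for sig in reads:
            if sig in writer:  # primary inputs have no producing statement
                edges.add((writer[sig], name))
    return edges


def cut_size(edges, partition):
    """Count dependency edges whose endpoints sit in different parts."""
    return sum(1 for src, dst in edges if partition[src] != partition[dst])


EDGES = build_edges(STATEMENTS)

# Two candidate assignments of statements to threads (0 or 1).
PARTITION_A = {"s0": 0, "s1": 1, "s2": 0, "s3": 1, "s4": 0}
PARTITION_B = {"s0": 0, "s1": 1, "s2": 0, "s3": 0, "s4": 0}

print(cut_size(EDGES, PARTITION_A))  # 3 crossing dependencies
print(cut_size(EDGES, PARTITION_B))  # 1 crossing dependency
```

Moving s3 into the same part as its producer and consumer drops the cut from three edges to one, which is the kind of repartitioning the paragraph above describes. A real tool would also have to balance the work per thread, which this sketch ignores.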

So, based on this, the tool splits a completely unaccelerated simulation into portions that are set aside for the GPU and remaining portions that are re-generated for standard simulation. The accelerated portion is attached to the simulation using Verilog’s Programming Language Interface (PLI).

The accelerated threads are turned into a byte code that is executed by a run-time engine. This makes the accelerated “code” portable onto any platform; only the runtime engine must be ported. They also manage memory carefully: the GPU uses very wide-word memory, so random byte accesses can be very inefficient; they manage the memory on a per-thread basis to get as much as possible out of each memory read (or write).
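
For a feel of what "byte code plus run-time engine" means, here is a minimal Python sketch. The instruction format, opcode names, and the full-adder program are assumptions made up for this example; the actual byte code wasn't described. The point is only that the program is plain data, so porting to a new platform means porting the small `run` function rather than the compiled design.

```python
# Hypothetical three-address byte code: (opcode, dest, src_a, src_b).
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def run(program, signals):
    """Execute the instructions in order and return the updated signal values."""
    values = dict(signals)  # copy so the caller's inputs are not modified
    for opcode, dest, src_a, src_b in program:
        values[dest] = OPS[opcode](values[src_a], values[src_b])
    return values


# Byte code for a 1-bit full adder:
#   sum  = a ^ b ^ cin
#   cout = (a & b) | ((a ^ b) & cin)
FULL_ADDER = [
    ("XOR", "t0", "a", "b"),
    ("XOR", "sum", "t0", "cin"),
    ("AND", "t1", "a", "b"),
    ("AND", "t2", "t0", "cin"),
    ("OR", "cout", "t1", "t2"),
]

result = run(FULL_ADDER, {"a": 1, "b": 1, "cin": 0})
print(result["sum"], result["cout"])  # 0 1
```

A production engine would run many such threads in parallel on the GPU and pack their signals to suit the wide memory words; this sketch shows only the interpret-a-portable-program idea.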

The accelerated threads dump all the usual files for later analysis by viewers and debuggers. They interface directly with SpringSoft’s Siloti to identify “essential” signals.

You can find more on their website.
