
From Simulation to Emulation

It Takes Three – er – Four to Tangle

Last year I tried to wade through the world of emulation to untangle it a bit. It all seemed so simple at the time. Once I had it untangled, that is. Problem is, I only thought I had untangled it. Cadence recently announced a “unification” (there’s that word again) between simulation, simulation acceleration, and emulation. And it became pretty clear pretty quickly that, in the intervening year, a new tangled web has replaced the one I thought I had cleared out before.

Bottom line, I got confused. Again.

OK, perhaps that’s not such a rare occurrence, but work with me here…

As I started asking around, it felt like I wasn’t the only one confused. Although, in truth, you never think you’re confused until you realize that other people adhere to other beliefs with the same level of conviction that you adhere to yours. So everyone has more or less a clear opinion; it’s just that they don’t all align. And so I get confused. And so, in turn, I try my best to share my confusion with the others in an attempt to illustrate that there’s confusion.

And this time I didn’t get much pushback. Yup, folks: it’s confusing.

The real question here is, if these three items are being unified, what’s the distinction between them?

•  Mentor’s Jim Kenney suggests that using real I/O instead of modeled I/O distinguishes emulation from simulation acceleration. Also, by his account, FPGA use means prototype; Cadence and Mentor use custom ASICs for emulation.

•  OK… now we’ve tossed prototype systems into the mix too. Why untangle three things when four will do?

•  And EVE ostensibly makes emulators, but they use FPGAs. Does that make them prototypes? Jim also mentions prototype systems being small. But EVE also builds large units. So which is it?

•  EVE’s Lauro Rizzatti says the simulation acceleration/emulation distinction is pretty fuzzy. And that Cadence’s new marketing drawings look a lot like EVE’s old marketing drawings. And that Mentor’s custom ASIC is really a custom FPGA, while Cadence’s custom ASIC is based on logic processors (which we saw last year to be LUT-based and hence also somewhat FPGA-like).

•  Cadence’s Michael Young says that if the verification environment involves a SCE-MI interface between the host and the hardware unit, then it’s simulation acceleration, not emulation. But, with no SCE-MI interface, you can’t talk to a hosted testbench (with any kind of speed), and only a subset of the testbench can be synthesized into hardware.

•  Synopsys sells prototyping systems, not emulators, but they support an emulator use model.


Of course, not everything everyone says conflicts, and there are threads of agreement here and there. So let’s try to walk through it and sort out what’s what.

Prototype this!*

And let’s start with prototyping, since we can put that one to rest somewhat more easily. Both prototyping systems and emulation systems – not to mention virtual prototype systems – allow software to execute on a model of what a chip will ultimately look like. The operative distinction appears to be, why do you want to do that?

If the answer is, “So that I can verify that my hardware is correct,” then you’re looking at an emulator. If the answer is, “So that my software developers can get started writing software earlier,” then you’re talking prototype.

A hardware prototype assumes that the RTL is relatively stable and is in the process of being implemented in silicon. Software development can start in advance of the chip being built and tested. A more abstract virtual prototype can be used before the RTL is stable; after that, a hardware prototype provides much greater performance. In fact, it runs much faster than an emulator – like an order of magnitude or more faster.

Prototypes tend to be smaller, using fewer FPGAs than, say, an EVE Zebu or Mentor Veloce (or Cadence Palladium if you think of their chips as FPGA-like) system. But they can run faster because someone takes the time and effort to create a very efficient implementation of the hardware. It makes sense to spend that much effort only once you know the RTL is pretty stable. The FPGA compilation process, not to mention performance closure, takes time, and so the whole exercise of implementing the prototype takes much longer than could be tolerated for a check-out of the hardware.

If you’re still testing the hardware RTL, then you want to be able to turn design iterations rapidly. This typically means less efficient implementation, using more FPGAs and achieving less speed. That’s where emulation sits. It’s all in the tradeoff.

And then there were three

So that leaves simulation, simulation acceleration, and emulation. To try to put them together into some reasonable structure, let’s pull things apart to make sure we know what’s going on.

The verification environment really consists of three elements:

•  the design/device under test (DUT)

•  the stimulus

•  the checkers (or whatever decides that things did or didn’t work)

The stimulus and checkers are typically packaged together as the “testbench.” They’re typically written in Verilog or SystemVerilog and are generally not synthesizable as a whole (although a subset may be).

The DUT is typically written in either of the HDLs and is synthesizable. If it’s not, someone’s going to be in trouble at some point.

There are two places where each of these things can be:

•  the host, using software models

•  hardware

Let’s take the easy case as an example. Simulation takes place entirely in the host. So the DUT, the stimulus, and the checkers are all implemented as software models. No big mystery there. Problem is, they tend to run slowly. Especially when you start trying to see how your hardware works when executing software; it runs achingly slowly in a simulator. You could get old just waiting for the system to initialize; hopefully you’ve trained your kids by then to take over when the actual software runs.

And so you use hardware to accelerate things. Because the DUT is synthesizable and the testbench isn’t, the obvious first step is to move the DUT into hardware. In fact, you might even be tempted to move a part of the DUT into hardware. But we have an important consideration to take into account: the interface between the hardware and the simulator. Because, in this scenario, the simulator session still rules the land; it’s just including the hardware as one of its minions.

Back when this sort of thing was new, the stimulus in the host would stimulate the DUT in hardware one toggle at a time, and the response to the checkers would come back one toggle at a time. The DUT may have executed quickly, but getting signals to and from the DUT became the bottleneck.

Enter an approach pioneered by IKOS prior to their acquisition by Mentor: transactions instead of individual signal value changes. This approach lies at the heart of the SCE-MI interface connecting the host and the hardware DUT. This dramatically accelerates the testbench’s ability to control the DUT.
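To get a feel for why this matters, here is a toy Python sketch that does nothing but count trips across the host/hardware link. It is not the SCE-MI API: `Link`, `drive_signal_level`, and `drive_transaction_level` are hypothetical names, and the cost of three crossings per word (one data change plus two clock edges) is an assumption picked only to make the point.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Toy host-to-hardware link that only counts crossings (hypothetical)."""
    crossings: int = 0

    def send(self, payload):
        self.crossings += 1
        return payload


def drive_signal_level(link, words):
    """Old style: every signal value change is its own trip across the link."""
    for word in words:
        link.send(("data", word))
        link.send(("clk", 1))
        link.send(("clk", 0))


def drive_transaction_level(link, words):
    """Transaction style: one trip carries the whole burst.

    A transactor on the hardware side (not modeled here) would unpack
    the burst into individual pin wiggles at hardware speed.
    """
    link.send(("write_burst", list(words)))


words = range(64)

sig = Link()
drive_signal_level(sig, words)

txn = Link()
drive_transaction_level(txn, words)

print(sig.crossings, txn.crossings)  # 192 1
```

The absolute numbers are made up; what carries over is that the signal-level count grows with every toggle, while the transaction-level count grows only with the number of transactions.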

But what if you split the DUT up, accelerating only a portion of it? Well, assuming that portion has to talk to the other unaccelerated portion, now you’re back to having to connect those signals across the bundle of wires between the host and the hardware. And there’s nothing to pack the signal changes into transactions. So this takes us back to achingly slow.

Meaning that the DUT will pretty much be found in its entirety either on the host or in hardware. The latter being the accelerated version.
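The cost of splitting the DUT can be sketched by counting the nets that end up straddling the link. This is a rough illustration only: the block names and the connectivity below are invented, and `boundary_nets` is a hypothetical helper, not part of any vendor's tool.

```python
def boundary_nets(nets, in_hardware):
    """Return the nets whose two endpoints sit on opposite sides of the link."""
    return [(a, b) for a, b in nets if (a in in_hardware) != (b in in_hardware)]


# Invented block-level connectivity for a small SoC.
nets = [
    ("cpu", "bus"),
    ("bus", "dma"),
    ("bus", "uart"),
    ("dma", "mem"),
    ("bus", "mem"),
]
all_blocks = {"cpu", "bus", "dma", "uart", "mem"}

# Accelerate only the CPU and the bus; everything else stays on the host.
split = boundary_nets(nets, in_hardware={"cpu", "bus"})

# Accelerate the whole DUT.
whole = boundary_nets(nets, in_hardware=all_blocks)

print(len(split), len(whole))  # 3 0
```

Every net in `split` has to be carried across the link one value change at a time, since nothing packs those changes into transactions. With the whole DUT in hardware, no DUT-internal net crosses at all.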

So where does that leave emulation? Well, we still have this pesky tether to the host. Both Cadence and Mentor seem to agree that, in their view, an emulator stands on its own. It doesn’t talk to a host. There are two ways to achieve that: either put as much of the testbench as possible into hardware, or don’t use a Verilog testbench – use a “real” testbench.

Let’s break the testbench back apart into stimulus and checker portions. To stimulate your DUT (illegal in some states), you could have some signals coming from a synthesized partial-testbench, or, even better, use actual I/O cards that look like what will actually happen in the final system. If it’s network traffic you need, hook the DUT up to a real network through a real network card rather than using a simulated traffic generator. Need to talk to something that sits on the PCI bus? Use a real PCI card to talk to the real thing that’s sitting on the real PCI bus.

What about the checker? Well… you may not have one. After all, you won’t have one in the final system. How will you know if the final system works? Because… well, it works. If it doesn’t work, then that means that… it didn’t work. You know, blue screen of death, that sort of thing. (Hopefully while Bill is presenting… career-limiting maybe, but oh so worth it… the only tech story the grandkids will want to hear over and over…)

So, essentially, by this measure, emulation means setting your baby free, cutting the cord, sink or swim; you get the picture. If it sinks, then you’ve got problems. Oh, and yeah, then you need to find out what those problems are. And all of the big emulation systems have ways of capturing much more of the internal state than would be possible on a real system; for the large systems, that wealth of data can be examined in situ. If you wish, you could ship the state of the system back across to the host (oh, yeah, there’s that cord again… maybe it didn’t quite get completely cut…) for analysis or for comparison by simulation so that you can isolate what went awry, but, being the staunch macho isolationists that they are, the emulators say they can do that just fine without any help. (And no, they don’t need directions; they know exactly where they are.)

Note that, even though you’re running software on the emulator here, just like you do on a prototype, the intent is not to develop more software on a more-or-less known-good hardware model, but to verify that software and hardware work well together. If there’s a problem, it could be a software or hardware issue. Not until both work can it be called good.
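Boiled down, the distinctions drawn in this section amount to asking where the DUT lives and where the stimulus comes from. The sketch below encodes that reading in Python. It reflects the Cadence/Mentor-flavored view described above, not a universally accepted definition, and `Where` and `classify` are names invented for the illustration.

```python
from enum import Enum


class Where(Enum):
    HOST = "host"          # software model running in the simulator
    HARDWARE = "hardware"  # synthesized into the box
    REAL_IO = "real I/O"   # actual cards and traffic instead of a model


def classify(dut, stimulus):
    """Name the use model implied by where the DUT and the stimulus live."""
    if dut is Where.REAL_IO:
        raise ValueError("the DUT is either a software model or in hardware")
    if dut is Where.HOST:
        if stimulus is not Where.HOST:
            raise ValueError("a host-resident DUT with hardware stimulus is not covered here")
        return "simulation"
    # From here on the DUT is in hardware.
    if stimulus is Where.HOST:
        return "simulation acceleration"  # still tethered to a hosted testbench
    return "emulation"                    # synthesized testbench or real I/O


print(classify(Where.HOST, Where.HOST))          # simulation
print(classify(Where.HARDWARE, Where.HOST))      # simulation acceleration
print(classify(Where.HARDWARE, Where.HARDWARE))  # emulation
print(classify(Where.HARDWARE, Where.REAL_IO))   # emulation
```

Prototyping doesn't appear here because, by the argument above, it differs from emulation in intent rather than in where things run.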

So what does unification mean?

Having surveyed and mapped out the landscape, let’s move back to the opening gambit: the once-discrete realms of simulation, simulation acceleration, and emulation have now been united into a single flow. What does that mean?

It should mean two things:

•  control of verification comes from a single console running a single session regardless of where DUT or testbench are running

•  the, what should we call it, “locus of instantiation”? (yeah, good and pretentious… listen up for it at a keynote near you) of DUT and testbench should be manageable within that session.

This means that, from a single window, you could run your verification, moving things into and out of hardware at will. Cadence does claim to be able to move the DUT back and forth from hardware to software with a single command-line instruction. OK, they’re actually not moving them back and forth – there is a soft image and a hardware image of the DUT both in place, and the state moves back and forth between them. They call this hot swapping.
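The "two images, one state" idea can be sketched in a few lines of Python. This is only a cartoon of the concept as described here, not Cadence's implementation or command syntax: `Image`, `Session`, and `hot_swap` are hypothetical names, and the "state" is a single cycle counter standing in for whatever a real tool would have to transfer.

```python
class Image:
    """One copy of the DUT; both copies exist for the whole session."""

    def __init__(self, name):
        self.name = name
        self.state = {"cycle": 0}

    def run(self, cycles):
        self.state["cycle"] += cycles


class Session:
    """A single session that owns a software image and a hardware image."""

    def __init__(self):
        self.images = {"software": Image("software"), "hardware": Image("hardware")}
        self.active = "software"

    def run(self, cycles):
        self.images[self.active].run(cycles)

    def hot_swap(self):
        """Copy the state to the other image and make that image active."""
        source = self.images[self.active]
        target_name = "hardware" if self.active == "software" else "software"
        self.images[target_name].state = dict(source.state)
        self.active = target_name


session = Session()
session.run(100)        # start out in the simulator
session.hot_swap()      # state moves to the hardware image
session.run(1_000_000)  # grind through the long stretch in hardware
session.hot_swap()      # state moves back for a closer look in software

print(session.active, session.images[session.active].state["cycle"])
# software 1000100
```

Neither image is created or destroyed by a swap; only the state is copied and the active pointer flipped.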

That connects simulation with simulation acceleration. What about connecting to emulation? That means being able to hot swap the testbench as well. Apparently the synthesizable part of the testbench can be swapped, as can I/O modules (models swapped for the actual hardware) in some cases.

Here’s the other big question: let’s say you can do this all seamlessly. Is there a use model that benefits from being able to swap all this stuff around like this? Or is it enough to be able to start with simulation until that gets too encumbered, then move it to an accelerator until you think it’s good, then move it to emulation? One transition each time. Does moving back and forth have value?

Presumably, customers will be the ones to answer that.

More information:

Cadence Palladium XP

EVE Zebu

Mentor Veloce

Synopsys Confirma

*Full disclosure: this is a shameless riff on the name of a Discovery Channel program that unfortunately didn’t last too long… not to mention the title of last year’s article… sorry… I promise no more links to that dang article…
