
From Simulation to Emulation

It Takes Three – er – Four to Tangle

Last year I tried to wade through the world of emulation to untangle it a bit. It all seemed so simple at the time. Once I had it untangled, that is. Problem is, I only thought I had untangled it. Cadence recently announced a “unification” (there’s that word again) between simulation, simulation acceleration, and emulation. And it became pretty clear pretty quickly that, in the intervening year, a new tangled web has replaced the one I thought I had cleared out before.

Bottom line, I got confused. Again.

OK, perhaps that’s not such a rare occurrence, but work with me here…

As I started asking around, it felt like I wasn’t the only one confused. Although, in truth, you never think you’re confused until you realize that other people adhere to other beliefs with the same level of conviction that you adhere to yours. So everyone has a more or less clear opinion; it’s just that they don’t all align. And so I get confused. And so, in turn, I do my best to share my confusion with others in an attempt to illustrate that there’s confusion.

And this time I didn’t get much pushback. Yup, folks: it’s confusing.

The real question here is, if these three items are being unified, what’s the distinction between them?

•  Mentor’s Jim Kenney suggests that using real I/O instead of modeled I/O distinguishes emulation from simulation acceleration. Also, if it’s built from FPGAs, it’s a prototype; Cadence and Mentor use custom ASICs for emulation.

•  OK… now we’ve tossed prototype systems into the mix too. Why untangle three things when four will do?

•  And EVE ostensibly makes emulators, but they use FPGAs. Does that make them prototypes? Jim also mentions prototype systems being small, but EVE also builds large units. So which is it?

•  EVE’s Lauro Rizzatti says the simulation acceleration/emulation distinction is pretty fuzzy. And that Cadence’s new marketing drawings look a lot like EVE’s old marketing drawings. And that Mentor’s custom ASIC is really a custom FPGA, while Cadence’s custom ASIC is based on logic processors (which we saw last year to be LUT-based and hence also somewhat FPGA-like).

•  Cadence’s Michael Young says that if the verification environment involves a SCE-MI interface between the host and the hardware unit, then it’s simulation acceleration, not emulation. But, with no SCE-MI interface, you can’t talk to a hosted testbench (with any kind of speed), and only a subset of the testbench can be synthesized into hardware.

•  Synopsys sells prototyping systems, not emulators, but they support an emulator use model.


Of course, not everything everyone says conflicts, and there are threads of agreement here and there. So let’s try to walk through it and sort out what’s what.

Prototype this!*

And let’s start with prototyping, since we can put that one to rest somewhat more easily. Both prototyping systems and emulation systems – not to mention virtual prototype systems – allow software to execute on a model of what a chip will ultimately look like. The operative distinction appears to be, why do you want to do that?

If the answer is, “So that I can verify that my hardware is correct,” then you’re looking at an emulator. If the answer is, “So that my software developers can get started writing software earlier,” then you’re talking prototype.

A hardware prototype assumes that the RTL is relatively stable and is in the process of being implemented in silicon. Software development can start in advance of the chip being built and tested. A more abstract virtual prototype can be used before the RTL is stable; after that, a hardware prototype provides much greater performance. In fact, it runs much faster than an emulator – like an order of magnitude or more faster.

Prototypes tend to be smaller, using fewer FPGAs than, say, an EVE Zebu or Mentor Veloce (or Cadence Palladium if you think of their chips as FPGA-like) system. But they can run faster because someone takes the time and effort to create a very efficient implementation of the hardware. It makes sense to spend that much effort only once you know the RTL is pretty stable. The FPGA compilation process, not to mention performance closure, takes time, and so the whole exercise of implementing the prototype takes much longer than could be tolerated for a check-out of the hardware.

If you’re still testing the hardware RTL, then you want to be able to turn design iterations rapidly. This typically means less efficient implementation, using more FPGAs and achieving less speed. That’s where emulation sits. It’s all in the tradeoff.

And then there were three

So that leaves simulation, simulation acceleration, and emulation. To try to put them together into some reasonable structure, let’s pull things apart to make sure we know what’s going on.

The verification environment really consists of three elements:

•  the design/device under test (DUT)

•  the stimulus

•  the checkers (or whatever decides that things did or didn’t work)

The stimulus and checkers are typically packaged together as the “testbench.” They’re typically written in Verilog or SystemVerilog and are generally not synthesizable as a whole (although a subset may be).

The DUT is typically written in either of the HDLs and is synthesizable. If it’s not, someone’s going to be in trouble at some point.

There are two places where each of these elements can reside:

•  the host, using software models

•  hardware

Let’s take the easy case as an example. Simulation takes place entirely in the host. So the DUT, the stimulus, and the checkers are all implemented as software models. No big mystery there. Problem is, they tend to run slowly. Especially when you start trying to see how your hardware works when executing software; it runs achingly slowly in a simulator. You could get old just waiting for the system to initialize; hopefully you’ve trained your kids by then to take over when the actual software runs.

And so you use hardware to accelerate things. Because the DUT is synthesizable and the testbench isn’t, the obvious first step is to move the DUT into hardware. In fact, you might even be tempted to move a part of the DUT into hardware. But we have an important consideration to take into account: the interface between the hardware and the simulator. Because, in this scenario, the simulator session still rules the land; it’s just including the hardware as one of its minions.

Back when this sort of thing was new, the stimulus in the host would stimulate the DUT in hardware one toggle at a time, and the response to the checkers would come back one toggle at a time. The DUT may have executed quickly, but getting signals to and from the DUT became the bottleneck.

Enter an approach pioneered by IKOS prior to their acquisition by Mentor: transactions instead of individual signal value changes. This approach lies at the heart of the SCE-MI interface connecting the host and the hardware DUT. This dramatically accelerates the testbench’s ability to control the DUT.
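To make the difference concrete, here’s a toy back-of-the-envelope model in Python. The overhead number and function names are invented for illustration (this is not taken from any SCE-MI implementation); the point is simply that each host-to-hardware crossing carries fixed overhead, and transactions amortize it across many signal events:

```python
# Toy model: cost of host<->hardware crossings for two interface styles.
# Each crossing has fixed overhead; transactions amortize it over many events.

CROSSING_OVERHEAD_US = 10.0  # assumed per-crossing latency (illustrative only)

def signal_level_cost(num_signal_events: int) -> float:
    # One crossing per individual signal toggle.
    return num_signal_events * CROSSING_OVERHEAD_US

def transaction_level_cost(num_signal_events: int, events_per_txn: int) -> float:
    # Signal events are packed into transactions; one crossing per transaction.
    txns = -(-num_signal_events // events_per_txn)  # ceiling division
    return txns * CROSSING_OVERHEAD_US

events = 1_000_000
print(signal_level_cost(events))            # 10,000,000 us of pure interface overhead
print(transaction_level_cost(events, 256))  # 39,070 us: roughly 256x less
```

Crude as it is, the model matches the history: the DUT itself got fast the moment it moved into hardware, and only the per-toggle interface overhead kept the whole setup achingly slow.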

But what if you split the DUT up, accelerating only a portion of it? Well, assuming that portion has to talk to the other unaccelerated portion, now you’re back to having to connect those signals across the bundle of wires between the host and the hardware. And there’s nothing to pack the signal changes into transactions. So this takes us back to achingly slow.

Meaning that the DUT will pretty much be found in its entirety either on the host or in hardware, the latter being the accelerated version.

So where does that leave emulation? Well, we still have this pesky tether to the host. Both Cadence and Mentor seem to agree that, in their view, an emulator stands on its own. It doesn’t talk to a host. There are two ways to achieve that: either put as much of the testbench as possible into hardware, or don’t use a Verilog testbench – use a “real” testbench.

Let’s break the testbench back apart into stimulus and checker portions. To stimulate your DUT (illegal in some states), you could have some signals coming from a synthesized partial-testbench, or, even better, use actual I/O cards that look like what will actually happen in the final system. If it’s network traffic you need, hook the DUT up to a real network through a real network card rather than using a simulated traffic generator. Need to talk to something that sits on the PCI bus? Use a real PCI card to talk to the real thing that’s sitting on the real PCI bus.

What about the checker? Well… you may not have one. After all, you won’t have one in the final system. How will you know if the final system works? Because… well, it works. If it doesn’t work, then that means that… it didn’t work. You know, blue screen of death, that sort of thing. (Hopefully while Bill is presenting… career-limiting maybe, but oh so worth it… the only tech story the grandkids will want to hear over and over…)

So, essentially, by this measure, emulation means setting your baby free, cutting the cord, sink or swim; you get the picture. If it sinks, then you’ve got problems. Oh, and yeah, then you need to find out what those problems are. And all of the big emulation systems have ways of capturing much more of the internal state than would be possible on a real system; for the large systems, that wealth of data can be examined in situ. If you wish, you could ship the state of the system back across to the host (oh, yeah, there’s that cord again… maybe it didn’t quite get completely cut…) for analysis or for comparison by simulation so that you can isolate what went awry, but, being the staunch macho isolationists that they are, the emulators say they can do that just fine without any help. (And no, they don’t need directions; they know exactly where they are.)

Note that, even though you’re running software on the emulator here, just like you do on a prototype, the intent is not to develop more software on a more-or-less known-good hardware model, but to verify that software and hardware work well together. If there’s a problem, it could be a software or hardware issue. Not until both work can it be called good.

So what does unification mean?

Having surveyed and mapped out the landscape, let’s move back to the opening gambit: the once-discrete realms of simulation, simulation acceleration, and emulation have now been united into a single flow. What does that mean?

It should mean two things:

•  control of verification comes from a single console running a single session, regardless of where the DUT and testbench are running

•  the, what should we call it, “locus of instantiation”? (yeah, good and pretentious… listen up for it at a keynote near you) of DUT and testbench should be manageable within that session.

This means that, from a single window, you could run your verification, moving things into and out of hardware at will. Cadence does claim to be able to move the DUT back and forth from hardware to software with a single command-line instruction. OK, they’re actually not moving them back and forth – there is a soft image and a hardware image of the DUT both in place, and the state moves back and forth between them. They call this hot swapping.
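As a rough sketch of that hot-swap idea (in Python, with every class and method name invented here; this is not Cadence’s actual tooling), the key point is that both images implement the same architected state, so a snapshot taken from one can seed the other mid-run:

```python
# Sketch of "hot swapping": two images of the same DUT (a software model and a
# hardware model) exist side by side, and state moves between them mid-run.

class DutImage:
    """One implementation of the DUT; state is the architected registers."""
    def __init__(self, name):
        self.name = name
        self.state = {"pc": 0, "regs": [0] * 4}

    def step(self, n=1):
        # Stand-in for running the design n cycles.
        for _ in range(n):
            self.state["pc"] += 1
            self.state["regs"][self.state["pc"] % 4] += 1

    def snapshot(self):
        return {"pc": self.state["pc"], "regs": list(self.state["regs"])}

    def restore(self, snap):
        self.state = {"pc": snap["pc"], "regs": list(snap["regs"])}

soft = DutImage("simulator")
hard = DutImage("emulator")

soft.step(100)                 # run slowly, with full visibility
hard.restore(soft.snapshot())  # "hot swap": move state into the hardware image
hard.step(1_000_000)           # run fast in hardware
soft.restore(hard.snapshot())  # swap back for detailed debug
print(soft.state["pc"])        # 1000100
```

The design choice this illustrates is why it’s a state transfer rather than a literal move: keeping both images instantiated means the swap costs only a snapshot and restore, not a recompile.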

That connects simulation with simulation acceleration. What about connecting to emulation? That means being able to hot swap the testbench as well. Apparently the synthesizable part of the testbench can be swapped, as can I/O (models swapped for actual I/O) in some cases.

Here’s the other big question: let’s say you can do this all seamlessly. Is there a use model that benefits from being able to swap all this stuff around like this? Or is it enough to be able to start with simulation until that gets too encumbered, then move it to an accelerator until you think it’s good, then move it to emulation? One transition each time. Does moving back and forth have value?

Presumably, customers will be the ones to answer that.

More information:

Cadence Palladium XP

EVE Zebu

Mentor Veloce

Synopsys Confirma

*Full disclosure: this is a shameless riff on the name of a Discovery Channel program that unfortunately didn’t last too long… not to mention the title of last year’s article… sorry… I promise no more links to that dang article…

