
Improving Embedded Software Integration

Combining Emulation and Offline Debugging

Today’s system-on-chip (SoC) designs depend increasingly on firmware and device drivers, given the challenges of controlling their many components (including the microcontroller, microprocessor or DSP cores, peripherals and interfaces). Accordingly, leading semiconductor companies are working to integrate software development and validation with silicon design and verification. One obstacle to such integration is the difficulty of effectively debugging early-stage embedded software. In this article we describe a way around this obstacle: a new debugging methodology for software and system-level integration. When combined with traditional hardware emulation, the methodology reduces the debug closure time and effort required to develop SoC firmware and device drivers.

Debugging software using emulation

Emulation has a solid performance record. Its clock speed is generally high enough to boot an operating system and then load and execute application-level software from a flash card. Emulators experience little performance drop-off even as the design grows. For this reason, emulation can make sense for debugging embedded software at both the early and late stages of development.

Of course there is a catch. Today it’s possible to attach a software debugger via JTAG or parallel interfaces to the processor running in the hardware emulator. These methods work, but allocating emulation time for interactive debugging can be impractical for embedded software teams, particularly in post-silicon environments. During many projects, the emulation queue is filled with batch jobs scheduled to run more or less continuously. Offline debugging now makes it easier to add software-related batch jobs to this queue and then debug the results later, away from the emulator.

Here’s how it works: Imagine you are developing software for a new USB device. Your workflow might be to boot up the operating system, configure the hardware and run an application. Debugging this workflow takes time and is heavily dependent on the size of the operating system. A reasonable approach might be to start and configure the design until hitting a breakpoint, and then start debugging from there, though getting to the breakpoint may take time. Another rub: debugging usually is not done in one run since it takes multiple iterations to focus in on the problem. Nothing is more frustrating during debugging than being almost there, almost able to see the problem, only to make one step too many and have to start all over again.

For this example, assume it takes 20 minutes for the USB device software to run on the emulator and four hours to debug and fix the problem on the emulator. If debug could be taken off the emulator and done offline, then during the four hours spent diagnosing the problem, 12 other runs could be performed on the emulator or 12 other engineers could have access to it. The bottom line: Offline debug gives your development team more or less continual access to the emulator.
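The arithmetic behind that figure can be written out as a short calculation. This is only an illustrative sketch: the function name `extra_runs_freed` is made up for this example, and it assumes every run takes the same 20 minutes with no queueing overhead between runs.

```python
def extra_runs_freed(run_minutes, debug_hours):
    """Count the emulator runs that fit into the time an on-emulator
    debug session would otherwise occupy (assumes equal-length runs)."""
    return int((debug_hours * 60) // run_minutes)


# Numbers from the example: 20-minute runs, four hours of debugging.
print(extra_runs_freed(20, 4))  # 12
```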

Improving software debug

The combination of offline debugging and emulation creates a debug environment that connects to the database generated from executing the CPU code during the emulation run. Given the emulator’s speed, it’s entirely possible you’ll be looking at a large amount of source code. (Think here of booting an OS.) So it is important to have an environment that allows you to quickly pan through large swaths of code and identify where you want to look deeper. Offline debugging allows for stepping through the design forward or backward at the high-level source or the assembly level. The debugger displays the CPU registers as well as variables, memory contents and a call stack view. It is fully synchronized with the hardware environment by connecting to the cursor in the waveform window. Stepping forward or backward updates all other displays in the debugger and moves the cursor in the waveform to the time at which the data was sampled during the run. (The inverse holds true as well: moving the cursor in the waveform updates all the debugger views accordingly.) Figure 1 illustrates this using an example involving Mentor’s Veloce emulator and debugging with Mentor’s Questa Codelink.


Figure 1: Debug environment with Mentor Graphics Questa Codelink
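To make the two-way synchronization concrete, here is a minimal sketch of a replay cursor over a recorded trace. It is not how Questa Codelink is implemented; the `Sample` and `ReplayCursor` names, the record layout and the timestamps are all assumptions made for illustration. Stepping moves by instruction, while `seek_time` models the waveform cursor being dragged to a given time.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    time_ns: int     # emulation time at which the instruction was sampled
    source: str      # "file:line" of the executed statement
    registers: dict  # register name -> value after this instruction


class ReplayCursor:
    """Hypothetical cursor over a recorded trace (illustration only)."""

    def __init__(self, samples):
        self.samples = sorted(samples, key=lambda s: s.time_ns)
        self.times = [s.time_ns for s in self.samples]
        self.index = 0

    @property
    def current(self):
        return self.samples[self.index]

    def step(self, delta=1):
        """Step forward (delta > 0) or backward (delta < 0), clamped to the trace."""
        self.index = max(0, min(len(self.samples) - 1, self.index + delta))
        return self.current

    def seek_time(self, time_ns):
        """Waveform cursor moved: select the latest sample at or before time_ns."""
        self.index = max(0, bisect_right(self.times, time_ns) - 1)
        return self.current


# Made-up trace data; the file names echo the example later in the article.
trace = [
    Sample(100, "main.c:410", {"r0": 1}),
    Sample(140, "main.c:411", {"r0": 0}),
    Sample(200, "demo_diag.c:135", {"r0": 0}),
]
cursor = ReplayCursor(trace)
print(cursor.seek_time(10**9).source)  # demo_diag.c:135
print(cursor.step(-1).source)          # main.c:411
print(cursor.current.time_ns)          # 140
```

Because both operations update the same index, the source view and the waveform time always refer to the same sample, whichever one the user moved.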

Let’s look at how this environment can be used to debug a relatively common failure. The processor is executing code normally when a problem arises in the communication between the design’s software and the hardware. Perhaps the software is trying to get data from an uninitialized ASIC register and reads a corrupted value. When the software tries to perform some ALU operation based on this value, it freezes, producing a “flat line” in the hardware waveform (see Fig 2). To even start debugging what happened, the software engineer will have to understand:

  1. What was the software doing at the end of the run?
  2. What was the last good line of code executed?
  3. What caused the CPU to freeze?

Let’s assume (hardly a stretch) that the software engineer is not familiar with the hardware verification environment. This means it is extremely difficult to correlate his software to what he sees happening in the hardware waveforms. Perhaps he’d opt to re-run the emulation with the debugger attached to the CPU. However, this would take time — possibly as much as several hours to redo the whole run. What he really needs to do is to stop the CPU execution immediately before the problem is triggered to see what caused it. But how does he know when to stop just by looking at the hardware waveform?

Now, imagine you’re the engineer and you’ll use the offline debugging approach to address this problem. For starters, you don’t have to re-run the emulation because the tool already gathered all the data you need. You can start debugging the output from the emulator right away, starting at the failure and methodically moving backwards to find the cause. You also won’t have to work on the emulator since you can debug offline. Often bus activity ceases, which suggests a good place to investigate. At perhaps the most basic level, there are three questions, answers to which will lead you to the state of the CPU just before it failed.

What line of code was last executed in your test?

To find out, move the cursor to the last executed instruction and look at the source code. Below, that’s line number 135 in the demo_diag.c file:


Figure 2: Pinpointing the last line of code executed.

From where, in terms of source line number and function name, was the function called?

To answer this question, scroll up to see what function the code belongs to and then step backward to the caller. Here, the function is send_to_dbg_port and the caller is main.c line 411. In an environment like this, being able to step backward is very important because it lets you start efficiently at the place of failure and then trace back to the cause.
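One way to picture how a recorded trace can answer the "who called this?" question is to replay call and return events into a stack. The sketch below assumes a simplified event format of `(kind, function, call_site)` tuples, which is invented for illustration and is not the tool's actual log format; `init_hw` and `main.c:120` are made-up entries.

```python
def call_stack_at_end(events):
    """Replay call/return events and return the stack that is live at the
    end of the trace, outermost frame first (hypothetical event format)."""
    stack = []
    for kind, function, call_site in events:
        if kind == "call":
            stack.append((function, call_site))
        elif kind == "return" and stack:
            stack.pop()
    return stack


events = [
    ("call", "main", None),
    ("call", "init_hw", "main.c:120"),           # made-up helper
    ("return", "init_hw", None),
    ("call", "send_to_dbg_port", "main.c:411"),  # names from the example
]
stack = call_stack_at_end(events)
function, call_site = stack[-1]
print(function, "called from", call_site)  # send_to_dbg_port called from main.c:411
```

The innermost frame identifies the function that was executing when the run stopped, and its recorded call site is the place to step back to.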

What was the value of variable “p” in main() when the emulation stopped?

Moving the cursor and hovering it over the “p” variable shows the latest value: zero, in the example below.


Figure 3: Mouse over variable “p” to show its value (zero as shown above) when emulation stopped.

Taking the debug process offline and replaying the emulation run brings many benefits. It not only presents a high-level software debug environment familiar to embedded software engineers, but also keeps the emulator in use all the time.

Collecting data for offline debug on the emulator

Offline debugging is a two-step process:

  1. Run the test scenario on the emulator and produce the database needed for offline debugging.
  2. Launch the debugger on the database produced by the emulator.

Non-intrusiveness is one of the main benefits of offline debugging. Generally, using data generated by the emulator doesn’t require any additional hardware or design changes, thus preserving your system’s behavior. Logging can be done using a utility like Mentor’s TestBench Xpress, which can be attached to the design and compiled into the emulator. This emulated monitor sits outside the design and observes the pins and CPU register changes directly inside the CPU.

When working with Mentor Graphics Questa Codelink, the emulator does not generate the final Questa Codelink database, in order to maintain emulation speed. Instead it writes a raw data stream called the Codelink Change List file. This file is later post-processed to create the final Questa Codelink replay log file, which can be used to replay the emulation run. The final log file is then taken to the developer’s local machine for debugging, thus freeing up the emulator for other runs.

Once the database is created, it can be analyzed offline.
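The split between a cheap raw stream and a post-processed database can be sketched as follows. This is a conceptual illustration only: the real Codelink Change List and replay log formats are not described in this article, so the `(time_ns, name, value)` tuples and the `build_replay_log` function are assumptions. The idea is that the emulator records only what changed, and a later pass expands those changes into full snapshots that a debugger can jump between.

```python
def build_replay_log(change_list):
    """Expand raw (time_ns, name, value) change records into one full
    state snapshot per timestamp (hypothetical format, illustration only)."""
    state = {}
    log = []
    # sorted() is stable, so changes with the same timestamp keep their order.
    for time_ns, name, value in sorted(change_list, key=lambda c: c[0]):
        state[name] = value
        if log and log[-1][0] == time_ns:
            log[-1] = (time_ns, dict(state))  # merge changes at the same time
        else:
            log.append((time_ns, dict(state)))
    return log


raw = [(10, "pc", 0x100), (10, "r0", 5), (20, "pc", 0x104), (30, "r0", 0)]
for time_ns, snapshot in build_replay_log(raw):
    print(time_ns, snapshot)
```

Writing only deltas during the run keeps the logging overhead small, while the snapshots make stepping in either direction a simple index change during replay.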

Multicore and multi CPU support

Another offline debug option is to simultaneously log multiple CPUs or cores. In either case, the process is exactly the same as previously described with one exception: one log file per core is generated. So, if two cores are being logged in the emulator – a process that happens simultaneously – then two replay files (if you’re using Questa Codelink) will be generated. This is efficient since the files can then be analyzed individually. For example, consider an ARM design with two cores, each running its own software, with a different developer responsible for each core and its associated software. Presumably, each developer would only be interested in debugging the CPU that he is working on, which the tools and workflow described in this article do allow.

During the debug session, the tool allows for viewing of multiple CPUs side by side. Each view is synchronized, which means that stepping in one core (and waveform window) will adjust the second core accordingly.
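A minimal sketch of this kind of synchronization, assuming each core’s replay log has been loaded as a time-sorted list of `(time_ns, source_line)` pairs: given the time selected in one view, look up the latest sample at or before that time in every core’s trace. The data layout, file names and the `synchronized_view` function are hypothetical.

```python
from bisect import bisect_right


def synchronized_view(core_traces, time_ns):
    """For each core, return the source line of the latest sample at or
    before time_ns, or None if that core had not executed anything yet."""
    view = {}
    for core, trace in core_traces.items():
        times = [t for t, _ in trace]
        i = bisect_right(times, time_ns) - 1
        view[core] = trace[i][1] if i >= 0 else None
    return view


# Made-up per-core traces, one list per replay file.
traces = {
    "core0": [(100, "a.c:10"), (200, "a.c:11")],
    "core1": [(150, "b.c:5"), (250, "b.c:6")],
}
print(synchronized_view(traces, 200))  # {'core0': 'a.c:11', 'core1': 'b.c:5'}
```

Keeping one trace per core means a developer can open only the file for their own CPU, while a shared time axis still allows the views to be lined up when needed.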

Multi domain debug environment

The same debug environment can be used in the logic simulator, thus extending the environment across different verification domains. This allows simulation and platform developers to move more easily between these domains. Why is this valuable? Consider logging a failure during a long-duration simulation, the kind of simulation that provides interesting software execution. Engineers can use the same software debug environment in the higher-performance hardware emulation domain without having to learn a new debug environment.


Offline debugging allows for a better, more flexible approach to debugging in general and can increase the number of tests you are able to run via emulation. The approach – logging the CPU activity during a run on the hardware emulator and replaying it outside of the emulator – allows the emulator to be used constantly for different runs or by different engineers. Offline debugging is non-intrusive and doesn’t require design changes. The approach preserves original design behavior and allows for logging and debugging multi-core and multi-CPU designs in one user-friendly environment.

Tomasz Piekarz (tomasz_piekarz@mentor.com) and Joe Rodriguez (joe_rodriquez@mentor.com) are both based in Wilsonville, Ore.
