November 10, 2011

Improving Embedded Software Integration

Combining Emulation and Offline Debugging

by Tomasz Piekarz and Joe Rodriguez (Mentor Graphics)

Today’s system-on-chip (SoC) designs are increasingly dependent on firmware and device drivers given the challenges of controlling various components (including the microcontroller, microprocessor or DSP cores, peripherals and interfaces). Accordingly, leading semiconductor companies are working to integrate software development and validation with silicon design and verification. One obstacle to such integration is the difficulty of effectively debugging early-stage embedded software. In this article we describe a way around this obstacle: a new debugging methodology for software and system-level integration. When combined with traditional hardware emulation, the methodology reduces the debug closure time and effort required to develop SoC firmware and device drivers.

Debugging software using emulation

Emulation has a solid performance record. Its clock speed is generally high enough to boot an operating system and then load and execute application-level software from a flash card. Emulators experience little performance drop-off even as the design grows. For this reason, at both the early and late stage of development, emulation can make sense for debugging embedded software.

Of course there is a catch. Today it’s possible to attach a software debugger via JTAG or parallel interfaces to the processor running in the hardware emulator. These methods work, though allocating emulation time can be impractical for embedded software teams, particularly in post-silicon environments. During many projects, the emulation queue is full of batch jobs scheduled to run more or less continuously. Offline debugging now makes it easier to add software-related batch jobs to this queue and then debug the results later offline.

Here’s how it works: Imagine you are developing software for a new USB device. Your workflow might be to boot up the operating system, configure the hardware and run an application. Debugging this workflow takes time and is heavily dependent on the size of the operating system. A reasonable approach might be to start and configure the design until hitting a breakpoint, and then start debugging from there, though getting to the breakpoint may take time. Another rub: debugging usually is not done in one run since it takes multiple iterations to focus in on the problem. Nothing is more frustrating during debugging than being almost there, almost able to see the problem, only to make one step too many and have to start all over again.

For this example, assume it takes 20 minutes for the USB device software to run on the emulator while taking four hours to debug and fix the problem on the emulator. If debug could be taken off the emulator and done offline, then during the four hours spent diagnosing the problem, 12 other runs could be performed on the emulator or 12 other engineers could have access to the emulator. The bottom line: Offline debug gives your development team more or less continual access to the emulator.
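The arithmetic behind this example is easy to check. The sketch below is a back-of-the-envelope model only: the function name is hypothetical, and it assumes runs are queued back to back with no setup or scheduling overhead.

```python
def runs_freed(debug_minutes: int, run_minutes: int) -> int:
    """Complete emulator runs that fit in the time a debug session
    would otherwise occupy the emulator (assumes no overhead between runs)."""
    if run_minutes <= 0:
        raise ValueError("run_minutes must be positive")
    return debug_minutes // run_minutes


# Figures from the example: a 20-minute run and a four-hour debug session.
extra_runs = runs_freed(debug_minutes=4 * 60, run_minutes=20)
print(extra_runs)  # 12
```

Plugging in your own run and debug times gives a rough sense of how much emulator capacity offline debug could return to the queue.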

Improving software debug

The combination of offline debugging and emulation creates a debug environment that connects to the database generated from executing the CPU code during the emulation run. Given the emulator’s speed, it’s entirely possible you’ll be looking at a large amount of source code. (Think here of booting an OS.) So it is important to have an environment that allows you to quickly pan through large swaths of code and identify where you want to look deeper. Offline debugging allows for stepping through the design forward or backward at the high-level source or assembly level. The debugger displays the CPU registers as well as variables, memory contents and a call stack view. It is fully synchronized with the hardware environment by connecting to the cursor in the waveform window. Stepping forward or backward updates all other displays in the debugger and moves the cursor in the waveform to the time at which the data was sampled during the run. (The inverse holds true as well: moving the cursor in the waveform updates all the debugger views accordingly.) Figure 1 illustrates this using an example involving Mentor’s Veloce emulator and debugging with Mentor’s Questa Codelink.

Figure 1: Debug environment with Mentor Graphics Questa Codelink
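To make the idea of stepping through a recorded run concrete, here is a minimal sketch of a replay cursor. It is not how Questa Codelink is implemented; the `Sample` and `ReplayCursor` names, the record layout and the sample data are all hypothetical. The point is only that forward and backward steps are index moves over logged samples, and that a waveform time maps to the last sample at or before it.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    time_ns: int     # emulation time at which the instruction was sampled
    source: str      # "file:line" of the executed statement (made-up data)
    registers: dict  # register name -> value after the instruction


class ReplayCursor:
    """Steps through a recorded run; nothing is re-executed."""

    def __init__(self, samples):
        if not samples:
            raise ValueError("empty trace")
        self.samples = sorted(samples, key=lambda s: s.time_ns)
        self.times = [s.time_ns for s in self.samples]
        self.index = 0

    @property
    def current(self):
        return self.samples[self.index]

    def step(self, delta=1):
        """Move forward (delta > 0) or backward (delta < 0), clamped to the trace."""
        self.index = max(0, min(len(self.samples) - 1, self.index + delta))
        return self.current

    def seek_time(self, time_ns):
        """Waveform cursor moved: select the last sample at or before time_ns."""
        self.index = max(0, bisect_right(self.times, time_ns) - 1)
        return self.current


# Illustrative trace; times, locations and register values are invented.
trace = [
    Sample(100, "boot.c:10", {"r0": 0, "pc": 0x8000}),
    Sample(140, "boot.c:11", {"r0": 7, "pc": 0x8004}),
    Sample(180, "uart.c:52", {"r0": 7, "pc": 0x9000}),
]
cursor = ReplayCursor(trace)
cursor.step(+2)                      # forward to the end of the run
print(cursor.current.source)         # uart.c:52
print(cursor.step(-1).time_ns)       # 140: the waveform cursor would move here
print(cursor.seek_time(150).source)  # boot.c:11
```

Because every view reads from the same index, updating registers, variables and the waveform cursor together comes down to redrawing from `cursor.current`.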

Let’s look at how this environment can be used to debug a relatively common failure. The processor is executing code normally when a problem arises in the communication between the design’s software and the hardware. Perhaps the software is trying to get data from an uninitialized ASIC register and reads a corrupted value. When the software tries to perform some ALU operation based on this value, it freezes, producing a “flat line” in the hardware waveform (see Fig 2). To even start debugging what happened, the software engineer will have to understand:

  1. What was the software doing at the end of the run?
  2. What was the last good line of code executed?
  3. What caused the CPU to freeze?

Let’s assume (hardly a stretch) that the software engineer is not familiar with the hardware verification environment. This means it is extremely difficult to correlate his software to what he sees happening in the hardware waveforms. Perhaps he’d opt to re-run the emulation with the debugger attached to the CPU. However, this would take time -- possibly as much as several hours to redo the whole run. What he really needs to do is to stop the CPU execution immediately before the problem is triggered to see what caused it. But how does he know when to stop just by looking at the hardware waveform?

Now, imagine you’re the engineer and you’ll use the offline debugging approach to address this problem. For starters, you don’t have to re-run the emulation because the tool already gathered all the data you need. You can start debugging the output from the emulator right away, starting at the failure and methodically moving backwards to find the cause. You also won’t have to work on the emulator since you can debug offline. Often bus activity ceases, which suggests a good place to investigate. At perhaps the most basic level, there are three questions, answers to which will lead you to the state of the CPU just before it failed.

What line of code was last executed in your test?

To find out, move the cursor to the last executed instruction and look at the source code. Below, that’s line number 135 in demo_diag.c file:

Figure 2: Pinpointing the last line of code executed.

From where, in terms of source line number and function name, was the function called?

To answer this question, scroll up to see what function the code belongs to and then step backwards to the caller. Here, the function is send_to_dbg_port and the caller is main.c line 411. In an environment like this, being able to step backwards is very important because it allows for efficiently starting at the place of failure and then tracing backwards to the cause.

What was the value of variable “p” in main() when the emulation stopped?

Moving the cursor and hovering it over the “p” variable shows the latest value: zero, in the example below.

Figure 3: Mouse over variable “p” to show its value (zero as shown above) when emulation stopped.
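The three questions above can also be phrased as queries against a recorded trace. The sketch below is illustrative only: the `Frame` and `Sample` structures, the timestamps and the assumption that the caller at main.c line 411 is main() are ours, while the file names, line numbers, function name and the value of “p” come from the example.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    function: str
    file: str
    line: int  # line executing in this frame (the call site, for a caller)
    variables: dict = field(default_factory=dict)


@dataclass
class Sample:
    time_ns: int
    stack: list  # outermost frame first, innermost frame last


def final_sample(trace):
    """The sample with the latest timestamp, i.e. where the run stopped."""
    return max(trace, key=lambda s: s.time_ns)


def last_executed(trace):
    """Question 1: the innermost frame of the final sample."""
    return final_sample(trace).stack[-1]


def caller_of_last(trace):
    """Question 2: the frame one level up, or None at top level."""
    stack = final_sample(trace).stack
    return stack[-2] if len(stack) > 1 else None


def variable_at_end(trace, function, name):
    """Question 3: value of a variable in a named frame when the run stopped."""
    for frame in reversed(final_sample(trace).stack):
        if frame.function == function:
            return frame.variables[name]
    raise KeyError(f"{function} is not on the final call stack")


# Hypothetical two-sample trace; timestamps are invented.
trace = [
    Sample(100, [Frame("main", "main.c", 410, {"p": 0})]),
    Sample(140, [Frame("main", "main.c", 411, {"p": 0}),
                 Frame("send_to_dbg_port", "demo_diag.c", 135)]),
]

last = last_executed(trace)
print(f"{last.file}:{last.line} in {last.function}")  # demo_diag.c:135 in send_to_dbg_port
caller = caller_of_last(trace)
print(f"{caller.file}:{caller.line}")                 # main.c:411
print(variable_at_end(trace, "main", "p"))            # 0
```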

Taking the debug process offline and allowing for replaying emulation brings many benefits. It not only presents a high level software debug environment familiar to embedded software engineers, but also keeps the emulator in use all the time.

Collecting data for offline debug on the emulator

Offline debugging is a two-step process:

  1. Run the test scenario on the emulator and produce the database needed for offline debugging.
  2. Launch the debugger on the database produced by the emulator.

Non-intrusiveness is one of the main benefits of offline debugging. Generally, using data generated by the emulator doesn’t require any additional hardware or design changes, thus preserving your system’s behavior. Logging can be done using a utility like Mentor’s TestBench Xpress, which can be attached to the design and compiled into the emulator. This emulated monitor sits outside the design and observes the pins and CPU register changes directly inside the CPU.

When working with Mentor Graphics Questa Codelink, to maintain emulation speed, the emulator-generated data is not the final Questa Codelink database but rather a raw data stream called a Codelink Change List file. This file is later post-processed to create the final Questa Codelink replay log file that can be used to replay the emulation run. The final log file is taken to the developer’s local machine and used for debugging, thus freeing up the emulator for other runs.

Once the database is created, it can be analyzed offline.
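The split between a cheap raw capture and a later post-processing step can be illustrated generically. The actual Codelink Change List and replay log formats are not described here, so the sketch below invents a simple `(time, name, value)` delta record and expands it into full state snapshots that a replay tool could index by time. Every name in it is hypothetical.

```python
def expand_change_list(changes, initial=None):
    """Turn (time_ns, name, value) deltas into (time_ns, full_state) snapshots.

    Only what changed is recorded during the run, which keeps capture cheap;
    the complete state at each time is reconstructed afterwards, offline.
    """
    state = dict(initial or {})
    snapshots = []
    # sorted() is stable, so changes logged at the same time keep their order.
    for time_ns, name, value in sorted(changes, key=lambda c: c[0]):
        state[name] = value
        if snapshots and snapshots[-1][0] == time_ns:
            snapshots[-1] = (time_ns, dict(state))  # merge same-time changes
        else:
            snapshots.append((time_ns, dict(state)))
    return snapshots


# Invented raw stream: two changes at t=10, one at t=20.
raw = [(10, "pc", 0x100), (10, "r0", 1), (20, "pc", 0x104)]
for time_ns, state in expand_change_list(raw):
    print(time_ns, state)
```

Deferring the expansion keeps the work done during the run to a minimum, which matches the motivation given above for logging a raw stream first.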

Multicore and multi CPU support

Another offline debug option is to simultaneously log multiple CPUs or cores. In either case, the process is exactly the same as previously described with one exception: one log file per core is generated. So, if two cores are being logged in the emulator (which happens simultaneously), then two replay files (if you’re using Questa Codelink) will be generated. This is efficient since the files can then be analyzed individually. For example, consider an ARM design with two cores, each running its own software, with a different developer responsible for each core and its associated software. Presumably, each developer would only be interested in debugging the CPU that he is working on, which the tools and workflow we’re describing in this article do allow.

During the debug session, the tool allows for viewing of multiple CPUs side by side. Each view is synchronized, which means that stepping in one core (and waveform window) will adjust the second core accordingly.
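One way to picture this synchronization is a shared time cursor applied to each core’s log. The sketch below assumes each per-core log is a time-sorted list of `(time_ns, location)` pairs; that layout, the function names and the sample data are hypothetical and not taken from the tool.

```python
from bisect import bisect_right


def sample_at(log, time_ns):
    """Return the last (time_ns, location) entry at or before time_ns, or None."""
    times = [t for t, _ in log]
    i = bisect_right(times, time_ns) - 1
    return log[i] if i >= 0 else None


def synchronized_view(logs, time_ns):
    """What every core was doing at the shared cursor time."""
    return {name: sample_at(log, time_ns) for name, log in logs.items()}


# Invented per-core logs, one per replay file.
logs = {
    "core0": [(100, "init"), (200, "send"), (300, "wait")],
    "core1": [(150, "init"), (250, "recv")],
}

# Stepping core0 to its second entry moves the shared cursor to t=200,
# and core1's view follows to its latest entry at or before that time.
cursor_time = logs["core0"][1][0]
print(synchronized_view(logs, cursor_time))
# {'core0': (200, 'send'), 'core1': (150, 'init')}
```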

Multi domain debug environment

The same debug environment can be used in the logic simulator, thus extending the environment across different verification domains. This allows simulation and platform developers to more easily move between these domains. Why is this valuable? Consider logging a failure during a long-duration simulation, the kind of simulation that provides interesting software execution. Engineers can use the same software debug environment in the higher-performance hardware emulation domain without having to learn a new debug environment.

Conclusion

Offline debugging allows for a better, more flexible approach to debugging in general and can increase the number of tests you are able to run via emulation. The approach, which logs the CPU activity while the design runs in the hardware emulator and replays it outside of the emulator, allows for the emulator to be constantly used for different runs or by different engineers. Offline debugging is nonintrusive and doesn’t require design changes. The approach preserves original design behavior and allows for logging and debugging multi-core and multi-CPU designs in one user-friendly environment.

Tomasz Piekarz (tomasz_piekarz@mentor.com) and Joe Rodriguez (joe_rodriquez@mentor.com) are both based in Wilsonville, Ore.

Comments:


bmoyer

Total Posts: 387
Joined: Dec 2009

Posted on November 10, 2011 at 1:03 PM

Do you end up jockeying with others for live machine time? How do you deal with it?

JoeRod

Total Posts: 1
Joined: Nov 2011

Posted on November 17, 2011 at 2:32 PM

Bmoyer,
Veloce is a network resource. You can schedule jobs with LSF or Sungrid. Hope this helps,
Joe
