
Making SoCs Easier to Debug

A New Technology Aims to Track Both Software and Hardware Bugs

System debugging used to be fairly straightforward. Components were on a board, linked by tracks, and, with a ‘scope and probes, you could look at signals and work out what was happening. Of course it didn’t seem so simple at the time – isn’t hindsight great? In time, systems got more complex, microcontrollers got more complex, and the companies building ‘scopes and other tools for hardware debugging came up with more and more sophisticated (which implies expensive) products. Digital ‘scopes, logic analysers and emulators all helped engineers in their efforts to keep up. JTAG was created to provide an interface – now frequently to a PC as well as to specialist tools – as multilayer boards hid tracks, and it was then used to provide visibility of operations within the chip. The JTAG interface is now also used for software debugging, as through JTAG it is possible to control program execution, stepping through line by line, or to set breakpoints. JTAG can also be used to program flash memory. Processor manufacturers started fairly early on to provide proprietary analysis tools, and ARM, for example, provides a range of interfaces and on-chip capabilities for advanced debugging and analysis.

But we live in the age of the System on Chip (SoC) and all bets are off. Even a relatively conservative SoC is complex. When you get SoCs with multiple processing units (CPUs, GPUs, DSPs, etc.) and other IP blocks – often up to around 100 – life gets really interesting. Sure, JTAG can give you some information, but it is limited, and, compared to on-chip communication speeds, JTAG is horrendously slow. It also needs dedicated pins on the SoC. ARM has CoreSight, but what if you don’t have an ARM core?

A big issue with SoC development is the hardware vs. software finger pointing.

“My hardware is fine – your software must be buggy.”

“My software is fine – it must be a hardware issue.”

When, in fact, the real problem may be caused by a subtle interaction between the hardware and software that is difficult to locate and resolve.

Solving problems in an SoC is a non-trivial matter. A software bug can be time-consuming to identify and fix. A hardware issue is also time-consuming to identify, and then, if a re-spin is needed, that takes even more time and huge amounts of cash. All of this happens in marketplaces where time-to-market is vital and a delay can mean the difference between product success and abject failure. Even when things do get fixed, rumours and speculation can inflict damage.

Step forward, UltraSoC. This is a UK-based company, commercialising research from the University of Kent, which provides UltraDebug IP and UltraAnalytics tools for SoC debugging and analysis.

The IP is a growing family of blocks that interface to the IP making up the SoC. There are already blocks for ARM, MIPS, and Imagination processors, for CEVA DSPs, for an array of buses, and for arbitrary logic blocks. These communicate through messaging across a coherent fabric within the SoC and then use JTAG or USB to talk to the outside world. USB was recently introduced and provides a much higher data rate than JTAG, and Ethernet and PCIe are on the way.

The outside world can be UltraSoC’s own Eclipse-based IDE or a test environment from a third party, like Teledyne LeCroy (a collaboration was announced in May 2015) or Lauterbach. Each block is non-intrusive and tailored to interface to its target. For an ARM core, for example, it can use the wide variety of debugging resources that ARM provides within its cores, with the designer choosing which to implement. Depending on the need, information from the target can be streamed live to the host, or it can be stored for later download, either in local memory within the block or in the main system memory.

From the IDE or from the test equipment, the user can set a block to monitor data or the execution of specific functions and carry out measurements. Depending on the target, the block can also control execution or traffic flow and carry out actions – like halting execution – if certain pre-set test conditions occur.
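To make the idea of a pre-set condition triggering an action concrete, here is a minimal sketch in Python. It is purely illustrative and is not UltraSoC's interface: the `BusEvent` record, the `run_until_trigger` function and the protected address range are all hypothetical names invented for this example. It models a monitor that watches a stream of events and "halts" (stops consuming events) the first time the condition matches.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class BusEvent:
    """Hypothetical record of one observed bus access."""
    cycle: int
    address: int
    is_write: bool


def run_until_trigger(
    events: Iterable[BusEvent],
    condition: Callable[[BusEvent], bool],
) -> Tuple[List[BusEvent], Optional[BusEvent]]:
    """Observe events in order; stop at the first one matching the condition.

    Returns the events observed so far and the triggering event (or None).
    """
    seen: List[BusEvent] = []
    for event in events:
        seen.append(event)
        if condition(event):
            return seen, event  # model of "halt execution"
    return seen, None


# Assumed test condition: any write into a protected address window.
PROTECTED = range(0x4000, 0x5000)


def write_to_protected(event: BusEvent) -> bool:
    return event.is_write and event.address in PROTECTED


trace = [
    BusEvent(cycle=1, address=0x1000, is_write=True),
    BusEvent(cycle=2, address=0x4010, is_write=False),
    BusEvent(cycle=3, address=0x4020, is_write=True),
    BusEvent(cycle=4, address=0x2000, is_write=True),
]
observed, trigger = run_until_trigger(trace, write_to_protected)
```

In this toy trace the read at cycle 2 falls inside the protected window but does not match, because the condition asks for writes only. The write at cycle 3 is the trigger, and the event at cycle 4 is never observed.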

Since the blocks can also monitor bus traffic, it becomes possible to look for issues of communication between elements of the SoC, which are normally very hard to track down.

The way in which data can be displayed is highly flexible. For example, you could look for throughput, latency, worst case, and best case. You can see program code as it executes and generate histograms of data flow. And, of course, you can develop your own analyses.
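As a rough illustration of the kind of analysis described here, the following Python sketch computes best-case, worst-case and mean latency, overall throughput, and a latency histogram from a list of transactions. The `Transaction` record, its fields and the bucket width are assumptions made for the example. They do not reflect the format of any real UltraSoC trace data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transaction:
    """Hypothetical record of one completed bus transaction."""
    start_ns: int   # time the request was issued
    end_ns: int     # time the response completed
    num_bytes: int  # payload size


def summarise(transactions: List[Transaction], bucket_ns: int = 50) -> Dict:
    """Summarise latency and throughput for a non-empty list of transactions."""
    latencies = [t.end_ns - t.start_ns for t in transactions]
    # Observation window: first request issued to last response completed.
    window_ns = max(t.end_ns for t in transactions) - min(
        t.start_ns for t in transactions
    )
    total_bytes = sum(t.num_bytes for t in transactions)
    # Histogram keyed by the lower edge of each latency bucket.
    histogram = Counter((lat // bucket_ns) * bucket_ns for lat in latencies)
    return {
        "best_ns": min(latencies),
        "worst_ns": max(latencies),
        "mean_ns": sum(latencies) / len(latencies),
        "throughput_bytes_per_ns": total_bytes / window_ns,
        "histogram": dict(sorted(histogram.items())),
    }


trace = [
    Transaction(start_ns=0, end_ns=40, num_bytes=64),
    Transaction(start_ns=10, end_ns=130, num_bytes=64),
    Transaction(start_ns=100, end_ns=160, num_bytes=128),
    Transaction(start_ns=200, end_ns=400, num_bytes=64),
]
summary = summarise(trace)
print(summary)
```

The single slow transaction (200 ns) dominates the worst case while barely moving the mean, which is why it helps to look at the best case, the worst case and the histogram together rather than at an average alone.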

The blocks are configurable at design time, providing different features depending on the function. The trade-off is essentially sophistication and multiple features against die area. Blocks can also be configured at system start-up, running only specific features rather than the whole block.

Now, this all takes real estate on the chip, something that may be an issue. According to UltraSoC, the overhead is around one percent or less, and this has been acceptable for some of their customers, while at least one other uses UltraSoC only for FPGA prototyping. But having the debugging capability in place for field deployment can be a positive thing. There are a number of possible use cases. For example, information from monitoring active devices in the field can be used as a basis for optimising software to improve things like power consumption, or for improving performance by moving functions between different parts of the SoC. In a safety-critical implementation, the same optimisation and improvement criteria apply. But the monitoring information can also be used to create a “black box”, preserving information for later analysis should there be a system failure. A third, and related, area is high-reliability applications. In communications there is often a commitment to “five nines” – that is, that a service will be available for 99.999% of the time (equivalent to 5.26 minutes of downtime in a year). If an operator doesn’t achieve this, then there may be financial penalties. Even the slightest glitch will threaten the target, so monitoring and recording, again acting as a virtual black box, is a valuable resource for problem resolution and well worth the chip real estate.
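The “five nines” figure is easy to check. The short Python sketch below converts an availability target expressed as a number of nines into permitted downtime per year. It assumes an average year of 365.25 days, and the function name is just a label chosen for this example.

```python
# Assumption: an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Permitted downtime for an availability of `nines` nines.

    For example, nines=5 means 99.999% availability.
    """
    unavailability = 10 ** -nines
    return unavailability * MINUTES_PER_YEAR


for n in (3, 4, 5):
    print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes per year")
```

Each extra nine cuts the allowance by a factor of ten, so at five nines the budget is a little over five minutes for the whole year, which leaves very little room for an undiagnosed glitch.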

So – a good academic idea and a company created to push it. We have seen this many times before, and not all such ventures have translated into commercial success. UltraSoC is shy about discussing company names: the customers aren’t too happy for it to be known they are adopting the technology, which they see as providing a competitive advantage, but the company claims that there are three devices already in volume manufacture and several more have been taped out.

With such complex IP, there is a need for strong customer support, and UltraSoC recognises this, with what the company feels is good documentation and a strong engineering team.

The senior management has recently been strengthened. Rupert Baines, who was appointed CEO in April this year, has a track record in developing technology companies, including Picochip (taken over by Mindspeed, in turn bought by Intel and M/A-COM). He also spent time with first:telecom, Arthur D Little, and Analog Devices. Only a few days ago, Gadge Panesar joined as CTO. Gadge is an experienced architect, and, in addition to time in academia, he has worked for Acorn Computers, INMOS and STMicroelectronics, Picochip, Mindspeed, and, most recently, NVIDIA.

The company was at DAC and was kept busy running its demos (see http://www.ultrasoc.com/pre-and-post-silicon-debug-at-dac-2015/ for videos) and presenting a paper on design for analytics. (That is explained on the same page as the link above.)

Just as I was finishing drafting this article, the company announced a new collaboration, this time with Cadence to support Tensilica’s Xtensa family of customisable processors and DSPs.

The impression I get, and I hope I have managed to convey, is that UltraSoC is on the verge of really making a splash. If I were about to start work on designing an SoC (thank heavens I am not) I would have a really close look at the company.
