Two Cores When One Won’t Do

Synopsys Announces Dual-Core Module for ASIL-D

Do you trust your processor?

Yeah, you’re right; that’s not a fair question. If the question is reworded as, “Will your processor always give the correct result?” then the obvious comeback is, “Correct according to what?” If there’s a bug in the software, then the processor will give the correct – but not the desired – result.

So let’s assume good software. Now will the processor always give the correct – and desired – response?

Well, what if there’s a bug in the hardware? Of course, many of you reading this may well be deep in the throes of making sure that’s not going to be the case on your processor. As with software, it’s hard to guarantee that hardware has zero bugs. But, unlike software, great gobs of money and effort are expended on taking the bug count asymptotically close to zero.

So if we assume a good job has been done on verifying the processor, then can we (please) trust our processor?

Yes. Well… maybe. What are you doing with it? If you’re liking a dog picture on social media, then sure. But if my life depends on it? If it runs my car and is the main thing determining whether or not I become a roadside casualty, then… maybe not so much.

Even if the processor design team had truly discovered and resolved all issues, some of those issues aren’t binary. In particular, performance issues are verified to some level of confidence. It’s never 100%. Yeah, you can embrace 6σ, but what if some unlikely condition out at 7σ occurs?

Then there are uncontrollables like alpha particles. Or silicon wear-out, or walking-wounded issues that manifest later. Some of these may be temporary, some fatal.

So even the most tech-naïve among us know that we can’t count 100% on simple technology all the time, and we make allowances when that web page doesn’t come up the first time or when our call drops.

Running the Critical Parts of the Car

There’s an old set of jokes about what it would be like if cars were run by Windows. Things like, every now and then having to pull over to the side of the road, shut the engine down, and restart it – for no particular reason. That’s all funny – until you realize that upcoming self-driving cars are going to feature technology nominally of the same sort that occasionally serves up a blue screen (whether or not branded by Microsoft).

So if we can’t 100% guarantee outcomes for so-called safety-critical operations – circuits in planes and trains and automobiles and medical devices and nuclear power plants – then how can we trust that those circuits won’t be our undoing?

In the automotive world, the ISO 26262 standard lays out expectations for different sorts of functions according to how likely a hazardous situation is to arise, how much control the driver has, and what the consequences of failure would be. These are given ASIL ratings: A (of least concern) to D (stuff better work or people could die).

So, out at that ASIL-D level, what do you do?

This concern has long been a factor in the mil/aero industries, where planes need to stay aloft and munitions must not deviate from their trajectories. One of the solutions there is referred to as “triple modular redundancy” (TMR). This idea, oversimplified, makes the assumption that, by tripling up the computing at critical nodes, if one processor has an issue (low probability if designed well), then the other two are even less likely to have the same issue. So if the three processors don’t all agree, a two-out-of-three vote settles the argument. Democracy in action!
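The two-out-of-three vote can be sketched in a few lines. This is a toy software model under stated assumptions, not how any particular mil/aero voter is built: `majority_vote` is a hypothetical name, and the three "processors" are just three values handed to it.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def majority_vote(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], bool]:
    """Vote across three redundant outputs.

    Returns (voted_value, all_agreed). voted_value is None when no two
    outputs agree, which is the case TMR cannot resolve.
    """
    if len(outputs) != 3:
        raise ValueError("TMR expects exactly three outputs")
    # most_common(1) yields the single most frequent value and its count.
    value, count = Counter(outputs).most_common(1)[0]
    if count >= 2:
        return value, count == 3
    return None, False


# One processor glitches; the other two outvote it.
print(majority_vote([7, 9, 7]))
```

Note that the voter can tell you which value won, but only the `all_agreed` flag tells you that something went wrong along the way.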

This works – for a price. In that market, prices are indeed higher to support this extra cost burden (and many others). The same can’t be said, however, for the automotive market. Lives are still at stake, but shaving costs is critical. In this case, there’s a different way of handling processor failure. It still involves redundancy, but less than TMR.

The automotive approach is to use two instead of three processors. And, instead of three processors without hierarchy, the dual-core approach has a main processor and a shadow processor that acts as a double-check. Synopsys has announced a dual-core module targeting ASIL-D applications, referring to instances of it in a circuit as “safety islands.”

[Figure: Synopsys ASIL D Ready dual-core lockstep processor IP]

(Image courtesy Synopsys)

The idea here is that the main core has primacy, but it’s got this shadow core looking over its shoulder. If the shadow doesn’t agree with a result that the main core produces, it alerts. What happens then depends on the application; think of it as throwing an exception, and the code has to determine the error handler. Except that, this being hardware, there are several options for manifesting a (hopefully) graceful exit from the state of grace.

When such a disagreement occurs, a two-bit error signal is set – and remains set until specifically reset. The state of the cores is also frozen for forensic or debug purposes. For recovery, you get three options: reset the core; interrupt the core; or send a message to a host processor. Synopsys sees the first two as most likely, since trust in the main core is now compromised (even though it’s theoretically possible that it could be the shadow core that glitched).
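To make the sequence concrete, here is a minimal behavioral sketch of a sticky error flag, a frozen snapshot, and a configurable recovery choice. Everything named here is an assumption for illustration: `SafetyMonitor`, `Recovery`, and especially the two-bit encodings, which are placeholders and not taken from any Synopsys documentation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

ERROR_CLEAR = 0b00     # assumed encoding, for illustration only
ERROR_MISMATCH = 0b11  # assumed encoding, for illustration only


class Recovery(Enum):
    RESET_CORE = "reset the core"
    INTERRUPT_CORE = "interrupt the core"
    NOTIFY_HOST = "send a message to a host processor"


@dataclass
class SafetyMonitor:
    policy: Recovery                 # chosen by the application
    error: int = ERROR_CLEAR         # two-bit signal, sticky once set
    frozen: Optional[dict] = None    # snapshot kept for forensics/debug

    def report_mismatch(self, main_state: dict, shadow_state: dict) -> Optional[Recovery]:
        """Latch the error and freeze both states on the first disagreement."""
        if self.error != ERROR_CLEAR:
            return None  # already latched; keep the original snapshot
        self.error = ERROR_MISMATCH
        self.frozen = {"main": dict(main_state), "shadow": dict(shadow_state)}
        return self.policy

    def clear(self) -> None:
        """The explicit reset; nothing else clears the flag."""
        self.error = ERROR_CLEAR
        self.frozen = None


monitor = SafetyMonitor(policy=Recovery.RESET_CORE)
action = monitor.report_mismatch({"pc": 4}, {"pc": 8})
print(action.value, bin(monitor.error))
```

The design point the sketch tries to capture is that a later mismatch does not overwrite the first snapshot, so whoever does the debugging sees the state at the moment trust was lost.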

Simple in Principle, But…

So far, so good. But… what happens if some event occurs – a power glitch, an alpha particle, whatever – that affects both processors? As circuits get smaller, even localized events start to affect more circuitry at the same time. If that happens, the main core might generate an incorrect result – and the supervisor, still reeling from the same event, might go along with it. Not a good thing at 70 mph.

So the module includes a notion called “time diversity” – the shadow core does what the main core does, only one or two clock cycles later. (The specific number of cycles is programmable.) This makes it much less likely that something affecting the main core will affect the shadow core equally.

This is done with a FIFO in the safety monitor; the main core’s inputs and result are pushed into the FIFO so that they can be compared at a (slightly) later time with the shadow core’s outcome. This comparison is done for each clock cycle.
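A toy simulation shows why the delay matters. It assumes a made-up one-cycle "core" (`compute`) and a glitch that corrupts whatever both cores happen to be working on at one instant; `run_lockstep` and its parameters are hypothetical names, and real hardware compares signals, not Python integers.

```python
from collections import deque
from typing import List, Optional, Sequence


def compute(x: int, glitched: bool) -> int:
    """Toy core: doubles and adds one; a glitch flips the low result bit."""
    result = 2 * x + 1
    return result ^ 1 if glitched else result


def run_lockstep(inputs: Sequence[int], delay: int,
                 glitch_cycle: Optional[int] = None) -> List[int]:
    """Return the steps on which the shadow disagreed with the main core."""
    fifo = deque()   # holds (step, input, main result) until the shadow catches up
    flagged = []
    for now in range(len(inputs) + delay):   # extra cycles drain the FIFO
        glitched = (now == glitch_cycle)     # the event hits both cores at once
        if now < len(inputs):
            fifo.append((now, inputs[now], compute(inputs[now], glitched)))
        if now >= delay:
            step, x, main_result = fifo.popleft()
            if compute(x, glitched) != main_result:   # shadow runs `delay` cycles late
                flagged.append(step)
    return flagged


data = [10, 11, 12, 13, 14, 15]
print(run_lockstep(data, delay=0, glitch_cycle=3))  # [] : both cores wrong together
print(run_lockstep(data, delay=2, glitch_cycle=3))  # [1, 3] : the glitch is caught
```

With no delay, the glitch corrupts the same step in both cores and the comparison passes. With a two-cycle delay, the cores are on different steps when the glitch lands, so each corrupted result is checked against a clean one.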

Which raises a new question: what is a “result”? Some instructions take more than one cycle to complete; what’s the intermediate result? Some instructions perform a calculation, in which case there is a specific result. But others might store data into memory – what exactly is the result there? Do you then go test whether the data truly ended up in memory? Does the shadow core do a test-and-store if the to-be-stored values disagree?

There are a couple of pieces to the answers. First, you can’t have results with definitions that vary according to the application; that’s just crazy-making. Instead, there’s some subset of the internal state that gets compared. That then works for each clock cycle, regardless of the specific instruction.

The other piece is that the shadow core can read from memory, but it can’t write to it. It’s not there to “do” anything; it simply supervises, tattling when there’s an issue.
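One way to picture both pieces together: every instruction yields the same fixed slice of state, and a store on the shadow core produces that slice without touching memory. The sketch below models only the store path, and the choice of compared fields (program counter, address, data, write request) is purely an assumption; the article does not say which state Synopsys actually compares.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Signature = Tuple[int, int, int, bool]  # hypothetical: (pc, address, data, write request)


@dataclass
class Core:
    memory: Dict[int, int]   # shared with the other core
    can_write: bool          # True for the main core, False for the shadow
    pc: int = 0

    def store(self, addr: int, value: int) -> Signature:
        """Execute one store and return the compared slice of state."""
        self.pc += 4
        if self.can_write:
            self.memory[addr] = value   # only the main core commits the write
        return (self.pc, addr, value, True)


memory: Dict[int, int] = {}
main = Core(memory, can_write=True)
shadow = Core(memory, can_write=False)

# The shadow "stores" a wrong value: the signatures differ, memory is untouched by it.
print(main.store(0x100, 42) == shadow.store(0x100, 43), memory)
```

Under this model there is no need to read memory back to check a store; a disagreement shows up in what each core was about to write.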

Synopsys says that dual-core processors aren’t a new thing, but most are higher performance. They say that their ARC-based dual-core module – intended specifically for ASIL-D usage – is the first one in the microcontroller range.

All of this effort so that, when you’re cruising down the coast, hair blowing all over, magical tunes blaring from your speakers, and your car doing all the work automatically, you won’t have to think about your processors. You’ll just trust them.

More info:

Synopsys ARC Safety-Island IP
