
Certifying the Certifier

OneSpin Talks About the Extra Burden of Proof

OK, people: it’s time to talk again about how not to hurt or kill people (or other living things) with electronic gadgetry. Or more-than-gadgetry, like cars that have the temerity to drive themselves. There are so many angles from which to approach how all those functions in such machines can be made safe; we take on yet another one today.

This discussion stems from a conversation with OneSpin at this summer’s DAC. Seems like it was just about this time last year that we talked about how EDA and functional safety work together, but, based on some recent certification announcements, this year we have a view from a different stance.

Can I Prove that my System is Safe?

With specs like ISO 26262 and DO-254, a big burden falls on you developers to show that what you’ve created is safe. And that includes showing that the tools you relied on can themselves be trusted. So, for EDA companies, there are two important aspects of certification: helping their customers get certified and getting their own tools certified. The latter, of course, helps with the former.

When we talk about helping you to get certified, that tools-certification thing is only part of the answer. There are elements that you need to prove yourself, and tools can assist with that. Information can also help – like safety kits and safety manuals. The latter spell out the safe way to use a given tool; the metrics that prove safety apply only if the safety manual is followed.

There are three metrics that matter, but there’s a notion that we need to tease out before we look at them: the difference between basic failures and latent faults. Note that these metrics assume a single fault at a time.

A basic failure is pretty much what you think: something in the functional behavior of a device that doesn’t work as expected. We put in detection circuits to find such failures so that, if the failure occurs, we can move the system into a protected, safe state.
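As a mental model – not OneSpin’s tooling, just a toy Python sketch of what a hardware mechanism like lockstep or ECC does – the detection logic duplicates or checks the computation and forces a safe state on any disagreement:

```python
# Toy model of a fault-detection mechanism: duplicate the computation,
# compare the results, and force a safe state on any mismatch.
# Real designs do this in hardware (lockstep cores, ECC, parity);
# this sketch only fixes the idea.

SAFE_STATE = "outputs_disabled"

def run_with_detection(primary, shadow, x):
    p, s = primary(x), shadow(x)
    return p if p == s else SAFE_STATE

# Healthy system: both copies agree, and the normal output flows through.
assert run_with_detection(lambda x: x + 1, lambda x: x + 1, 41) == 42

# A fault in the primary copy: the comparison catches it, and we go safe.
assert run_with_detection(lambda x: x ^ 4, lambda x: x + 1, 41) == SAFE_STATE
```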

But what if the detection circuit fails? That’s considered a latent fault. It’s not going to be evident until a basic failure happens and isn’t detected. With that, then, the three metrics OneSpin described are:

  • Single-point fault metric (SPFM)
  • Latent fault metric (LFM)
  • Probabilistic Metric for random Hardware Failures (PMHF); this is essentially a failures-in-time (FIT) rate.
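To make the bookkeeping concrete, here’s a minimal Python sketch of how these metrics fall out of per-element failure rates. The formulas are simplified from the ISO 26262 definitions, and the element names and FIT numbers are made up for illustration:

```python
# Simplified sketch of the ISO 26262 hardware architectural metrics.
# All element names and FIT numbers are hypothetical; a real analysis
# splits failure rates per failure mode, not per block.
# 1 FIT = 1 failure per 10^9 device-hours.

from dataclasses import dataclass

@dataclass
class Element:
    name: str
    fit_total: float    # total failure rate of the element (FIT)
    fit_residual: float # single-point/residual faults that violate the safety goal
    fit_latent: float   # latent faults, e.g., a dead detection circuit

elements = [
    Element("cpu_core",    fit_total=120.0, fit_residual=6.0, fit_latent=2.0),
    Element("ecc_checker", fit_total=15.0,  fit_residual=0.5, fit_latent=1.5),
]

lam_total    = sum(e.fit_total    for e in elements)
lam_residual = sum(e.fit_residual for e in elements)
lam_latent   = sum(e.fit_latent   for e in elements)

# SPFM: the fraction of faults that are NOT single-point or residual.
spfm = 1.0 - lam_residual / lam_total

# LFM: of the faults that remain, the fraction that are NOT latent.
lfm = 1.0 - lam_latent / (lam_total - lam_residual)

# PMHF, crudely: the residual rate of safety-goal violations, in FIT.
pmhf_fit = lam_residual

print(f"SPFM = {spfm:.2%}, LFM = {lfm:.2%}, PMHF ≈ {pmhf_fit} FIT")
```

For scale, ASIL D asks for roughly SPFM ≥ 99%, LFM ≥ 90%, and a PMHF below 10 FIT – so the toy numbers above would flunk the SPFM bar.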

There are tools that help with this, but, according to OneSpin, none works at the chip level, or even close to it. That means getting the numbers for the different pieces of the design and then somehow cobbling them together into a representation of the whole chip. OneSpin has now taken this on, however, and some of its tools can work at the chip level – even at the gate level.

But What About the Tools?

That’s all great, but, if you’re relying on such tools to show that you’re safe, then you need to know for sure that the tools proving your safety are correct and, by extension, safe. Doing so requires answering a couple of questions:

  • What is the tool impact? Can the tool inject faults? Can it fail to detect faults?
  • Based on the first question, what is the probability of detecting an error in the tool?

With respect to that latter one, there are three tool confidence levels (TCLs) at which a tool can be classified: TCL1, TCL2, and TCL3. TCL1 is for those lucky tools that have backups. For instance, if synthesis makes a mistake, equivalence checking can find the failure; so, as long as you run the equivalence check, you can live with a less-than-perfect synthesis tool – and you don’t have to do a tool qual.

But what about that equivalence checker? OK, maybe you have a tool that tests it. But, if you do, then how do you prove that that tool works? There’s always some last tool in the proof chain, and those last tools are what every other certification relies on. And OneSpin tends to find itself at the end of the chain, meaning TCL2 or TCL3.

What’s the difference between them? Well, that depends on the confidence in the tool’s error detection, TD (tool error detection). TD1 means high confidence; TD2, medium confidence; and TD3, low confidence. So you need TCL2 for a TD2 tool and TCL3 for a TD3 tool.
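That mapping is compact enough to write out. Here’s a sketch in Python of the TI/TD-to-TCL classification per the ISO 26262-8 tables; the enum and function names are ours, not the standard’s:

```python
# Tool confidence level (TCL) from tool impact (TI) and tool error
# detection (TD), following the ISO 26262-8 classification tables.

from enum import Enum

class TI(Enum):
    TI1 = 1  # the tool cannot introduce or fail to detect an error
    TI2 = 2  # the tool could introduce or mask an error

class TD(Enum):
    TD1 = 1  # high confidence an error would be prevented or detected
    TD2 = 2  # medium confidence
    TD3 = 3  # low confidence

def tool_confidence_level(ti: TI, td: TD) -> int:
    if ti is TI.TI1:
        return 1  # no possible impact: TCL1, no qualification needed
    if td is TD.TD1:
        return 1  # a backup (e.g., equivalence checking after synthesis)
                  # would catch the error: still TCL1
    return 2 if td is TD.TD2 else 3

# Synthesis backed by equivalence checking:
print(tool_confidence_level(TI.TI2, TD.TD1))  # -> 1
# A formal tool at the end of the proof chain, with no independent backup:
print(tool_confidence_level(TI.TI2, TD.TD3))  # -> 3
```

The last case is the practical point for an end-of-chain tool: with no independent backup, confidence in error detection is low, and TCL3 – the heaviest qualification path – is what’s left.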

There are four possible ways to attain TCL2 or TCL3 certification. The ways are the same for the two confidence levels; the difference lies in which ones are recommended vs. highly recommended for the various ASIL levels you might be going for.

The four possibilities are:

  • 1a: Get confidence from use. For EDA tools, it’s hard to get this info from customers – and you have to completely redo it with each new release.
  • 1b: Qualify the tool development flow.
  • 1c: Test the tools rigorously.
  • 1d: Develop the tool per the safety standard – which, OneSpin says, is a huge burden.

OneSpin did both 1b and 1c, announcing their certification by the European TÜV SÜD organization. That carries them until their next revision. Yes, we did say that, with 1a, each new revision requires a do-over. But that’s apparently not the case for 1b and 1c; incremental changes to the tool are easier to manage than the initial qual. So they will be updating the qual, but it won’t be quite so burdensome as what they’ve done so far.

The idea, then, is that they also have a safety kit – which includes the safety manual – so that customers can cite the kit (along with proving that they’ve followed the safety manual) to check off the tool certification requirement. They’ve covered both TCL2 and TCL3 – TCL2 on all tools and TCL3 on a customer-request basis.

Doing these certifications for ISO 26262 also helps them with other standards. There may be subtle differences, but much of the work can satisfy the needs of more than just one standard. For instance, their TÜV certification and resulting FPGA qualification kit cover ISO 26262, IEC 61508, and EN 50128. They also announced certification for DO-254.

While all EDA companies have this challenge ahead of (or behind) them, companies like OneSpin, whose tools sit at the end of the certification chain, bear the greatest burden of proof, since they’re the ones proving everyone else. So you’ll probably see announcements of a similar nature from all of them.

 

More info:

OneSpin
