When Reliability Analysis Meets Silicon Lifecycle Management

I’ve recently been chatting with folks from Synopsys and Concertio, and now my head is so full of “stuff” regarding things like hyper-convergent chip design, reliability analysis, real-time performance optimization, and silicon lifecycle management that I don’t know whether I’m coming or going.

I’ve told this tale before (and I’ll doubtless tell it again), but when I worked on my first digital ASIC design circa 1980 at International Computers Limited (ICL) in Manchester, England, the only design tools I had at my disposal were pencils, paper, data books, and my poor old noggin. Of course, the ASICs in question were implemented at the 5-micron technology node and boasted only a couple of hundred equivalent logic gates, but — still and all — they could be tricky little rascals if you didn’t keep a beady eye on them.

Functional verification involved the rest of the team looking at your gate- and register-level schematics and asking pointed questions like “What’s this bit do?” and “Why did you do it that way?” Once you’d answered these questions to their satisfaction, you moved on to timing verification. This involved identifying any critical paths by eye, and then calculating and summing the delays associated with those paths by hand (no one I knew owned so much as a humble 4-function electronic calculator).
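For anyone who has never had to do this sort of thing by hand, the arithmetic looks something like the following Python sketch. The gate names, the delay values, and the example path are all made up for illustration (they are not taken from any real data book), and the sketch assumes a simple register-to-register path checked against a clock period.

```python
# Hypothetical per-element delays in nanoseconds, of the kind you might
# have looked up in a data book. These values are invented for illustration.
GATE_DELAY_NS = {
    "DFF_CLK_TO_Q": 6.0,  # launching flip-flop, clock-to-output delay
    "NAND2": 3.0,
    "INV": 2.0,
    "NOR2": 3.5,
    "DFF_SETUP": 4.0,     # setup time of the capturing flip-flop
}

# A candidate critical path, identified "by eye" on the schematic.
EXAMPLE_PATH = ["DFF_CLK_TO_Q", "NAND2", "INV", "NOR2", "DFF_SETUP"]


def path_delay(path):
    """Sum the delays of every element along the path."""
    return sum(GATE_DELAY_NS[element] for element in path)


def slack(path, clock_period_ns):
    """Time to spare; a negative value means the path is too slow."""
    return clock_period_ns - path_delay(path)
```

Calling path_delay(EXAMPLE_PATH) adds up the five entries, and slack(EXAMPLE_PATH, 25.0) tells you how much margin remains in a 25 ns clock period. A negative slack is the cue to go back to the schematic.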

Later, when I moved to a company called Cirrus Designs, and then on to its sister company called Cirrus Computers, both of which were eventually acquired by the American test equipment company GenRad, I got to work with HILO-2, which was one of the early digital logic simulators. In fact, HILO-2 was the first true register transfer-level (RTL) simulator, although the term “RTL” had not yet been coined in those days of yore. This little scamp was extremely sophisticated for its time because it had three aspects to it: a logic simulator that could run using minimum (min), typical (typ), or maximum (max) delays; a fault simulator that could also run using min, typ, or max delays; and a dynamic timing simulator that could run using min-typ, typ-max, or min-max delay pairs.

I was particularly impressed with the dynamic timing simulator. Where the regular simulators would transition from 0 to 1 to 0 using the selected delay mode, the dynamic timing simulator would transition from 0 to “gone high don’t know when” to 1 to “gone low don’t know when” to 0, where the “don’t know when” states encompassed the relevant delay pair.
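To make the “don’t know when” idea a little more concrete, here is a minimal Python sketch. It is emphatically not how HILO-2 was implemented (I am not describing its internals here); the Edge class, the through_gate and state_at functions, and the delay values are all hypothetical names and numbers chosen purely for illustration. The sketch assumes a simple non-inverting gate with a (min, max) delay pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A transition known only to occur somewhere in [earliest, latest]."""
    earliest: float
    latest: float
    rising: bool


def through_gate(edge, d_min, d_max):
    """Pass an edge through a non-inverting gate with a (min, max) delay pair.

    The earliest possible output time uses the minimum delay and the
    latest uses the maximum, so the window of uncertainty grows.
    """
    return Edge(edge.earliest + d_min, edge.latest + d_max, edge.rising)


def state_at(edge, t):
    """Report the signal value at time t, including the ambiguous window."""
    if t < edge.earliest:
        return "0" if edge.rising else "1"
    if t < edge.latest:
        if edge.rising:
            return "gone high don't know when"
        return "gone low don't know when"
    return "1" if edge.rising else "0"


# An input that rises at exactly t = 10, then passes through two gates,
# each with a hypothetical min-max delay pair of (2, 5).
stimulus = Edge(10.0, 10.0, rising=True)
after_one = through_gate(stimulus, 2.0, 5.0)
after_two = through_gate(after_one, 2.0, 5.0)
```

Here after_one is uncertain between t = 12 and t = 15, and after_two between t = 14 and t = 20, which shows how the ambiguity accumulates along a path. A regular min, typ, or max simulation would report only a single transition time for the same signal.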

Although all of this took place only 30 to 40 years ago as I pen these words, I could never have envisaged the sorts of tools available to the design engineers of today. Of course, today’s design engineers need awesomely powerful and incredibly sophisticated tools because they are working with hyper-convergent designs and design flows. By hyper-convergent design, we are talking about things like a single die featuring a diverse set of analog, digital, and mixed-signal components, or a single package containing multiple dice that are potentially implemented on different process nodes. All of this involves larger and more complex circuits running at higher frequencies, the reduced margins and increased parasitics associated with advanced process nodes, and the need for faster, higher-capacity, accurate simulators.

Meanwhile, a hyper-convergent design flow is based on a common data model that is shared by all of the tools in the flow. Having a common data model facilitates the sharing of information between different phases of the design, thereby reducing design iterations and time-to-market (TTM).

Reliability Analysis

Do you recall my recent column here on EE Journal regarding the book A Hands-On Guide to Designing Embedded Systems by Adam Taylor, Dan Binnun, and Saket Srivastava? If so, you may remember my saying: “Since Adam is one of the few people I know to successfully (and intentionally) have his FPGA-based designs launched into space, there’s a huge amount of material on the topic of reliability, including how to perform worst-case analysis and how to evaluate the reliability of the system.”

The reason I mention this here is that one of the people I’ve been video-chatting with is Anand Thiruvengadam, who is director of product management at Synopsys. Anand’s mission was to heighten my understanding of Synopsys’s new PrimeSim Reliability Analysis solution.

Anand started by telling me that PrimeSim Reliability Analysis features a unified workflow of proven, foundry-certified reliability analysis technologies; that it provides faster time-to-reliability compliance for mission-critical applications; that it offers full lifecycle coverage that is compliant with standards such as ISO 26262; and that it’s tightly integrated with PrimeSim Continuum and the PrimeWave Design Environment. I tried to maintain what I hoped appeared to be a knowledgeable expression on my face, but I fear Anand was not fooled, so he quickly showed me the following diagram:

PrimeSim Reliability Analysis offers a unified workflow of proven technologies for full lifecycle reliability verification (Image source: Synopsys).

Ah, now everything is clear. PrimeSim Circuit Check (CCK) offers programmable static analog and digital circuit checks with full chip verification in a matter of minutes. PrimeSim Custom Fault improves test coverage, reduces defect escapes, verifies safety with ISO 26262 compliance, and accelerates silicon failure analysis. PrimeSim AVA provides fast design marginality analysis using machine learning (ML) running on the fly to capture 100X to 1,000X fewer samples while offering results whose accuracy is within 1% of PrimeSim HSPICE. PrimeSim SPRES provides signoff power/ground integrity analysis that augments EM and IR analysis, that can be deployed both early in the design cycle and later for signoff, and that provides fast static analysis, handling 1M+ element networks in minutes. Signoff EMIR analysis, which is used to ensure electro-thermal reliability, is provided by the combination of StarRC (high-capacity power and ground optimization), PrimeSim EMIR (high-performance, foundry-certified EMIR analysis), and Custom Compiler (advanced “what-if” analysis and debug). Last, but certainly not least, PrimeSim MOSRA means we can ensure long operating lifetimes with high-performance device aging analysis. Phew!

Silicon Lifecycle Management

Just when I thought things couldn’t get any more interesting… they did. This is the point where we switch gears slightly to consider the topic of the Synopsys integrated Silicon Lifecycle Management (SLM) platform called SiliconMAX, which helps us to improve silicon operational metrics at every phase of a device’s lifecycle through the intelligent analysis of ongoing silicon measurement. (The reason I said “switch gears slightly” is that Reliability Analysis and Silicon Lifecycle Management go hand-in-hand.)

The general SLM approach can be summarized as shown below. We start with a variety of monitors and sensors that are intelligently embedded throughout each chip, and we use these little ragamuffins to generate a rich data set that feeds analytical engines that enable optimizations at every stage in each device’s lifecycle — in-design (using sensor-based silicon aware optimization), in-ramp (using product ramp and accurate failure analysis), in-production (using volume test and quality management), and in-field (using predictive maintenance and optimized performance).

General SLM approach (Image source: Synopsys)

In June 2020, Synopsys announced it had acquired Qualtera, a fast-growing provider of collaborative high-performance, big data analytics for semiconductor test and manufacturing. In November 2020, it announced that it had acquired Moortec, a leading provider of in-chip monitoring technology specializing in process, voltage and temperature (PVT) sensors.

As an aside, by some strange quirk of fate, I wrote a column on Moortec just a few months before the Synopsys acquisition took place (see Distributed On-Chip Temperature Sensors Improve Performance and Reliability). This isn’t the first time this has happened to me. In one case, a startup company for whom I created the web content was acquired before they’d had the time to pay me (the new parent company paid me later). Of course, I’m not saying that my columns are so powerfully presented that any company I write about is certain to be acquired (but I’m not not saying it, either).

All of this brings us to a company called Concertio, which developed an innovative AI-powered performance optimization tool whereby an SLM agent continuously monitors the interactions between operating applications in the field and the underlying system environment.

Well, wouldn’t you just know it? On 1 November 2021, Synopsys announced that it was enriching its SLM solution with real-time, in-field optimization technologies by acquiring Concertio. Happily, I got to chat with Steve Pateras, the Senior Director of Marketing for Test Products in the Design Group at Synopsys.

What we are talking about here is adding instrumentation without affecting performance or the design flow. In addition to things like temperature and voltage, this instrumentation will even be able to measure path delays in silicon chips. The resulting data is amassed not only “now,” but over time. Using this information, the agent’s optimization engine can adapt and reconfigure the system dynamically — modifying application settings, operating system settings, and network settings — resulting in a self-tuning system that’s always optimized for its current usage. These agents can be deployed in tiny devices, capable devices, and powerful devices to provide dynamic optimization-as-a-service at any scale.
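The general shape of such a self-tuning loop can be sketched in a few lines of Python. To be clear, this is not Concertio’s or Synopsys’s algorithm or API, neither of which is described here. The function names, the thread-count setting, and the throughput model are all hypothetical stand-ins, and the “measurement” is a made-up formula rather than real instrumentation data.

```python
# Hypothetical stand-in for real measurements: each workload has a
# thread count at which throughput peaks. All numbers are invented.
BEST_THREADS = {"batch": 8, "interactive": 4}


def simulated_throughput(threads, workload):
    """Pretend measurement: falls off as we move away from the peak."""
    return 100 - (threads - BEST_THREADS[workload]) ** 2


def tune(measure, candidates, current):
    """Measure each candidate setting and keep whichever scores best.

    The current setting is kept unless a candidate strictly beats it.
    """
    best, best_score = current, measure(current)
    for setting in candidates:
        score = measure(setting)
        if score > best_score:
            best, best_score = setting, score
    return best, best_score


def retune_for(workload, current_threads):
    """Re-run the tuning step whenever the observed workload changes."""
    return tune(
        lambda threads: simulated_throughput(threads, workload),
        candidates=[2, 4, 8, 16],
        current=current_threads,
    )
```

With these made-up numbers, retune_for("batch", 1) settles on 8 threads, and a later call to retune_for("interactive", 8) moves the setting to 4. A real agent would measure live hardware and software behavior, and would search far more settings than a single thread count.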

With the addition of Concertio, Synopsys’s SLM capabilities now extend beyond design and test into the realm of in-field optimization, providing a way to perform real-time analysis and optimization of software running on systems such as the high-performance processors powering data centers or the compute engines in automotive applications.

I don’t know about you, but my head is currently buzzing with ideas. Using the technologies discussed in this column, we can now design ultra-reliable systems and then monitor and fine-tune their performance in the field. We truly do live in an age of wonders. What say you? What are your thoughts on all of this?
