
Avoiding Failure Analysis Paralysis

Cadence Describes the DFM-Diagnostics Link

Back when I was a product engineer working on bipolar PALs (oops – I mean, PAL® devices), one of my main activities was figuring out what was wrong. That was most of the job of a product engineer: fix what’s broken. You don’t spend any time working on the stuff that’s working, you work on what isn’t working. Assuming it’s a chip that’s wrong, the process would typically start with a trip into the testing area to put a part on the tester and datalog it to see some evidence of things going awry. Armed with that, the next job was to spread a big schematic out on a table and start looking at the circuits, figuring out what could be causing the problem. You’d come up with a couple of scenarios, and next you’d have to look in the actual chip.

Of course, in order to look at the chip, we had to spread a big layout sheet on a table to trace out where the circuits were physically located. Then we’d know where to look. The chip would have to be decapped – I could do that myself if it was a CERDIP (ceramic packaging, where you could pop off the top); otherwise you needed to go to one of those scary guys that knew just a bit too much about chemistry (and whom you wanted to keep happy with occasional gifts of jerky or sunflower seeds) to have a hole etched in the plastic. Hopefully that was enough, and then you could go into the lab and use microscopes and microprobes and oscilloscopes and such to poke through dielectric layers, perhaps cut a metal line to get to something below, and with any luck you’d identify a problem that could be fixed. In the worst case you had to go back to Scary Guy for more delayering, or perhaps a SEM session. Or – yikes – chemical analysis. It was all seat-of-the-pants, using forensic techniques worthy of CSI – Jurassic Edition, and you let your data and observations tell you what the next step should be.

Unfortunately, a few things have changed to complicate this serene pastoral picture of the past. Start with, oh, about a thousand more pins on the chip. Shrink the features way down, and multiply the number of transistors by, oh, say, a lot. Throw on a few extra layers of metal for good measure, and, well, you gotcherself a problem.

Diagnosing failing dice and then turning that information into useful measures for improving yields on current and future circuits is no trivial matter anymore. Not only have the technical issues become thornier, but business issues have intruded into the picture as well. The urgency has also grown with the focus on Design for Manufacturing (DFM), an admittedly somewhat ill-defined collection of technologies for improving the manufacturability of sub-wavelength chips (and whose real benefit is still subject to debate).

Following up on a presentation at the ISQED conference, I was able to sit down with some of the folks from Cadence to get their view of what life looks like now. The process boils down to something that sounds rather straightforward and familiar: develop hypotheses about possible failure modes; gather lots of manufacturing data to support or weaken those hypotheses; and then narrow down the range of options for physical failure analysis (done by the modern-day scary guys – in the gender-neutral sense – who actually tear the stuff apart).

The challenges are partly those of scale. It’s no longer an easy matter to unroll a paper schematic onto the table in the Board Room. We’ve now gone paperless, and, even so, there are just too many things going on in a circuit to try to trace them by hand. That’s where tools can come in and identify, through simulation, all the logical scenarios that could contribute to the observed failure. The kinds of issues to be reviewed include not only the traditional stuck-at faults, but also timing problems. An observed behavior could originate at any of a number of logic nodes, and having the candidates identified automatically gives you a solid short list of suspects in far less time. Those candidates are Pareto-ranked by level of confidence.
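To make that concrete, here’s a toy sketch of signature-matching diagnosis (my own illustration with made-up nets and pattern numbers, not Cadence’s actual algorithm): simulate each candidate fault, compare the test patterns it would fail against what the tester datalog actually shows, and rank candidates by how well they agree.

```python
# Toy signature-matching fault diagnosis (illustrative only -- not
# Cadence's algorithm). Each candidate fault has a simulated "signature":
# the set of test patterns it would cause to fail.

def rank_candidates(observed_fails, fault_signatures):
    """Score each candidate by how well its predicted failing patterns
    match the failures seen on the tester, then rank best-first."""
    ranked = []
    for node, predicted_fails in fault_signatures.items():
        matched = len(observed_fails & predicted_fails)
        # Jaccard-style score: penalizes both misses (observed but not
        # predicted) and mispredictions (predicted but not observed).
        union = len(observed_fails | predicted_fails)
        confidence = matched / union if union else 0.0
        ranked.append((confidence, node))
    return sorted(ranked, reverse=True)

# Hypothetical datalog: patterns 3, 7, and 12 failed on the tester.
observed = {3, 7, 12}
signatures = {
    "U42/Y stuck-at-0": {3, 7, 12},      # explains everything
    "U17/A stuck-at-1": {3, 7, 12, 19},  # explains too much
    "net_n1024 bridge": {7},             # explains too little
}
for conf, node in rank_candidates(observed, signatures):
    print(f"{conf:.2f}  {node}")
```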

The next step was, for a while, something of a problem. This kind of yield analysis is most useful in the early days of a process. But let’s face it: with masks costing what they do, you have to have the opportunity for a huge yield increase to warrant new masks on a current product. As a result, while testing and inspection procedures for current products may benefit, many times the kinds of design improvements you learn about will apply only to future products. This kind of learning makes far more sense early in the lifetime of a given process.
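The arithmetic behind that reluctance is easy to sketch (the numbers below are invented purely for illustration, so don’t price a respin off them): the yield gain has to pay for the new mask set over the product’s remaining volume.

```python
# Back-of-the-envelope respin economics (all numbers are made-up
# assumptions). A new mask set only pays off if the extra good dice
# over the product's remaining life cover its cost.

mask_set_cost    = 1_500_000  # assumed full-layer mask set, $
remaining_wafers = 20_000     # assumed remaining production volume
dice_per_wafer   = 400
value_per_die    = 5.00       # assumed selling price per die, $

def breakeven_yield_gain():
    # Yield points needed so added revenue equals the mask cost:
    # gain * wafers * dice/wafer * $/die = mask_set_cost
    return mask_set_cost / (remaining_wafers * dice_per_wafer * value_per_die)

print(f"Break-even yield gain: {breakeven_yield_gain():.1%}")
# ~3.8 yield points under these assumptions -- a big ask late in a
# product's life, which is why the learning mostly feeds forward
# into future designs instead.
```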

But making sense out of the failures requires manufacturing data. Lots of it. And early in the life of a process, manufacturing data doesn’t look so good. And historically, fabs have been reluctant to let the data out. This wasn’t a problem before foundries were routine; you owned your own fab (as “real men” did back in the day), and you went to talk to your colleagues there. With the fab now owned by a different company, and that company not wanting to look bad compared to other foundries, there was much resistance to being open with data.

This issue is now more or less behind us; there really is no way to do solid engineering without access to manufacturing data, so that business hurdle has been cleared. The result is the availability of data. Lots and lots of data. Tons of data. File it under “B” for “Be careful what you ask for.” The next challenge then becomes making sense of all of that data as it relates to the particular failure scenarios under consideration. You can now, for example, look at a wafer, or a series of wafers, to figure out where possible yield hot spots are. You can zero in on some dice of interest and look at the test and manufacturing data from those parts. The idea is to correlate the possible failure modes with actual observations to further refine the list of promising hypotheses. The once-daunting roster of things that might be wrong can be narrowed down, and the physical failure analysis folks will get a more focused list of things to look at for final evidence of what the issue is.
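As a crude illustration of that correlation step (hypothetical data and a vastly simplified schema, nothing like the real toolchain): if you suspect, say, metal-2 opens, the failing dice should line up with thin metal-2 linewidth measurements.

```python
# Sketch of correlating one failure hypothesis with die-level data
# (hypothetical records; real flows pull from test and inline-metrology
# databases). Idea: if metal-2 opens are the culprit, failing dice
# should cluster where metal-2 linewidths run thin.

from collections import defaultdict

# Hypothetical per-die records: (x, y, passed, metal-2 linewidth in nm)
dice = [
    (0, 0, True,  92), (0, 1, True,  90), (1, 0, False, 78),
    (1, 1, False, 80), (2, 0, True,  91), (2, 1, False, 79),
]

by_outcome = defaultdict(list)
for x, y, passed, m2_width in dice:
    by_outcome[passed].append(m2_width)

for passed, widths in by_outcome.items():
    avg = sum(widths) / len(widths)
    label = "passing" if passed else "failing"
    print(f"{label} dice: mean metal-2 linewidth {avg:.1f} nm")
# Failing dice running ~12 nm thinner here would strengthen the opens
# hypothesis and focus the physical failure analysis accordingly.
```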

Many of the tools for this flow have been around for a while. For example, Cadence has had its Encounter Diagnostics tool since 2004. One of the missing links has been a means of viewing all of the manufacturing data in a coordinated manner; right now you more or less have to look at the data in an ad hoc fashion. Cadence has been working on a tool, used so far in some select situations, to help bridge the analysis of manufacturing data back to the original design; they’re still in the productization stage, but intend that this be a key piece of automation in a feedback loop that can refine the design and manufacturing rules.

So while the concepts driving yield enhancement haven’t changed, the motivations have gotten stronger, and tools have become critical for managing the complexity and the amount of data required for thorough analysis. On the one hand, it kinda makes you pine for a simpler time, when you just kind of rolled up your sleeves and sleuthed around. On the other hand, you can now focus more energy on those parts of the process where the human brain is the key tool, and let the EDA tools take care of some of the more mundane work.
