Trading Cards and the Art of Verification

When I was younger, I briefly became very interested in collecting trading cards – primarily baseball cards. It was always exciting to sort through my weekly pack. However, as time passed and my pile of cards grew, it became more difficult to find the ones that I needed to help fill out my collection. As a kid I was frustrated by what I thought was bad luck. Now years later I understand the math that made that frustration inevitable. It’s the same math that can lead to stymied attempts to achieve functional coverage closure in design verification using random stimulus generation.

The verification landscape has evolved significantly over the last decade. One oft-cited maxim is that the increase in verification complexity is proportional to a multiple of Moore’s Law – at least 10x for ASICs and maybe 100x or more when SOCs or embedded software is involved. This sort of scaling is bad news, especially since gate counts are already pushing several hundred million for a slew of popular consumer devices, such as game consoles.

Engineers often throw automation at increasingly complex problems, so it’s perhaps no surprise the same is true in verification. Constrained-random stimulus generation with stimulus functional coverage is perhaps the most obvious example of verification automation. This model-based approach to describing the stimulus domain and critical-to-cover portions of that domain injects automation into the process of generating verification stimuli. In addition to enabling a verification engineer to generate an order of magnitude more stimulus in a given period of time than would have been possible using directed tests, constrained-random stimulus generation produces “surprising” stimulus patterns – those that are valid according to the design specification but unlikely for the verification engineer to think of and test. Of course it’s useful to be surprised during verification, while it’s problematic or worse the closer you get to tapeout.

The main drawback to constrained random stimulus generation is unwanted repetition in the stimulus patterns. Repetition does not, in this case, breed success. With a small stimulus and functional coverage space, redundancy doesn’t pose a big obstacle to verification. However, as functional coverage and stimulus models become larger and more sophisticated, the limits imposed by this redundancy become apparent. Indeed the math behind these limits explains why it was difficult for me to ever complete my collection as I bought pack after pack of baseball cards.

Achieving functional coverage using randomly-generated stimulus, as well collecting a full set of trading cards, is described in probability theory by the Coupon Collector’s Problem. This problem derives the average number of random selections that must be made in order to collect a full set of items.

Assume we have a collection containing 50 cards. Each time we make a random selection, the probability of selecting a given card is one out of 50. However, the probability of selecting a unique card (one that will help us fill our collection) changes each time we add a new card to our collection.

When we make our first selection, we have a 100% probability of selecting a card that we need. By the time we’ve collected half the set, however, our probability of selecting a new item drops to 50% — we are just as likely to select a card that we already have as to select one that we need.

After a bit mathematical derivation, we find that if we want to collect a set of N items via random selection, we will need to make on the order of random selections. In other words, on average it will take us 195 random selections to complete our collection of 50 cards. Furthermore, collecting our last card once the other 49 have been collected will take us, on average, 50 random selections. With these statistics, it’s no wonder that achieving 80% functional coverage often seems easy, while the last 20% is painfully slow to achieve.

Figure 1 – Typical random coverage convergence on a 256-bin coverpoint

There is a complication, however, when attempting to use Coupon Collector’s Problem math to estimate the time it will take to achieve functional coverage closure — or to collect a full set of baseball cards for that matter. The equation models a uniformly-distributed random space, and neither baseball cards nor functional verification are quite that uniform. Complex constraint systems, where the constraint relationships require several improbable conditions to line up in order for a desired set of outputs to result, are one case where using the relationship to predict time to coverage closure breaks down.

Another case is where the stimulus space is significantly different than the functional coverage space. For example, consider the goal of trying to collect the team set for your favorite baseball team. In such a scenario, the N in the equation isn’t the number of players (the functional coverage space) on your favorite team. Rather, N is the number of unique cards in circulation (the stimulus space), which of course makes it much less probable that we can collect a full set simply by buying more packs of cards.

In verification, this problem occurs frequently because of functional coverage corner cases, and because some stimulus values or value ranges are simply more important to verify than others. For example, assume we are verifying a packet-processing engine that processes packets with payload lengths between 1 word and 16384 words. From a functional verification standpoint, we might wish to verify processing packets with very small payloads – perhaps 1, 2, 3, and 4-word, packets with very large payloads – perhaps 16381, 16382, 16383, and 16384-words, and 8 ranges of payload sizes in the middle. Our functional coverage model only has 16 coverage bins monitoring the packet size, but the probability of randomly selecting a payload of size 1 is actually 1 in 16384, making the probability of randomly testing these very important packet sizes is extremely low.

Figure 2 – Typical random coverage convergence on a 16-bin corner case coverpoint

Relying solely on randomness can lead to vast numbers of test cases that need to be run through a simulator, which of course takes time and money. I recall working with one customer that was trying to test 100,000 different combinations. Each test took about 30 minutes to run given the nature of the device and tasks to be executed.

When I started talking to the engineers working on this problem, they told me they’d been simulating for eight weeks running on multiple machines and, through random stimulus generation, had tested only slightly more than half those combinations. Here it’s important to stress that this goal, 100,000 test cases, is quite modest, especially in the context of verifying cutting-edge GPUs and CPUs. My point is that even modest goals can take a long time to reach. And as your goals get bigger, random stimulus generation might never get you to full coverage.

Currently, a common procedure for closing gaps in functional coverage is to manually inspect the coverage reports looking for functional coverage holes. In some cases, simply running more simulation – perhaps with a different random seed — may fill in the missing functional coverage. In other cases, a directed test may be developed to target the random-resistant stimuli. There are several problems here, of course. The manual analysis process is labor-intensive and probably not the best use of a skilled verification engineer’s time. In addition, using directed tests to target functional coverage holes somewhat defeats the purpose of using constrained random stimulus in the first place – the directed test only targets the stimulus values that we already knew we cared about. Finally, there’s a lot of uncertainty in the entire process, making it difficult to reliably stick to a verification schedule.

What’s needed is obvious: a stimulus-generation engine intelligent enough to look at our functional coverage model and prioritize generation of stimuli that increase the level of stimulus functional coverage. Fortunately, such intelligent testbench automation tools do exist, such as Questa inFact from Mentor Graphics. Obviously Mentor is not alone in this space. Cadence and Synopsys each have their own versions of this type of technology and several startups are bringing their own solutions to the market. The activity in the space is largely driven, I think, by the ubiquity of the problem.

Algorithms in Questa inFact comprehend the structure of the stimulus model, as well as being aware of the stimulus-coverage goals. These algorithms generate stimuli that prioritize efficiently targeting the stimulus coverage goals described by the coverage model, as well as eliminating redundant stimuli. Once a given stimulus coverage goal is satisfied, stimulus generation reverts to random generation, ensuring that “surprising” stimulus is still generated. The net result is an order of magnitude, or greater, increase in the rate of functional coverage closure. In addition, because the tools are able to reduce or eliminate redundant stimulus, coverage closure for stimulus functional coverage is a predictable process.

Here’s something else that’s predictable: kids are impatient. At least I was. I remember taking matters into my own hands by trading with friends for the last few cards I needed to have a complete set. Like most kids, I quickly figured out the folly of leaving everything to chance. That same lesson holds true when it comes to the limits of constrained random stimulus generation and the power of a more goal-oriented methodology, specifically intelligent testbench automation.

About the author:

Matthew Ballance is a Verification Technologist at Mentor Graphics, specializing in the inFact Intelligent Testbench Automation tool. He has 12 years of experience in the EDA industry, and has previously worked in the areas of hardware/software co-verification and transaction-level modeling.