As chip design becomes larger and more complex, verification engineers are expanding constrained-random testing to meet the validation demand. Yet this expansion in the use of constraints creates a new set of challenges. Because the size and complexity of constraint problems are growing, verification engineers now also face verification performance and capacity issues. As a result, it is no longer enough to write constraints that simply function to validate a design. Today, verification engineers must also optimize the constraints they write for performance if they wish to have any hope of both successfully validating their design and meeting their deadlines. A scalable methodology for writing constraints and maintaining performance is increasingly becoming a necessity as more engineers write and debug ever-increasing amounts of larger, more complex sets of constraints.

This two-part article series looks at a scalable constraint methodology and provides an overview of some of the key constraint optimization challenges and strategies of concern to verification engineers. Part 1 of this article series focuses on verification IP reuse—detailing how a solver typically interprets constraints and providing a case study focused on a constraint-driven performance optimization strategy with respect to the flexible packet parser of a hypothetical networking ASIC. Part 2 of this article series focuses on SystemVerilog and OpenVera constructs and the significant impact on validation performance that can result from the differences between E Soft Constraints and OpenVera Default constraints.

Understanding & Managing Constraints

It is important to achieve a fundamental understanding of constraint semantics in order to optimize their performance. Programmers often automatically think of constraints in a procedural context since they are very familiar with C or Verilog. Not only is this view incorrect, however, but it can also lead to critical miscalculations. At their most basic level, constraints are a set of equations that define legal solutions for random variables. This means that all constraints are considered at the same time by the solver, independent of their order in the source code. This simultaneous consideration results in unexpected behavior and side effects that can adversely impact performance. These effects can be significant and mandate a thorough understanding of the nature of constraint semantics. Only by understanding how a solver typically interprets constraints can engineers write effective and efficient code.

To better comprehend and manage constraint behavior, it is useful to employ a matrix approach as well as a diagram approach. A matrix approach will be considered first. A matrix visualizes constraints by describing the entire space as defined by random variables.^[1]. The constraints serve to remove invalid or illegal portions of the matrix. The constraint problem can be solved by elaborating the entire solution space, removing all invalid entries and then selecting one of the remaining valid answers randomly. This approach is used by the SystemC solver and several other academic and commercial tools. The two key advantages of this approach are: (1) the distribution of results is uniform as all valid results are calculated; and (2) after the initial matrix is calculated, selecting solutions is fast, which is particularly applicable in a data generator, where repeat randomization of the same object occurs.

Following is a basic example of how the matrix benefits of uniform results and fast time-to-results are realized. The solution space for a constraint problem containing n random variables can be represented by an n dimensional matrix. The matrix is difficult to visualize when the number of variables becomes large and difficult for the solver to represent in a suitable memory size. Consider the simple constraint problem shown in Figure 1. The class has one constraint and two random variables (a, b), each 2 bits in size.

Article 1 - Figure 1.jpg

Figure 1 – Constraint Matrix Example Code

The potential solution space is 2 bits x 2 bits resulting in 16 possible values. The constraint removes some illegal values from the small two-dimensional matrix as shown in Figure 2.

Article 1 - Figure 2.jpg

Figure 2 – Constraint Matrix

The constraint specifies values of (a,b) that are illegal.

Examining the constraints from left to right:

a >= 2 -> b == 3;
If a is greater than or equal to 2, then b must equal 3.
In rows 2 & 3, only column 3 is valid
(a,b) != (2,0), (2,1), (2,2), (3,0), (3,1), (3,2)

Constraints are bi-directional, x->y is equivalent to !y->!x.

Examining the constraints from right to left:

!(b==3) -> !(a>=2) or b !=3 -> a < 2
If b is unequal to 3, then a must be less than 2
In column 0,1,2, rows 1,2 are valid
(a,b) != (2,0), (3,0), (2,1), (3,1), (2,2), (3,2)

The Network Approach

An alternative to the matrix approach for visualizing the constraint problem is to describe it as a network. Each random variable in the network is a node, and each constraint is a connection as shown as a simple example in Figures 3 and 4. The constraint solver assigns values to each of the nodes (variables) and checks that each of the connections (constraints) is met. Solving the problem involves simulating the network to ensure that all connections are consistent. This is a mathematically difficult problem to solve, and the various algorithms used have speed, memory and distribution tradeoffs. Without the proper understanding and approach to solving this problem in the framework of a network, substantial performance drawbacks and validation delays will be the inevitable result.

Article 1 - Figure 3.jpg

Figure 3 – Constraint Network Example Code

Article 1 - Figure 4.jpg

Figure 4 – Constraint Network

The capacity and speed of the solver are heavily dependent on the size of the solution space (a smaller problem would of course require less memory and be faster to solve as there are fewer solutions to consider). The example in Figure 4 has two random variables, each 4 bits in size. A total of 4-bits by 4-bits by 4-bits = 12 bits, which yields 4096 possible solutions. The solution space can be reduced significantly by analyzing the valid values of each random variable and propagating these ranges throughout the network. Figure 5 illustrates the steps to optimize the solution space.

Article 1 - Figure 5.jpg

Figure 5 — Solution Space Optimization

Reducing the solution space in this manner is equivalent to removing an entire row or column from the solution matrix, or annotating the network nodes with valid ranges. a, b, c now have nine valid values each (a=4..12, b=3..11, c=7..15). The example now has 9 x 9 x 9 = 729 potential solutions, 18 percent of the original size. Initially, it appears that three random variables are selected to solve the equations; however only two degrees of freedom are present, allowing further simplification. When a and b are randomly chosen, c is calculated (c = a + b). Alternatively, b and c can be chosen and b is calculated (b = a – c). The solver uses this information, enabling one of the variables to be calculated rather than randomly chosen, thereby removing one dimension from the matrix or one node from the network. The example now has 9 x 9 = 81 valid solutions, only 2 percent of the original size. This results in a significant performance benefit.

Networking ASIC Case Study

Networking ASICs often handle parsing of complex packet or frame header information. To fully test a networking ASIC, the stimulus generator must exhaust all random permutations and combinations of networking protocol headers. Test writers need to easily impose certain header orders using constraints to achieve functional coverage goals. A simple implementation of this constrained-random testbench has an exponential complexity, however, resulting in extremely slow performance. An optimal solution to the linear complexity is provided in this network domain case study with the removal of a major obstacle to the implementation of the random verification environment of the flexible packet parser.

To validate the flexible packet parser of the SoC requires generating different networking protocol headers in random order while constraining the position of one or more headers. For example, a generator of OSI transport layer headers^[2] should be able to randomize the order of 802.1q, SAP/SNAP and other headers while always constraining the Ethernet header to the first position. If we represent network headers with an integer value, we can simplify this problem where we need to randomly fill an array with N unique integers. Each value is an integer in the range 0..N-1, and each is used exactly once. Also, one of the integer values representing the Ethernet header must be present at a fixed location in the array. This case study also applies to any constrained array shuffling problems outside the networking domain, for example, the generation of concurrent unique transactions on an AXI bus in the CPU domain.

Simple code that implements the requirements using a pseudo-random shuffle function is shown in Figure 6.

Article 1 - Figure 6.jpg

Figure 6 – Packet Header Shuffle Example

A simple SystemVerilog implementation of packet generation is shown in Figure 7. The constraints limit each pointer to the range [0…N-1]; pointers must not be duplicated and the Ethernet pointer must be placed in the first position.

Figure 7 – Packet Header Constraint Example

The code is very flexible and is easy to extend when the generator is reused, since all constraints are solved concurrently. Each header is compared to all other headers resulting in a network diagram for a packet with five headers (SIZE = 5) as shown in Figure 8.

Article 1 - Figure 8.jpg

Figure 8 – Packet Header Constraint Network

Figure 9 (Default curve) shows the Synopsys VCS constraint solver performance with respect to the array sizes (similar results were obtained from the Cadence IUS constraint solver). The problem with the implementation is poor solver performance. As all constraints are solved together, the number of constraints increases as O(n²) relative to the number of packet headers or variables. Runtime increases exponentially relative to the number of packet headers, making this approach unsuitable for problems with a large number of headers. Since the random packet generator in this case study has over 20 headers resulting in 400 constraints, this approach severely limits the scalability of the generator.

Figure 9 – Packet Header Results

“Less than” Optimization / void() Partition Optimization

The first optimization is to compare each pair of elements in the array only once. This can be done by replacing (i != j) with (i < j) in Figure 7. This simple modification removes 50 percent of the constraints, resulting in a 2X speedup as shown in Figure 9 (GT curve).

The next optimization splits the O(n²) constraints into n individual constraints, significantly reducing the complexity of the problem to O(n). For this protocol, the Ethernet header is the first entry in the array, allowing each entry in the array to be solved sequentially. SystemVerilog specifies that function calls in constraints blocks are evaluated first. Calling a function that returns its argument will partition the header array into n sequential partitions, as shown in Figure 10 with results in Figure 9 (Part curve). It is important to note that the significant limitation of this approach is that only the first packet header can be reliably constrained. If other headers are constrained, a constraint failure will occur if that header value has already been chosen for an earlier entry.

Figure 10 – Splitting into N partitions using void()

Scenario Optimization

The final implementation in this case study is a manual partitioning of the array elements and the simultaneous maintenance of the flexibility of constraints for the entire array. Each element is solved individually while considering the overall constraints and the values of all elements already chosen. This results in some additional code and complexity as shown in Figure 11. The header class wraps the integer header value and maintains a list of values already chosen during the randomization of this packet. This class can be reused for other protocols. The 802_hdr class extends the reusable header class. It adds constraints specific to the 802 protocol and forces the Ethernet header to be in position 0. The packet class instantiates an array of header objects. The header array is declared as non-random since each element is randomized by the procedural loop pre_randomize().

Figure 11 – Scenario Implementation

As one element of the array is being randomized at a time, the double-nested constraint loop is not present and the number of constraints is relatively low. The performance is linear and scales with O(n), so this implementation is suitable for large header sizes as shown in Figure 9 (SCN Curve). The tradeoff with this solution is additional code complexity and the restriction that all constraints must be presen in the header class. This results in a slightly slower speed than the partition solution, but significantly faster speed than the original implementation. Although external constraints can be used for some situations, packet.randomize with { hdr[0].value==`SAP } is not allowed with this coding style.

Closing Thoughts

All algorithms entail a tradeoff between complexity, speed and memory. Constraints, however, are particularly susceptible to performance issues as it is very easy to create a complex set of equations that are very difficult or computationally expensive to solve. This necessitates consistent monitoring of the testbench performance and the regular refactoring of code to reduce significant slowdowns by verification teams. As chips continue to become larger and more complex, the need to optimize constrained-random testing will only continue to increase. In the interest of better understanding constraint optimization strategies, Part 2 of this article series will further explore this topic by focusing on the differences between E Soft Constraints and OpenVera Default constraints.

AUTHOR BIOS

Benjamin Chen, benxchen@cisco.com

Benjamin Chen is a design and verification engineer at Cisco Systems. Over the past eight years, he has contributed significantly to the successful tape-out of multiple silicon-proven ASICs, 3M to 22M gates and in technologies ranging from 0.65 to 0.18u. He holds patents on technologies used in switching platforms and network storage systems.

Harish Krishnamoorthy, hariskri@cisco.com

Harish Krishnamoorthy is a hardware engineer at Cisco Systems. Over the past five years, Harish has specialized in embedded systems, software engineering and networking. Prior to Cisco, Harish was a systems engineer at Solidus Networks.

Srinath Atluri, atluri@cisco.com

Srinath Atluri is an engineering manager at Cisco Systems, responsible for design and verification. Mr. Atluri manages the verification of multiple ASIC and SoC projects for the next generation network storage and switching platform, Nexus 7000, one of the most strategically important product lines at Cisco.

Nimalan Siva, nimalan@cisco.com

Nimalan Siva is a design and verification lead at Cisco. He manages all aspects of the verification efforts of multi-million gate ASICs for network storage and switching platforms. Prior to Cisco, Nimalan was a software and verification engineer at several leading companies including Force 10 Networks, Xerox PARC, Hewlett Packard and Nortel.

Alexander Wakefield, alexw@synopsys.com

Alexander Wakefield is a Principal Corporate Applications Engineer at Synopsys. During the past 11 years at Synopsys, he has worked as a verification consultant on various processor and SoC projects. His primary focus is on constrained-random validation and testbench methodology.

Balamurugan Veluchamy, bmurugan@synopsys.com

Balamurugan Veluchamy is a Corporate Applications Engineer at Synopsys. Over the past six years, Bala has guided key semiconductor companies on verification with SystemVerilog using VMM, optimal constrained random verification techniques, verification planning and management, and low power verification.