feature article

Apples to Apples

Why Comparisons Can't Be Simple

We’ve all had a fun time complaining.

It’s not like Marketing (upper-case “M”) was without blame, either.

When the new millennium dawned, we’d had enough of “System Gates” – the metric that FPGA companies used to describe (and inflate) the capacity of their devices.  In those days, vendors quoted a huge number – something like the transistor count divided by three or four – as the “system gate” count of their devices.  Most of the transistors on an FPGA, however, are devoted to configuration logic, routing, and other structures that don’t exist in a typical ASIC design.  Since the customers were engineers, the common practice quickly became “divide by 10 to get a realistic ASIC gate equivalent.”  The unfortunate side-effect of that practice was the equally common notion that “FPGA vendors are a bunch of liars.”
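The back-of-the-envelope arithmetic of the era can be sketched in a few lines.  This is purely illustrative – the transistor count and divisors below are invented for the example, not taken from any real datasheet:

```python
def claimed_system_gates(transistors: int, divisor: int = 4) -> int:
    """The vendor's number: transistors divided by three or four."""
    return transistors // divisor

def realistic_asic_gates(system_gates: int) -> int:
    """The engineers' correction: divide the claim by 10."""
    return system_gates // 10

claim = claimed_system_gates(12_000_000)   # 3,000,000 "system gates"
usable = realistic_asic_gates(claim)       # ~300,000 ASIC-equivalent gates
```

A part advertised at 3 million "system gates" thus pencils out to a few hundred thousand usable ASIC-equivalent gates – which is exactly why a 2-million-gate ASIC design wouldn't come close to fitting in one.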

FPGA companies didn’t like that reputation, and yet they didn’t want to confuse their customers either.  It didn’t help anybody if an engineer read a claim of 3 million system gates for an FPGA, only to discover that their 2-million-gate ASIC design would require six of those FPGAs.  In a move to regain their credibility, FPGA companies began stating their capacities in terms of the actual number of look-up tables (LUTs) in the fabric.  For a (very) brief time, we were all happy.  If we put a hypothetical 80,000-LUT FPGA under a microscope, we’d stand a good chance of counting 80,000 little LUT-like structures (if we didn’t mind the eye strain, and we had a lot of patience and no real engineering work to do).

Then Marketing stepped in.  “Hey, those OTHER guys have a few more LUTs than we do.  What can we put in the data sheet?” 

With the promise of beers on Marketing’s expense accounts, engineers racked their brains for a new algebra and, as engineers tend to do, came up with a clever solution: “Well, we think we have a more efficient carry chain than those other guys; that should be worth 15% or so – let’s just increase our LUT count by 15% and call them ‘effective LUTs’ (or a proprietary variation thereof).”

This, of course, was the hype heard round the world (OK, not really.  Just a few of the more FPGA-obsessed nerds actually noticed at the time), and the battle of the exaggerated marketing claims resumed with renewed vigor and creativity.  Each company came up with a way to make their devices seem larger when compared to their rivals. 

Disputing the effective LUT count was only the beginning.  Don’t forget, you had to credit yourself for the amount of block RAM included on your device.  After all, that’s part of the capability too.  And while we were on the subject of memory – why not double-count the registers on the LUTs, both as part of the “maximum memory” available on the device and as part of the LUT fabric?  Then we could say we had 100,000 LUTs and also “up to” 5Mb of RAM.  (Extra marketing tip: always count your memory in bits, not bytes – the number is bigger.)  Other non-fabric IP needed to be counted in the total as well.  If your FPGA had a few hundred hard-core 18×18 multipliers (or DSP blocks), you needed to count something for those, and what about those hard-core processors?  They certainly deserved a few extra “equivalent” LUTs.

Then the truly unthinkable happened.  We changed the LUT.

Altera probably started it.

Long ago, university students got their PhDs doing studies on things like the optimal width for look-up tables in programmable logic designs.  They ran numerous experiments – taking hypothetical FPGA architectures populated with LUTs of varying widths (2-input, 3-input, 4-input, 5-input, and so on) and synthesizing all kinds of designs on them, measuring which gave the most efficient use of logic resources.  The 4-input LUT won the battle and became the industry standard.  It was accepted as fact that the 4-input look-up table was the optimal structure for programmable devices, and the industry infrastructure settled in on that idea for the long haul.

Unfortunately, the long haul in high-tech is never more than single-digit years.

As we moved to smaller process geometries, gates got faster while routing got proportionally slower.  The share of total delay contributed by logic began to be overshadowed by the delay contributed by routing.  When we had earlier searched for the optimal LUT width, it was with the assumption that logic was expensive and routing was either free or very, very cheap.  In this new world, however, routing was expensive (in terms of delay) and logic was cheap.  Connecting two LUTs together to realize a combinational function with more than four inputs was now less attractive because it involved more routing.  This re-wrote the book on the optimal LUT width.

Altera came out with what it called an “adaptive logic module,” which was really a wider (6- to 7-input) LUT-like structure.  Xilinx soon followed with its own announcement.  (Although there is lingering debate about who was really first.  Anybody surprised?)

This clobbered our math.

Vendors didn’t want to put out datasheets with smaller numbers and then have to explain that these were actually smaller numbers of bigger objects.  Instead, they opted to publish an “equivalent” number (since we were already deep into marketing-equivalence anyway).  A constant coefficient was applied to the number of 6- to 7-input LUTs to approximate the number of 4-input LUTs it would take to do about the same thing.  The formula for density became something like: “Take the number of LUTs and multiply by some constant, add in some number for the amount of RAM, throw in a few for things like processors, DSP blocks, and I/O, and see how that number compares with our competitor’s.  If it’s smaller, bump the coefficients by 10% and try again.”
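That recipe is concrete enough to parody in code.  A tongue-in-cheek sketch – every coefficient and count below is invented, and no real vendor’s arithmetic is being quoted:

```python
def equivalent_les(luts: int, lut_coeff: float = 1.6,
                   ram_kbits: int = 0, ram_coeff: float = 0.5,
                   dsps: int = 0, dsp_coeff: float = 30.0) -> float:
    """'Equivalent logic elements': LUTs times a fudge factor,
    plus credit for RAM, DSP blocks, and anything else on the die."""
    return luts * lut_coeff + ram_kbits * ram_coeff + dsps * dsp_coeff

ours = equivalent_les(luts=80_000, ram_kbits=5_000, dsps=200)  # 136,500
theirs = 150_000        # the competitor's datasheet number

while ours <= theirs:   # "if it's smaller, bump the
    ours *= 1.1         #  coefficients by 10% and try again"
```

The loop, of course, always terminates with our number bigger than theirs – which is the whole point.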

Then came the benchmarks.

FPGA companies began to benchmark their devices on “customer” designs in order to see whether their devices were a higher or lower percentage utilization than their competitors with the same design.  Based on those results, press releases were sometimes issued stating that one company’s devices were larger than another company’s.  The battle got even uglier.

For engineers using FPGAs, however, the question is much simpler:  What FPGA is the best fit for MY design?

Usually, we build our design from a collection of pre-built IP, with a little bit of our own custom logic mixed in.  It would be nice to know how many UARTs, soft-core processors, RAM blocks, barrel shifters, FFTs, or whatever we could build with a given number of a vendor’s “equivalent logic elements.”  Then, we could see what the capacity of these devices means for us – in real-world terms, for our project.

With FPGAs now available in dozens of mixtures of features – both hard-wired and programmable – it is impossible to come up with one density or capacity number that gives an accurate picture.  The best thing to do is to start from the IP blocks you need to use and work your way up.  Most IP datasheets list the resources required for that block, and you can total those up with a fair level of confidence to find out what device will hold your design (with some room to spare, if you’re smart).  Based on that, you can pick the device that is best for your project rather than the vendor with the best marketing department.
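That bottom-up tally is simple enough to automate.  Here’s a hypothetical sketch – every IP block, resource number, and candidate device below is invented for illustration; the real figures come from your IP and device datasheets:

```python
# Per-block needs from the IP datasheets: (LUTs, RAM kbits, DSP blocks)
design = {
    "uart":        (500,    2,  0),
    "soft_cpu":    (8_000, 64,  4),
    "fft_256":     (6_000, 32, 16),
    "custom_glue": (3_000,  0,  0),
}

MARGIN = 1.25  # leave ~25% headroom ("some room to spare")

# Total each resource column across all blocks, with margin applied
need = [sum(block[i] for block in design.values()) * MARGIN
        for i in range(3)]

# Candidate devices: (LUTs, RAM kbits, DSP blocks) -- also hypothetical
devices = {
    "small":  (15_000, 100, 20),
    "medium": (40_000, 300, 60),
}

# Keep only devices that satisfy every resource requirement
fits = [name for name, have in devices.items()
        if all(h >= n for h, n in zip(have, need))]
```

The comparison is done resource by resource rather than against any single “equivalent” density number – which is exactly why the marketing coefficient never enters into it.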

Don’t expect the datasheets to change any time soon.
