feature article

Apples to Apples

Why Comparisons Can't Be Simple

We’ve all had a fun time complaining.

It’s not like Marketing (upper-case “M”) was without blame, either.

When the new millennium dawned, we'd had enough of "system gates" – the metric FPGA companies used to describe (and inflate) the capacity of their devices. In those days, vendors gave us a huge number – something like the number of transistors divided by three or four – as the "system gate" count of their devices. Most of the transistors on an FPGA, however, are devoted to configuration logic, routing, and other structures that don't exist in a typical ASIC design. Since the customers were engineers, the common practice quickly became "divide by 10 to get a realistic ASIC gate equivalent." The bad side effect of that practice was the equally common notion that "FPGA vendors are a bunch of liars."
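To make the arithmetic concrete, here is a minimal sketch of the gap, using the rough divide-by-three and divide-by-ten rules of thumb described above and an invented transistor count:

```python
# Illustrative arithmetic only: the divide-by-3 and divide-by-10 figures are
# the rough rules of thumb described above; the transistor count is invented.

transistors = 100_000_000                            # hypothetical FPGA transistor count

marketing_system_gates = transistors // 3            # the datasheet's "system gates"
realistic_asic_gates = marketing_system_gates // 10  # engineers' divide-by-10 rule

print(f"Datasheet claim:  {marketing_system_gates:,} system gates")
print(f"Realistic figure: {realistic_asic_gates:,} ASIC-equivalent gates")
```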

FPGA companies didn't like that reputation, but they didn't want to confuse their customers either. It didn't help anybody if an engineer read a claim of 3 million system gates for an FPGA, only to discover that their 2-million-gate ASIC design would require six of those FPGAs. In a move to regain their credibility, FPGA companies began stating their capacities in terms of the actual number of look-up tables (LUTs) in the fabric. For a (very) brief time, we were all happy. If we looked under a hypothetical microscope at an 80,000-LUT FPGA, we'd stand a good chance of counting 80,000 little LUT-like structures (if we didn't mind the eye strain, had a lot of patience, and had no real engineering work to do).

Then Marketing stepped in.  “Hey, those OTHER guys have a few more LUTs than we do.  What can we put in the data sheet?” 

With the promise of beers on Marketing's expense accounts, engineers racked their brains for a new algebra and, as engineers tend to do, came up with a clever solution: "Well, we think we have a more efficient carry chain than those other guys; that should be worth 15% or so – let's just increase our LUT count by 15% and call them 'effective LUTs'" (or a proprietary variation thereof).

This, of course, was the hype heard round the world (OK, not really – just a few of the more FPGA-obsessed nerds actually noticed at the time), and the battle of exaggerated marketing claims resumed with renewed vigor and creativity. Each company came up with a way to make its devices seem larger when compared with its rivals'.

Disputing the effective LUT count was only the beginning. Don't forget, you had to credit yourself for the amount of block RAM included on your device. After all, that's part of the capability too. And while we were on the subject of memory – why not double-count the registers in the LUTs, both as part of the "maximum memory" available on the device and as part of the LUT fabric? Then we could say we had 100,000 LUTs and also "up to" 5Mb of RAM. (Extra marketing tip – always count your memory in bits, not bytes; the number is eight times bigger.) Other non-fabric IP needed to be counted in the total as well – if your FPGA had a few hundred hard-core 18×18 multipliers (or DSP blocks), you needed to count something for those. And what about those hard-core processors? They certainly deserved a few extra "equivalent" LUTs.
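As a sketch of that double-counting (every number here is invented for illustration):

```python
# Invented numbers, purely to illustrate the double-counting described above.

luts = 100_000
block_ram_bits = 4_900_000      # dedicated block RAM on the device
lut_register_bits = luts * 1    # each LUT's register, counted again as "memory"

# The registers appear once in the LUT count and again in "maximum memory."
max_memory_bits = block_ram_bits + lut_register_bits

# Marketing tip in action: quote bits, not bytes - the number is 8x bigger.
print(f"{luts:,} LUTs and up to {max_memory_bits / 1_000_000:.1f} Mb of RAM")
print(f"(that's only {max_memory_bits / 8 / 1_000_000:.3f} MB)")
```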

Then something even more unthinkable happened. We changed the LUT.

Altera probably started it.

Long ago, university students got their PhDs doing studies on things like the optimal width for look-up tables in programmable logic. They ran numerous experiments – taking hypothetical FPGA architectures populated with LUTs of varying widths (2-input, 3-input, 4-input, 5-input, and so on) and synthesizing all kinds of designs on them, measuring which width gave the most efficient use of logic resources. The 4-input LUT won the battle and became the industry standard. It was accepted as fact that the 4-input look-up table was the optimal structure for programmable devices, and the industry infrastructure settled on that idea for the long haul.

Unfortunately, the long haul in high-tech is never more than single-digit years.

As we got to smaller process geometries, gates got faster while routing got proportionally slower. That meant the share of total delay contributed by logic began to be overshadowed by the delay contributed by routing. When we had earlier looked for the optimal width of LUTs, it was with the assumption that logic was expensive and routing was either free or very, very cheap. In this new world, however, it was routing that was expensive (in terms of delay) and logic that was cheap. Now, connecting two LUTs together to realize a combinatorial function with more than four inputs was less attractive because it involved more routing. This re-wrote the book on the optimal LUT width.
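A toy delay budget (all figures invented) shows why chaining two narrow LUTs stopped making sense once routing dominated:

```python
# Invented delay figures, purely to illustrate the logic-vs-routing trade-off.

lut4_delay_ns = 0.3   # one 4-input LUT
lut6_delay_ns = 0.4   # one wider 6-input LUT
route_hop_ns = 0.8    # one inter-LUT routing hop (now the dominant cost)

# A 6-input function built from two chained 4-input LUTs pays an extra hop.
chained_lut4_path = 2 * lut4_delay_ns + route_hop_ns
single_lut6_path = lut6_delay_ns

print(f"Two chained 4-LUTs: {chained_lut4_path:.1f} ns")
print(f"One 6-input LUT:    {single_lut6_path:.1f} ns")
```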

Altera came out with what they called an “adaptive logic module” which was really a wider (6-7 input) LUT-like structure.  Xilinx soon followed with their own announcement. (Although there is lingering debate about who was really first.  Anybody surprised?)

This clobbered our math.

Vendors didn't want to put out datasheets with smaller numbers and try to explain that these were actually smaller numbers of bigger objects. Instead, they opted to come up with an "equivalent" number (since we were already into marketing-equivalence anyway). A constant coefficient was applied to the number of 6- to 7-input LUTs to approximate the number of 4-input LUTs it would take to do about the same thing. Now the formula for density was something like: "Take the number of LUTs and multiply by some constant, add in some number for the amount of RAM, throw in a few for things like processors, DSP blocks, and I/O, and see how that number compares with our competitor's. If it's smaller, bump the coefficients by 10% and try again."
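In pseudocode, that recipe might look something like the sketch below – tongue firmly in cheek, and with every coefficient invented, since no vendor publishes theirs:

```python
# A tongue-in-cheek sketch of the "marketing density" recipe described above.
# Every coefficient here is invented; no vendor publishes theirs.

def equivalent_logic_elements(luts, ram_kb, dsps, cpus, lut_coeff=1.15,
                              ram_coeff=2.0, dsp_coeff=50, cpu_coeff=5000):
    return (luts * lut_coeff + ram_kb * ram_coeff
            + dsps * dsp_coeff + cpus * cpu_coeff)

ours = equivalent_logic_elements(luts=80_000, ram_kb=4_000, dsps=200, cpus=1)
theirs = 150_000   # the competitor's claimed figure

# If it's smaller, bump the coefficients by 10% and try again.
while ours <= theirs:
    ours *= 1.10

print(f"Our equivalent logic elements: {ours:,.0f}")
```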

Then came the benchmarks.

FPGA companies began to benchmark their devices on "customer" designs to see whether their devices showed higher or lower utilization than their competitors' with the same design. Based on those results, press releases were sometimes issued stating that one company's devices were effectively larger than another's. The battle got even uglier.

For engineers using FPGAs, however, the question is much simpler:  What FPGA is the best fit for MY design?

Usually, we build our design from a collection of pre-built IP, with a little bit of our own custom logic mixed in. It would be nice to know how many UARTs, soft-core processors, RAM blocks, barrel shifters, FFTs, or whatever we could build with a given number of a vendor's "equivalent logic elements." Then we could see what the capacity of these devices means for us – in real-world terms, for our project.

With FPGAs now available in dozens of mixtures of features – both hard-wired and programmable – it is impossible to come up with one density or capacity number that gives an accurate picture. The best thing to do is to start from the IP blocks you need to use and work your way up. Most IP datasheets list the resources required for that block, and you can total those up with a fair level of confidence to find out what device will hold your design (with some room to spare, if you're smart). Based on that, you can pick the device that is best for your project rather than the vendor with the best marketing department.
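Here is a minimal sketch of that bottom-up budgeting, with made-up resource numbers standing in for whatever your IP datasheets actually say:

```python
# A minimal bottom-up budget: total each IP block's published resource needs
# and compare against candidate devices. All numbers here are made up.

ip_blocks = {
    "uart":         {"luts": 400,    "ram_kb": 2,  "dsp": 0},
    "soft_cpu":     {"luts": 12_000, "ram_kb": 64, "dsp": 4},
    "fft_1024":     {"luts": 8_000,  "ram_kb": 48, "dsp": 32},
    "custom_logic": {"luts": 5_000,  "ram_kb": 8,  "dsp": 0},
}

need = {res: sum(blk[res] for blk in ip_blocks.values())
        for res in ("luts", "ram_kb", "dsp")}

candidates = {
    "vendor_a_mid": {"luts": 35_000, "ram_kb": 180, "dsp": 80},
    "vendor_b_mid": {"luts": 28_000, "ram_kb": 256, "dsp": 48},
}

MARGIN = 1.25   # leave ~25% headroom, if you're smart

for name, dev in candidates.items():
    fits = all(dev[res] >= need[res] * MARGIN for res in need)
    print(name, "fits with margin" if fits else "is too small")
```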

Don't expect the datasheets to change any time soon.
