feature article

The FPGA is Half Full

Unwinding the Marketing Spin

Let’s say you are looking for a new house for your family. You’ve got a couple of contenders. One has four bedrooms, three baths, a two-car garage, and 3,000 square feet of living area. The other has three bedrooms, three baths, a three-car garage, and 3,200 square feet of living area.

Lining the two data sheets up, the houses are comparable. One shows a bit more living area, the other has an additional bedroom (which you would just use for a guest room anyway), and the additional garage isn’t much of a factor, since your family owns only two cars.

Weighing the two choices based on the data sheet makes sense – until you start reading the fine print. House #1, it turns out, doesn’t actually have 3,000 square feet. To get that number, they included a section of the yard that is covered by a roof, and the square footage number is “effective square feet.” Another footnote says that they have estimated the effective square feet based on a “livability factor,” since they deem the living space to be extra-efficient.

Reading further on house #2, there is a footnote saying that the heating system will support only two of the bedrooms being occupied at any one time. And – one of the bathrooms actually contains a bed, so it is counted as both a bedroom and a bathroom in the info sheet.

Welcome to the wonderful world of FPGA product tables.

When you shop for an FPGA for your project, you’ll see that the FPGA companies generously provide product selectors that tell you what resources are available on their chips. The problem is the details that are hidden in the fine print – and the ones that are not in print at all. Let’s start with the capacity of the FPGA itself. One family boasts “up to 480,000 logic cells.” OK, cool. Drill down to the fine print and the answer changes to “up to 478,000 logic cells.” Drilling down yet another level, we are told the number of logic cells is actually 477,760. Well, that’s just rounding up, right? And it’s less than a 1% difference, so why be picky?

But those 478,000 cells absolutely do not exist. Looking over one column, we see that the device physically contains 74,650 “slices.” Dropping to the footnotes, we see that a slice is made up of four LUTs and eight flip-flops. Multiplying 74,650 slices by four LUTs per slice, we get 298,600 actual LUTs. Whoa! OK, that’s not just rounding. How do we turn 298,600 LUTs into 480,000? Well, back in the (very) old days, FPGAs used four-input LUTs. Newer ones use something like six-input LUTs. So – if we (generously) scale the number of six-input LUTs to an equivalent number of legacy four-input ones, we’d still get only about 450,000 – and that’s assuming that we get perfect utilization of the extra inputs. The plot thickens…
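For readers who want to check the arithmetic, here is a minimal Python sketch of the numbers above. The slice count, LUTs per slice, and logic cell count come straight from the text. The 6/4 scaling in `four_input_equivalent` is only our guess at how one might "generously" convert six-input LUTs to four-input equivalents, and all function names are made up for illustration.

```python
# Figures quoted in the article for one device.
SLICES = 74_650
LUTS_PER_SLICE = 4
MARKETED_LOGIC_CELLS = 477_760


def physical_luts(slices: int, luts_per_slice: int) -> int:
    """LUTs that physically exist on the die."""
    return slices * luts_per_slice


def cells_per_lut(logic_cells: int, luts: int) -> float:
    """Implied multiplier from physical LUTs to marketed logic cells."""
    return logic_cells / luts


def four_input_equivalent(luts: int, new_inputs: int = 6, old_inputs: int = 4) -> int:
    """Naive scaling by input count (an assumption, not a vendor formula)."""
    return luts * new_inputs // old_inputs


luts = physical_luts(SLICES, LUTS_PER_SLICE)
print(f"physical LUTs:           {luts:,}")
print(f"logic cells per LUT:     {cells_per_lut(MARKETED_LOGIC_CELLS, luts):.2f}")
print(f"4-input equivalent LUTs: {four_input_equivalent(luts):,}")
```

Under these assumptions the input-count scaling yields 447,900, which is where the "about 450,000" figure lands, while the marketed number implies a multiplier of 1.6 logic cells per physical LUT.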

Now let’s say you want to try to use those LUTs. This may come as a shock, but you can never use 100% of the LUTs on your FPGA. Typically, the routing resources won’t support completely routing anywhere near that number. If you’re clocking them very fast, you’ll also bump into power limitations. In fact, many designers tell us that they don’t get more than 60%-70% utilization in practice. So if we took the favorable 70% number, we’re looking at around 210K actual usable physical LUTs – on a device marketed as 480K.
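The same back-of-the-envelope estimate can be written as a short Python sketch. The 298,600 physical LUTs and the 60% to 70% range are the figures quoted above. The helper name `usable_luts` is hypothetical, and real utilization depends entirely on your design and tools.

```python
PHYSICAL_LUTS = 298_600   # 74,650 slices x 4 LUTs, from the article
MARKETED_CELLS = 480_000  # headline "logic cell" number


def usable_luts(physical: int, utilization_pct: int) -> int:
    """LUTs you can expect to use at a given utilization percentage."""
    return physical * utilization_pct // 100


for pct in (60, 70):
    n = usable_luts(PHYSICAL_LUTS, pct)
    share = 100 * n / MARKETED_CELLS
    print(f"{pct}% utilization: {n:,} LUTs ({share:.0f}% of the marketed figure)")
```

Even at the favorable end of the range, the usable count is well under half of the headline number.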

That’s 210K unless, of course, you want to use some for “distributed RAM.” You see, when they’re trying to pump up the memory stats, they allow that you might want to use some of the LUT fabric to make memory instead of LUTs. You can have the LUTs or the RAM, but not both at the same time.

Life is more than LUTs and RAM, though. Today’s FPGAs have a wealth of other resources included. Take DSP blocks, for example. You’ll see some pretty impressive GMAC numbers given for FPGAs used in digital signal processing. Unfortunately, most of those numbers are idealized figures that you’d never see in real life. For example, if an FPGA boasts 1000 DSP blocks (where each DSP block contains one or more hard-wired multipliers and some accumulator/arithmetic and carry circuitry) they typically calculate the published GMAC number by multiplying the number of multipliers by the maximum operating frequency of those multipliers. If you manage to craft a real, useful design that comes even close to that situation, a lot of people would love to talk to you about engineering employment opportunities.
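To make the GMAC calculation concrete, here is a hedged Python sketch. Only the "multipliers times maximum frequency" formula and the 1000-block example come from the text. The 500 MHz clock, one multiplier per block, and the two derating fractions are purely hypothetical values chosen for illustration, as are the function names.

```python
def peak_gmacs(multipliers: int, fmax_mhz: float) -> float:
    """Datasheet-style peak: every multiplier busy on every clock at fmax."""
    return multipliers * fmax_mhz / 1000.0


def sustained_gmacs(multipliers: int, fmax_mhz: float,
                    blocks_used: float, clock_achieved: float) -> float:
    """Peak derated by the fraction of blocks used and of fmax reached."""
    return peak_gmacs(multipliers, fmax_mhz) * blocks_used * clock_achieved


# Hypothetical device: 1000 DSP blocks, one multiplier each, 500 MHz fmax.
peak = peak_gmacs(1000, 500.0)

# Hypothetical design: uses 70% of the blocks and closes timing at 60% of fmax.
real = sustained_gmacs(1000, 500.0, blocks_used=0.7, clock_achieved=0.6)

print(f"peak: {peak:.0f} GMACs, sustained: {real:.0f} GMACs")
```

The point of the sketch is only that the published figure is an upper bound: any unused block or missed clock target multiplies straight through to the throughput you actually get.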

How about IO, though? The vendors are always bragging about their huge SerDes bandwidth. You’ll see large numbers of transceivers capable of blistering-fast speeds (up to 28Gbps each on the current 28nm generation of devices). The thing is, with all that data coming into the chip, you need to be able to do something useful with it. That means you need lots of fast internal resources like LUTs, memory, and DSP blocks. In many of today’s devices, the SerDes bandwidth exceeds what the rest of the FPGA is capable of handling, for anything but the most well-behaved, straightforward designs. All those transceivers look good on paper, but if you can’t use them all, they’re just taking up expensive silicon area, increasing your cost, and leaking power.

Chatting with a number of designers, we hear that under-utilizing FPGAs is pretty much an industry norm. If you’ve been using FPGAs for a while, you tend to mostly ignore the datasheet numbers and plan your design based on experience and preliminary output from the tools. If the tools say you can route your design, take advantage of the resources you need, and hit your power budget, then you can feel pretty comfortable with your selection of devices.

But, why have all those resources there in the first place if you can’t use them?

Well, first there are bragging rights and the reality of competition between the vendors. If one vendor has a million-cell FPGA, the other one needs to have 1.1 million. Specsmanship is an important part of marketing. Also, because of the wide variation in designs, each design may leave a different set of resources on the table. One design may max out the DSP blocks but not need all the LUT fabric. Another may be limited by the amount of RAM. Many are at the mercy of total IO pins or bandwidth. FPGA companies spend an amazing amount of engineering just trying to find the right balance of resources that will best serve the widest possible audience.

One area that has long been an architectural Achilles’ heel, however, is that of routing resources. Putting more routing on the chip is expensive. If you design an FPGA with so much routing resource that you can always route 100%, you’ve wasted a tremendous amount of space. Balancing the available routing with the other resources requires exhaustive trial-and-error with a large number and variety of designs. FPGA companies typically iterate with their proposed architecture through a huge test suite, adjusting the balance of resources each time until they hit a point where they get acceptable utilization on a diverse set of realistic designs. Xilinx has announced that their upcoming family includes a major rework of routing resources – aimed at letting us hit much higher utilization numbers than with previous families.

Certainly, the language and norms on FPGA specifications have become distorted over the years. Simply having the capacity defined in terms of an anachronistic architecture as a pseudo industry standard is confusing enough. Add to that the reality that almost no design will be able to come close to a perfect, balanced utilization of the resources on any given FPGA, and the picture gets murkier still. There seems to be hope, however, in the direction the FPGA companies are taking, both with their design and with their marketing messages. It would be wonderful to be out of the era of marketing-driven specsmanship and into a new age of useful metrics for choosing the best part for our design work.

Until that day, the only strategy is to use the tools to get realistic fit estimates. Your designs, with your constraints, are the best (and only) sure-fire models that will tell you whether you can succeed with a particular device.

8 thoughts on “The FPGA is Half Full”

1. gabor@alacron.com says:

The use of bloated marketing numbers to define the FPGA size is nothing new. And in fact the “logic cell” number is a better yardstick for most designs than the old “system gates” number. At least I can find a multiplier that gets me from logic cells to LUTs. Still you’re right about needing to run the design through the tools to get the final picture. Often it’s gotchas like clock routing (yeah you get 32 global clock buffers but only 16 can reach any section of a chip) or other shared routing resources (Oh, you wanted to attach two adjacent clock pins to two PLLs, or have two adjacent I/Os running DDR on different clocks?). It’s been a long time since I was able to rely on a data sheet to tell me everything I needed to know about programmable logic.

All material on this site copyright © 2003 - 2023 techfocus media, inc. All rights reserved.