feature article

Power Primer

Is That My FPGA Burning?

We’ve talked about power a lot on these pages over the past year. We’ve written about advances in power optimization and estimation, struggles with leakage current at smaller geometries, clock gating, configuration peaks, and a bunch of other hot topics in cool FPGA design. All of these late-breaking developments are wonderful if you already know the starting point. However, many of our readers have pointed out that we could use a little more background. It’s not that exciting to find out that leakage current has been reduced by 50% if you don’t know what the leakage current was to begin with, whether it was a problem, and which direction it had been heading.

So, for those people, we present our FPGA Power Primer.  This article should give you a nice base map for your exploration of the power landscape in FPGAs.  If you’ve done lots of electronic design outside the FPGA world, pay particular attention.  Power in FPGAs probably doesn’t work the way you think.  The rules, assumptions, and results are different (and often counter-intuitive) compared with, say, power consumption in processors.

Power is generally broken down into two categories.  Dynamic power is the power we use to accomplish actual work (in our case, manipulation of data).  Static power, of course, is the power your system burns just sitting there.  In the old days, we just worried about dynamic power.  The things that make up static power were so small and inconsequential that static power tended to be more of an academic exercise than a real system design consideration.

Let’s tackle dynamic power first.  Every time a transistor toggles, a tiny bucket of charge is filled and dumped.  We all learned this in EE school (minus the hokey analogy about tiny buckets). The faster you toggle, the more often the tiny charge buckets get emptied and filled each second, and the more power is consumed.  So – the number of transistors times the number of toggles per second times the size of the charge bucket equals the amount of dynamic power you’re burning.  Got it?  OK, here’s the quiz. If you have a 36-bit datapath operating at 300MHz and the stimulus data is random, how much power will your FPGA consume?  No idea?  That’s the correct answer.

You can get an answer for that, however.  All of the FPGA companies offer power estimators that will give you a “back of the envelope” estimate of your design’s power consumption.  Think of this estimator as a spreadsheet, because most of them basically are.  These tools ask you for a little information about your design, multiply that by some coefficients and constants that relate to the FPGA you’ve chosen, and give you an answer.  Typically, that answer will be within +/- 30% of the actual power consumption you’ll experience with your design.  
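To make the "spreadsheet" idea concrete, here is a minimal sketch of what such an estimator does under the hood.  It is not any vendor's tool.  The `ResourceUse` record, the function names, and every capacitance and toggle-rate figure are hypothetical placeholders; a real estimator ships with coefficients characterized for each device.  The sketch assumes the textbook relation that each transition costs about half of C times V squared, and it wraps the result in the roughly 30% band mentioned above.

```python
from dataclasses import dataclass


@dataclass
class ResourceUse:
    """One row of the 'spreadsheet' (all fields are illustrative assumptions)."""
    name: str
    count: int            # how many of this resource the design uses
    toggle_rate: float    # average transitions per clock cycle, 0.0 to 1.0
    cap_farads: float     # assumed effective switched capacitance per resource


def dynamic_power_watts(resources, vdd_volts, clock_hz):
    """Sum 0.5 * C * V^2 * f * toggle_rate over every resource in the design."""
    return sum(
        0.5 * r.cap_farads * vdd_volts ** 2 * clock_hz * r.toggle_rate * r.count
        for r in resources
    )


def estimate_with_band(resources, vdd_volts, clock_hz, static_watts, tolerance=0.30):
    """Return (low, nominal, high) total power for a +/- tolerance estimate."""
    nominal = dynamic_power_watts(resources, vdd_volts, clock_hz) + static_watts
    return nominal * (1 - tolerance), nominal, nominal * (1 + tolerance)


# Hypothetical design: placeholder numbers, not data for any real FPGA.
design = [
    ResourceUse("logic cells", count=20_000, toggle_rate=0.125, cap_farads=2e-14),
    ResourceUse("routing segments", count=60_000, toggle_rate=0.125, cap_farads=5e-14),
]
low, nominal, high = estimate_with_band(design, vdd_volts=1.0, clock_hz=300e6,
                                        static_watts=0.5)
print(f"estimated power: {nominal:.2f} W (somewhere between {low:.2f} and {high:.2f} W)")
```

Notice how much of the answer hangs on the toggle rate you type in.  That is the same number the quiz above left undefined.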

In the normal world of Moore’s Law, dynamic power gets better with each process generation.  This is what we’ve come to expect.  As the transistors get smaller, the tiny buckets of charge get tinier, and that means less power.  Typically, smaller geometries mean lower supply voltages, and because dynamic power scales with the square of the supply voltage, smaller voltage swings lead to noticeably less dynamic power consumption.  That’s simple enough, right?  Buy a newer chip, burn less power.

OK, not quite.  

Also, with each new process generation, we get more transistors on the chip.  More transistors mean more tiny buckets and more power.  In addition, we usually get more speed with each new process generation.  That means we’re filling up the buckets more times per second, and that means more power again.  (Is anybody tired of the tiny buckets yet?  OK, we promise to stop.)  These factors come into play, however, only if you actually use the additional capabilities of the newer process generation.  If you had the same design running at the same clock frequency, you’d just get the benefits of lower power.  Unfortunately, we know how you roll.  Before you know it, you’ll be adding more junk to your design, clocking it faster, and using up all those power gains with feature and performance creep.
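The tug-of-war described above can be put into rough numbers.  The sketch below assumes dynamic power is proportional to capacitance per transistor, supply voltage squared, clock frequency, and the number of switching transistors.  The function name and all of the ratios are hypothetical, chosen only to show the shape of the trade-off, not to describe any real process node.

```python
def relative_dynamic_power(cap_ratio, vdd_ratio, freq_ratio, transistor_ratio):
    """Dynamic power of the new design relative to the old one (1.0 = unchanged).

    Each argument is new/old for that quantity; power is assumed to scale as
    C * V^2 * f * N.
    """
    return cap_ratio * vdd_ratio ** 2 * freq_ratio * transistor_ratio


# Assumed shrink: 30% less capacitance per transistor, supply drops 1.2 V -> 1.0 V.
CAP_RATIO = 0.7
VDD_RATIO = 1.0 / 1.2

# Same design, same clock: you simply pocket the savings.
same_design = relative_dynamic_power(CAP_RATIO, VDD_RATIO, 1.0, 1.0)

# Feature creep: twice the logic, clocked 1.5x faster.
feature_creep = relative_dynamic_power(CAP_RATIO, VDD_RATIO, 1.5, 2.0)

print(f"same design on the new node: {same_design:.2f}x the old dynamic power")
print(f"after feature creep:         {feature_creep:.2f}x the old dynamic power")
```

With these made-up ratios, the untouched design burns roughly half the dynamic power it used to, while the "improved" one burns nearly half again as much as the original.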

The FPGA companies have to worry about all these things, however.  When they look at their old largest chip and their new largest chip, they REALLY don’t want the new one to get any hotter than the old one did.  Bad things happen.  What kind of bad things?  First, journalists write articles about how the new chip burns more power than the old chip.  We often don’t dive down into details like double the density and double the clock frequency.  We just say “Hey, the new chip burns more power than the old one,” and leave it at that.  It’s worse than that, though.  As the whole chip burns more power, more heat is generated, and you end up needing heat sinks and bigger power supplies, or maybe the chip gets so hot that you get thermal runaway and a nice puff of blue smoke.  All of these are good reasons for the FPGA company to take worst-case scenarios into account when they design a new, larger, faster chip on a new process generation.  

So we mentioned that, in the old days, we just worried about dynamic power.  However, as our transistors get smaller, they also get leakier.  A small amount of current leaks through the transistor, even when it’s off.  With each process node, the transistors get smaller, and each one leaks more current.  We’re also exponentially increasing the number of transistors, so we end up with more and leakier transistors overall.  A few process generations ago (back when we talked about microns instead of nanometers), the leakage current was a tiny percentage of overall power consumption in a working device.  Since then, dynamic power has decreased roughly linearly with each process generation, while static power has increased more or less exponentially.  You can guess what happened.  Static power got really important really fast.

This is a particularly big problem for FPGA companies.

Why?  Because in reality, most of the transistors in an FPGA are involved with the configuration logic – the routing and other overhead in making the device programmable.  In an ASIC or a microprocessor or just about any other kind of chip, most of the transistors are part of the active design, but an FPGA is crammed full of transistors that are quietly doing other things like connecting point A to point B in our design.  Typical estimates are that an FPGA employs maybe 10x the transistors for any given function compared with an ASIC.  With 10x the transistors out there leaking current, FPGA people started worrying about leakage current well before anyone else in the class.

In dynamic power, we had offsetting effects from smaller geometries – transistors got more efficient, but we had more of them running faster.  In static power, all the numbers work against us.  We have leakier transistors.  We have more transistors.  Oh, and as transistors get hotter, they leak even more.  Leakage current generates heat, and heat increases leakage current.  Thus leakage begets more leakage, which could, if designers were not careful, lead to a condition known as thermal runaway, resulting in – you guessed it – blue smoke.
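The feedback loop is easy to play with in a toy model.  The sketch below assumes junction temperature equals ambient plus a thermal resistance times total power, and that leakage power doubles for every fixed rise in temperature.  Both are simplifications, and every number here (the doubling interval, the thermal resistances, the power figures, the 150 degree limit) is a hypothetical placeholder rather than a figure from any datasheet.

```python
def settle_junction_temp(dynamic_w, leak_ref_w, theta_ja_c_per_w,
                         ambient_c=25.0, ref_c=25.0, doubling_c=25.0,
                         limit_c=150.0, max_iters=1000):
    """Iterate the heat/leakage loop until the temperature settles.

    Returns the settled junction temperature in degrees C, or None if the
    temperature climbs past limit_c (runaway) or never settles.
    """
    temp_c = ambient_c
    for _ in range(max_iters):
        # Assumed model: leakage doubles every `doubling_c` degrees above ref_c.
        leak_w = leak_ref_w * 2.0 ** ((temp_c - ref_c) / doubling_c)
        # Assumed model: temperature rise is thermal resistance times total power.
        next_c = ambient_c + theta_ja_c_per_w * (dynamic_w + leak_w)
        if next_c > limit_c:
            return None
        if abs(next_c - temp_c) < 0.01:
            return next_c
        temp_c = next_c
    return None


# Same hypothetical chip, two cooling scenarios.
with_heat_sink = settle_junction_temp(dynamic_w=2.0, leak_ref_w=0.5, theta_ja_c_per_w=5.0)
bare_package = settle_junction_temp(dynamic_w=2.0, leak_ref_w=0.5, theta_ja_c_per_w=20.0)

for label, temp in (("with heat sink", with_heat_sink), ("bare package", bare_package)):
    print(label, "->", "thermal runaway" if temp is None else f"settles near {temp:.1f} C")
```

In this toy model the well-cooled case settles a little under 40 degrees C, while the poorly cooled one never finds a stable temperature below the limit.  That second case is the blue smoke scenario.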

Luckily, some pretty smart people are designing FPGAs these days.  New techniques at the fabrication and process level are applied to reduce leakage current.  Some companies use varying oxide thicknesses for different transistors depending on the application.  Some companies have smart cells that turn themselves off (with help from the design tools) when not in use.  Some have programmable back-biasing.  Using these tricks and more, FPGA companies have managed to keep static power at bay – even dropping it from generation to generation – as we’ve arrived at the current 40/45nm process node.  

Now, let’s get back to that estimation problem.  You probably need something more accurate than the 30% or so that most “estimator” tools can give.  In order to get a more accurate power estimate, we need more information.  We need an accurate description of your design, the target frequencies, and the actual typical stimulus your design will be processing.  The stimulus part is important.  Often, we construct test benches for our design that consist mainly of pathological corner cases – testing out the boundaries of our design.  These stimuli do not make for good power estimation since what we want is something that approximates “typical” operation.  However, with a good set of stimuli, a lot of tools can give you an estimate that lands within a single-digit percentage of the real number.  Many of them can even bypass the stimulus requirement and give you pretty good results with the equivalent of randomly-generated patterns.
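To see why the stimulus matters so much, it helps to measure how often the bits on a bus actually flip.  The sketch below is a hypothetical helper, not part of any vendor flow: it takes a list of integer words and reports the average fraction of bits that change between consecutive words.  The 36-bit width echoes the datapath from the quiz earlier; the two stimulus sets are invented for illustration.

```python
import random


def toggle_rate(words, width):
    """Average fraction of bits that flip between consecutive words."""
    if len(words) < 2:
        return 0.0
    mask = (1 << width) - 1
    # XOR marks the bits that changed; count the ones in each result.
    flips = sum(bin((a ^ b) & mask).count("1") for a, b in zip(words, words[1:]))
    return flips / (width * (len(words) - 1))


WIDTH = 36
rng = random.Random(0)  # fixed seed so the run is repeatable

random_stimulus = [rng.getrandbits(WIDTH) for _ in range(10_000)]
counter_stimulus = list(range(10_000))  # e.g. an address counter on the same bus

print(f"random data:  {toggle_rate(random_stimulus, WIDTH):.3f} toggles per bit per cycle")
print(f"counter data: {toggle_rate(counter_stimulus, WIDTH):.3f} toggles per bit per cycle")
```

Random data flips about half the bits on every cycle, while a simple counter on the same 36 wires flips only about two bits per increment.  Same design, same clock, very different toggle rate, and the dynamic power follows.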

Finally, don’t forget to use the big advantage of FPGAs – you can prototype and change them quickly, right on your desktop.  It’s a fairly trivial matter to program your development board with your design, measure the power consumption (using a real meter and everything!), and test out a variety of design scenarios with power consumption measured in actual hardware.  Try that with your ASIC!
