Making FPGAs Cool Again – Part 1

It was a demonstration that buzzed around the (admittedly small, incestuous) industry: a digital clock powered by a grapefruit. It’s the kind of thing you might see on the Discovery Channel these days, but back in the day, Philips/Signetics created a local stir with their comparatively ultra-low-power CPLDs. At that time, they were competing with the incumbent PALs that drew about 180 mA of current (and with the half- and quarter-power versions made possible by CMOS encroaching on an erstwhile bipolar domain). As the smaller devices became commoditized, the spotlight moved to FPGAs, and data sheets stopped including ICC as a parameter with a hard limit. The official (and actually accurate) explanation was that the amount of current drawn was too design-dependent, and that a “global” maximum current would be way, way higher than anything a real design would experience. The more skeptical engineers suspected that it was just a way to avoid being accountable for power.

Whatever the reason, power became a non-issue for many years. FPGAs were going into communications and other rapidly-evolving systems for the purpose of speeding product to market. Systems designers were trading off power, cost, and even the ultimate possible performance in order to be able to ship their products as early as they could. Time to market was king; for everything else, good enough was good enough.

Fast forward to today, and things have changed. We no longer have a bubble market with machines rushing to market no matter the cost. The FPGA arena is dominated by two behemoths with two choices: continue bloodying each other or find new places to sell FPGAs. In fact, they’ve decided to do both. That means, among other things, cheaper product families – something they’ve been paying attention to for a few years now – and now, lower power. Meanwhile, the few remaining smaller guys see power as a way to differentiate and make some noise (and, they hope, inroads). So now the attention being paid to power varies, from Xilinx taking more account of it in their designs and making some tools accommodations, to Altera incorporating specific power features, to Actel dedicating entire families and significant marketing effort to low power.

The reasons for paying attention to power are many. Large new markets require low power – your grandfather’s FPGA would turn a cell phone into a pocket warmer – for a couple of milliseconds, anyway. Aggressive technologies have threatened to send power completely out of control, so it’s taken extra effort to rein that in just to stay even. And the green movement, while moderate in this realm, is at least raising awareness of the fact that energy has a cost.

So in this pair of articles, we’ll look at ways that you can reduce power consumption in FPGA designs. First we’ll look at hardware considerations, and we’ll follow that in the next installment with the ways in which the tools enable power savings.

Reducing power

The first thing to look at is the underlying silicon and technology, independent of any special features or software. How an FPGA manufacturer addresses power depends a lot on the technology they’re using. Things really heated up at the 90-nm node, so companies that aren’t there yet have been off the hook (in the old-school sense of being let off the hook, not in the new-school sense of a party that was off the hook). The company making the most noise about power, Actel, is still using 130-nm technology. They haven’t yet been forced into most of the more aggressive silicon design techniques, but in their low-power families they have already taken care to use higher-VT transistors to keep leakage down. Above and beyond that, they simply went through their circuits to find places where they could reduce current using basic blocking and tackling techniques, since it hadn’t been a high-level priority in the past. The result for their lowest-power Igloo family is normal-mode static power as low as 25 µW, dramatically lower than would be typical in an FPGA; they claim battery life 10 times what their competitors can provide.

Xilinx and Altera have long since come to grips with aggressive technology, and the 90-nm node was where they started their low-k-dielectric-vs.-triple-ox war. This was the node where everyone was saying that power would blow up and the end of the line was near. (Ever notice that there’s a general end-of-days sensibility in technology? How many times has the death of CMOS been predicted? We’re still keeping our Kool-Aid powder dry…) Altera’s publicly-touted answer was to use copper interconnect with a low-k inter-interconnect dielectric. The low-k nature (actually, apparently it’s “low-κ” – that’s “kappa”) simply means that the coupling ratio is reduced, making it harder for the metal layers to induce crosstalk on each other, reducing dynamic current. These innovations were actually creeping in before the 90-nm node, but that’s where the marketing hoopla started. Low-k dielectric and copper have become standard since then, and are also used by Xilinx.

Meanwhile, Xilinx ballyhooed their use of so-called triple-ox. This basically meant that they had three different transistors with three different gate oxide thicknesses giving three different VTs. The higher-VT transistors were slower but less leaky. Selective usage of the different transistors allowed better allocation and optimization of power. This has also become standard, and Altera uses it as well.
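
Both sides of that war map onto the usual first-order CMOS power model: switching power scales with the capacitance being charged (which is what a low-k dielectric attacks), and static power scales with leakage current (which is what higher-VT transistors attack). Here is a minimal sketch of that model in Python. The function names and every number are made up for illustration; none of them come from a vendor data sheet.

```python
def dynamic_power_w(activity, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS switching power: activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def static_power_w(vdd_v, leakage_a):
    """Static power: supply voltage times total leakage current."""
    return vdd_v * leakage_a


# Hypothetical figures, chosen only to show the trends.
baseline_dyn = dynamic_power_w(0.125, 2.0e-9, 1.0, 100e6)
low_k_dyn = dynamic_power_w(0.125, 1.5e-9, 1.0, 100e6)  # 25% less capacitance

leaky_static = static_power_w(1.0, 0.050)    # assumed low-VT leakage
high_vt_static = static_power_w(1.0, 0.010)  # assumed high-VT leakage

print(f"dynamic: {baseline_dyn:.5f} W -> {low_k_dyn:.5f} W")
print(f"static:  {leaky_static:.3f} W -> {high_vt_static:.3f} W")
```

In this model, shaving a quarter off the capacitance shaves a quarter off the switching power, all else being equal. Real devices are messier, but that is the lever each technique pulls.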

So that specific battle is pretty much over. A truce has, perhaps unknowingly and certainly begrudgingly, been called, and both companies use low-k, copper, multi-VT, and other standard power-reducing techniques. Xilinx has raised power-awareness in their overall design criteria; the Multi-Gigabit Transceivers (MGTs) have been redesigned with 80% lower power, making it easier for them to be added to more of their Virtex 5 devices. Altera has focused more explicit power-reducing architecture and design effort on the higher-end Stratix family, while relying on an inherently lower-power TSMC process to keep the power of the lower-cost Cyclone family in bounds.

Relinquishing power

When it comes to specific hardware features that allow you to make your own power tradeoffs, there is more of a range. Xilinx puts some potential functions in hard-wired blocks, with one resulting benefit potentially being lower power. Their hard Ethernet MAC is an example of this. The use of a larger DSP block in Virtex 5 means that one block can now do the work that would have required two blocks before, and they claim that this should reduce power. In addition, if you’re using the DSP48 block, the choice of pipeline register can make a difference, with the PREG choice using more power than the MREG choice.

Altera has taken things a step further on the Stratix III family by providing two power levels to their Logic Array Blocks (LABs). This is based on their observation that the vast majority of LABs in a typical design have plenty of slack and can actually be slowed down without compromising the overall performance of the circuit. Unused LABs can also be put into low power mode. Each LAB (as well as each flip-flop in a LAB) can have its own clock enable, which allows an entire LAB to have its clock shut down if no registers are used.
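
The slack-based idea can be sketched in a few lines of Python. This is not Altera's algorithm or any tool's API: the `Lab` record, the `choose_power_modes` function, and the delay penalty are all hypothetical. It also makes the simplifying assumption that each LAB can be judged on its own worst-case slack, ignoring the fact that slowing several LABs on the same path eats into the same slack.

```python
from dataclasses import dataclass


@dataclass
class Lab:
    name: str
    used: bool
    slack_ns: float  # worst timing slack of any path through this LAB


def choose_power_modes(labs, low_power_penalty_ns):
    """Pick a mode per LAB: low power if unused or if slack covers the slowdown."""
    modes = {}
    for lab in labs:
        if not lab.used or lab.slack_ns >= low_power_penalty_ns:
            modes[lab.name] = "low_power"
        else:
            modes[lab.name] = "high_speed"
    return modes


# Hypothetical design: one relaxed LAB, one timing-critical LAB, one unused LAB.
labs = [
    Lab("lab_a", used=True, slack_ns=2.0),
    Lab("lab_b", used=True, slack_ns=0.1),
    Lab("lab_c", used=False, slack_ns=0.0),
]
modes = choose_power_modes(labs, low_power_penalty_ns=0.5)
print(modes)
```

Only the LAB with too little slack to absorb the assumed 0.5 ns penalty stays in high-speed mode, which is the "vast majority can be slowed down" observation in miniature.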

Actel and Lattice have gone yet further by providing a way to power down the entire chip dynamically. Lattice’s MachXO family has a sleep mode that can be entered by asserting a SLEEPN pin. Sleep mode can reduce current by two orders of magnitude, possibly taking it below 100 µA. Power remains on the device during sleep mode and I/Os are tristated, but the contents of internal registers are lost. Actel has a similar Flash*Freeze feature on their Igloo and ProASIC3L families. By asserting the Flash*Freeze pin, you can get power down to 5 µW on some devices while power remains on and the external clocks and other inputs remain active. Outputs are tristated and core registers and SRAM states are maintained.
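
How much a sleep mode buys you depends on how much of the time the device can actually stay asleep. A rough Python sketch of that duty-cycle arithmetic follows. The 5 µW sleep figure is the Flash*Freeze number quoted above; the active power, the 1% duty cycle, and the battery capacity are assumptions picked purely for illustration, and the function names are hypothetical.

```python
def average_power_uw(active_uw, sleep_uw, active_fraction):
    """Time-weighted average power for a device that alternates active/sleep."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_fraction * active_uw + (1.0 - active_fraction) * sleep_uw


def battery_life_hours(battery_mwh, avg_power_uw):
    """Idealized battery life: capacity divided by average draw."""
    return battery_mwh * 1000.0 / avg_power_uw


# Assumed: 10 mW while active, awake 1% of the time, 1000 mWh battery.
# From the article: 5 uW in Flash*Freeze mode on some devices.
avg = average_power_uw(active_uw=10_000.0, sleep_uw=5.0, active_fraction=0.01)
life = battery_life_hours(battery_mwh=1000.0, avg_power_uw=avg)
print(f"average power: {avg:.2f} uW, battery life: {life:.0f} h")
```

With these made-up numbers the active bursts still dominate the average, so the sleep mode matters most when the duty cycle is very low. The model ignores wake-up energy and battery self-discharge.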

Reprocessing power

There’s one other design choice that can affect power consumption, although deciding the right course isn’t trivial. In fact, it’s a choice that applies to SoC design as well: how should functionality be divided between hardware and software? FPGAs are seen mostly as hardware devices and are often used to accelerate functions that go too slowly in a processor. But as FPGA sizes have increased, implementing one or more processors has become more viable.

Processors in FPGAs come in two flavors, or perhaps better to say two textures: hard and soft. The only remaining one that’s cast into hard transistors is the PowerPC provided on some Virtex devices from Xilinx. Everyone, Xilinx included, has at least one soft processor core. Actel has their CoreMP7 and Cortex-M1, which are soft versions of ARM processors; Altera has Nios, which was the first of the soft cores; Lattice has their open-source Mico32; and Xilinx has MicroBlaze. Of these, all but the ARM cores use architectures specific to their FPGA vendor.

So now the question can be asked: for a function inside the FPGA, assuming you can get the performance you need in software or hardware – not a trivial assumption since most soft cores aren’t particularly zippy – do you get lower power by building a processor and executing software or by building straight hardware? There’s no easy answer to this one. It’s extremely dependent on the function and performance. Hardware acceleration is often achieved through parallelization, which can create a lot of hardware. In an SoC, more hardware definitely means more transistors leaking, but in an FPGA the transistors are there whether you use them or not. So unless you can power down unused cells, the issue really becomes the number of switching signals, and parallel circuits would tend to have more of those than a processor would. Sometimes. Even often. Depends on the size of the hardware circuit. Yeah, it’s like that. If you really want to compare the two, you’ll need to implement both versions and pick the lower-power one. Needless to say, most hardware designers simply take the obvious choice: design hardware. No need to have some pesky software engineers coming and meddling in affairs where they don’t belong…
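
To make the "number of switching signals" point concrete, here is a back-of-the-envelope Python sketch that sums switching power net by net for two implementations of the same function. Everything in it is hypothetical: the net counts, capacitances, and transition rates are invented, and a real comparison would come from a power estimator run on both implemented designs.

```python
def switching_power_w(nets, vdd_v):
    """Sum C * Vdd^2 * rate over (capacitance_f, charge_events_per_s) pairs."""
    return sum(cap_f * vdd_v ** 2 * rate for cap_f, rate in nets)


# Hypothetical parallel hardware: many nets, each switching moderately often.
parallel_hw = [(1e-12, 50e6)] * 4000

# Hypothetical soft processor: fewer nets, each switching more often.
soft_cpu = [(1e-12, 100e6)] * 1500

hw_power = switching_power_w(parallel_hw, vdd_v=1.0)
sw_power = switching_power_w(soft_cpu, vdd_v=1.0)
print(f"hardware: {hw_power:.3f} W, soft processor: {sw_power:.3f} W")
```

With these particular inputs the processor comes out ahead, but double its clock or halve the width of the hardware and the answer flips, which is exactly why there's no easy answer.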

Now that we’ve looked at the state of FPGA power-reducing hardware, the real issue becomes how to take advantage of that hardware in your designs. This is the role of tools: no silicon feature has value if the tools can’t take advantage of it. We’ll address this topic in the second part of this article, to be delivered in a couple weeks.
