
Making FPGAs Cool Again – Part 1

It was a demonstration that buzzed around the (admittedly small and incestuous) industry: a digital clock powered by a grapefruit. It’s the kind of thing you might see on the Discovery Channel these days, but back in the day, Philips/Signetics created a local stir with their comparatively ultra-low-power CPLDs. At the time, those CPLDs were competing with incumbent PALs that drew about 180 mA of current (and with the half- and quarter-power versions made possible by CMOS encroaching on an erstwhile bipolar domain). As the smaller devices became commoditized, the spotlight moved to FPGAs, and data sheets stopped including ICC as a parameter with a hard limit. The official (and actually accurate) explanation was that the current drawn was too design-dependent, and that a “global” maximum would be far higher than anything a real design would experience. More skeptical engineers suspected it was simply a way to avoid being accountable for power.

Whatever the reason, power became a non-issue for many years. FPGAs were going into communications and other rapidly-evolving systems for the purpose of speeding product to market. System designers were trading off power, cost, and even ultimate performance in order to ship their products as early as they could. Time to market was king; for everything else, good enough was good enough.

Fast forward to today, and things have changed. We no longer have a bubble market with systems rushing out the door no matter the cost. The FPGA arena is dominated by two behemoths with two choices: keep bloodying each other or find new places to sell FPGAs. In fact, they’ve decided to do both. That means, among other things, cheaper product families – something they’ve been paying attention to for a few years now – and, more recently, lower power. Meanwhile, the few remaining smaller players see power as a way to differentiate and make some noise (and, they hope, inroads). So the attention being paid to power now varies: Xilinx takes more account of it in their designs and has made some tool accommodations, Altera incorporates specific power features, and Actel dedicates entire families and significant marketing effort to low power.

The reasons for paying attention to power are many. Large new markets require low power – your grandfather’s FPGA would turn a cell phone into a pocket warmer, for a couple of milliseconds anyway. Aggressive process technologies have threatened to send power completely out of control, so it has taken extra effort to rein that in just to stay even. And the green movement, while its influence here is moderate, is at least raising awareness that energy has a cost.

So in this pair of articles, we’ll look at ways that you can reduce power consumption in FPGA designs. First we’ll look at hardware considerations, and we’ll follow that in the next installment with the ways in which the tools enable power savings.

Reducing power

The first thing to look at is the underlying silicon and technology, independent of any special features or software. How an FPGA manufacturer addresses power depends a lot on the technology they’re using. Things really heated up at the 90-nm node, so companies that aren’t there yet have been off the hook (in the old-school sense of being let off the hook, not in the new-school sense of a party that was off the hook). The company making the most noise about power, Actel, is still using 130-nm technology. They haven’t yet been forced into most of the more aggressive silicon design techniques, but in their low-power families they have already taken care to use higher-VT transistors to keep leakage down. Above and beyond that, they simply went through their circuits to find places where they could reduce current using basic blocking and tackling techniques, since it hadn’t been a high-level priority in the past. The result for their lowest-power Igloo family is normal-mode static power as low as 25 µW, dramatically lower than would be typical in an FPGA; they claim battery life 10 times what their competitors can provide.

Xilinx and Altera have long since come to grips with aggressive technology, and the 90-nm node was where they started their low-k-dielectric-vs.-triple-ox war. This was the node where everyone was saying that power would blow up and the end of the line was near. (Ever notice that there’s a general end-of-days sensibility in technology? How many times has the death of CMOS been predicted? We’re still keeping our Kool-Aid powder dry…) Altera’s publicly-touted answer was to use copper interconnect with a low-k dielectric between the metal lines. The low-k nature (actually, apparently it’s “low-κ” – that’s “kappa”, the relative permittivity) simply means that the capacitance between neighboring metal lines is reduced, making it harder for them to induce crosstalk on each other and cutting dynamic current. These innovations were actually creeping in before the 90-nm node, but that’s where the marketing hoopla started. Low-k dielectric and copper have become standard since then and are also used by Xilinx.
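
For those who like to see the physics spelled out, here’s a rough sketch of why the dielectric matters, treating adjacent wires as simple parallel-plate capacitors – a far cruder model than any real extraction tool would use:

$$P_{\mathrm{dyn}} \approx \alpha \, C \, V_{DD}^{2} \, f, \qquad C \propto \frac{\kappa \, \varepsilon_{0} \, A}{d}$$

Lower the κ of the material between the wires and the capacitance that every switching node has to charge and discharge drops with it, taking crosstalk and dynamic power along for the ride.
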
Meanwhile, Xilinx ballyhooed their use of so-called triple-ox. This basically meant that they had three transistor types with three different gate-oxide thicknesses, giving three different VTs. The higher-VT transistors were slower but less leaky, so selective use of the different transistors allowed better allocation and optimization of power. This has also become standard, and Altera uses it as well.

So that specific battle is pretty much over. A truce has, perhaps unknowingly and certainly begrudgingly, been called, and both companies use low-k, copper, multi-VT, and other standard power-reducing techniques. Xilinx has raised power-awareness in their overall design criteria; the Multi-Gigabit Transceivers (MGTs) have been redesigned with 80% lower power, making it easier for them to be added to more of their Virtex 5 devices. Altera has focused more explicit power-reducing architecture and design effort on the higher-end Stratix family, while relying on an inherently lower-power TSMC process to keep the power of the lower-cost Cyclone family in bounds.

Relinquishing power

When it comes to specific hardware features that allow you to make your own power tradeoffs, there is more of a range. Xilinx implements some functions in hard-wired blocks, with lower power being one potential benefit; their hard Ethernet MAC is an example. The larger DSP block in Virtex 5 means that one block can now do work that would have required two blocks before, and they claim that this should reduce power. In addition, if you’re using the DSP48 block, the choice of pipeline register can make a difference, with the PREG option using more power than the MREG option.

Altera has taken things a step further in the Stratix III family by providing two power levels for their Logic Array Blocks (LABs). This is based on their observation that the vast majority of LABs in a typical design have plenty of slack and can actually be slowed down without compromising the overall performance of the circuit. Unused LABs can also be put into low-power mode. Each LAB (as well as each flip-flop within a LAB) can have its own clock enable, which allows an entire LAB to have its clock shut down if none of its registers are in use.

Actel and Lattice have gone yet further by providing a way to power down the entire chip dynamically. Lattice’s MachXO family has a sleep mode that is entered by asserting a SLEEPN pin. Sleep mode can reduce current by two orders of magnitude, possibly taking it below 100 µA. Power remains applied to the device during sleep mode and the I/Os are tristated, but the contents of internal registers are lost. Actel has a similar Flash*Freeze feature on their Igloo and ProASIC3L families. By asserting the Flash*Freeze pin, you can get power down to 5 µW on some devices while power remains on and the external clocks and other inputs remain active; outputs are tristated, and core register and SRAM states are maintained.
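
To get a feel for what numbers like these buy you, here’s a quick back-of-the-envelope standby-battery-life calculation. The battery capacity and voltage below are hypothetical round numbers (not from any vendor’s data sheet); the standby figures echo the ballpark values just mentioned.

# Rough standby-battery-life estimate for a mostly-idle design.
# Battery figures are hypothetical; the standby numbers echo the ballpark
# values discussed above (about 100 uA in sleep, about 5 uW in Flash*Freeze).

BATTERY_MAH = 220.0   # hypothetical coin-cell capacity, in mAh
BATTERY_V = 3.0       # hypothetical battery voltage

def standby_hours(standby_current_a):
    """Hours of standby, ignoring self-discharge and active-mode bursts."""
    return (BATTERY_MAH / 1000.0) / standby_current_a

# ~100 uA sleep current (the order of magnitude cited for MachXO sleep mode)
print(standby_hours(100e-6) / 24)            # roughly 90 days

# ~5 uW Flash*Freeze power, converted to current at the hypothetical 3 V
print(standby_hours(5e-6 / BATTERY_V) / 24)  # roughly 5,500 days

Crude as it is, the arithmetic makes the point: in a system that spends most of its life asleep, the standby figure, not the active figure, sets the battery budget.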

Reprocessing power

There’s one other design choice that can affect power consumption, although deciding the right course isn’t trivial. In fact, it’s a choice that applies to SoC design as well: how should functionality be divided between hardware and software? FPGAs are seen mostly as hardware devices and are often used to accelerate functions that go too slowly in a processor. But as FPGA sizes have increased, implementing one or more processors has become more viable.

Processors in FPGAs come in two flavors, or perhaps better to say two textures: hard and soft. The only remaining one that’s cast into hard transistors is the PowerPC provided on some Virtex devices from Xilinx. Everyone, Xilinx included, has at least one soft processor core. Actel has CoreMP7 and Cortex-M1, which are soft versions of ARM processors; Altera has Nios, which was the first of the soft cores; Lattice has their open-source Mico32; and Xilinx has MicroBlaze. Of these, all but the ARM cores are proprietary to their respective vendors.

So now the question can be asked: for a function inside the FPGA, assuming you can get the performance you need in software or hardware – not a trivial assumption since most soft cores aren’t particularly zippy – do you get lower power by building a processor and executing software or by building straight hardware? There’s no easy answer to this one. It’s extremely dependent on the function and performance. Hardware acceleration is often achieved through parallelization, which can create a lot of hardware. In an SoC, more hardware definitely means more transistors leaking, but in an FPGA the transistors are there whether you use them or not. So unless you can power down unused cells, the issue really becomes the number of switching signals, and parallel circuits would tend to have more of those than a processor would. Sometimes. Even often. Depends on the size of the hardware circuit. Yeah, it’s like that. If you really want to compare the two, you’ll need to implement both versions and pick the lower-power one. Needless to say, most hardware designers simply take the obvious choice: design hardware. No need to have some pesky software engineers coming and meddling in affairs where they don’t belong…
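
If you want a feel for why the answer swings back and forth, here’s a toy model that does nothing more than count switching nodes using the standard α·C·V²·f dynamic-power relation. Every number in it is made up purely for illustration; real answers come from a power estimator fed with real toggle rates.

# Toy dynamic-power comparison: parallel hardware vs. a soft processor.
# All values are hypothetical; only the structure of the tradeoff matters.

V_DD = 1.2      # hypothetical core supply voltage, volts
C_NODE = 5e-15  # hypothetical average capacitance per switching node, farads

def dynamic_power_w(nodes, toggle_rate, freq_hz):
    """Sum of alpha * C * V^2 * f over 'nodes' switching nodes."""
    return nodes * toggle_rate * C_NODE * V_DD ** 2 * freq_hz

# Parallel hardware: many nodes, modest clock, finishes the job quickly.
hw = dynamic_power_w(nodes=20_000, toggle_rate=0.15, freq_hz=100e6)

# Soft processor: far fewer nodes toggling, but it has to run flat out
# for many more cycles to reach the same throughput.
sw = dynamic_power_w(nodes=4_000, toggle_rate=0.25, freq_hz=200e6)

print(f"hardware: {hw * 1e3:.2f} mW, soft CPU: {sw * 1e3:.2f} mW")

Flip the node counts or the clock rates and the conclusion reverses, which is exactly why the only honest comparison is to implement both versions and measure.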

Now that we’ve looked at the state of FPGA power-reducing hardware, the real issue becomes how to take advantage of that hardware in your designs. This is the role of tools: no silicon feature has value if the tools can’t take advantage of it. We’ll address this topic in the second part of this article, to be delivered in a couple weeks.
