The Outer Ring of FPGA Architecture

Last week we examined the legacy of the LUT – the basic building block that defines the very fabric of FPGAs.  Surprisingly, however, the primary driver of attributes such as cost, power consumption, and utility in FPGAs is not the fabric itself, but the choice of I/O for the device.  You see, while the internal logic keeps shrinking, some of the I/O structures don’t scale well – things like bonding pads and higher-current transistors don’t track Moore’s Law, so the cost of an individual I/O relative to a given amount of core logic keeps increasing with every product generation.

In custom IC design, devices have traditionally been either “pad limited” or “core limited.”  I/O pads had to be lined up along the periphery of the die to facilitate wire bonding.  If lining all of your I/Os up along the edge made the die bigger than your core logic alone required, your design was “pad limited.”  If, on the other hand, the core logic left room for the required number of I/Os around the periphery with room to spare (stop chuckling out there, IC designers), your design was “core limited.”  The reason for all that laughter at the back of the room is that the idea of “room to spare” was typically a theoretical condition that seldom arose in real-world practice.
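The distinction comes down to simple geometry: the die must be at least as big as the larger of what the pad ring demands and what the core logic demands.  A back-of-envelope sketch (all numbers are invented for illustration, not taken from any real process):

```python
import math

def classify_die(num_pads, pad_pitch_um, core_area_mm2):
    """Classify a hypothetical square die as pad-limited or core-limited,
    assuming a single ring of wire-bond pads at a uniform pitch."""
    # Minimum die edge needed to fit all pads around the four sides.
    pad_edge_mm = (num_pads * pad_pitch_um / 1000.0) / 4.0
    # Minimum die edge needed to fit the core logic.
    core_edge_mm = math.sqrt(core_area_mm2)
    if pad_edge_mm > core_edge_mm:
        return "pad-limited", pad_edge_mm
    return "core-limited", core_edge_mm

# 400 pads at an 80 um pitch force an 8 mm edge;
# a 25 mm^2 core would only need a 5 mm edge.
print(classify_die(400, 80, 25.0))  # → ('pad-limited', 8.0)
```

With those numbers, nearly 60% of the die area exists only to carry pads – exactly the waste the article describes.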

Over time, improved bonding techniques allowed multiple rows of I/O to be placed around the device – improving the design options for balancing core with I/O.  For devices like FPGAs, however, it never makes sense to waste either core or I/O area.  FPGAs have typically been designed to strike a balance between core and I/O that covers the largest cross-section of design needs without wasting silicon real estate.  Xilinx, for example, offers multiple versions of its Spartan-3 series, with a single ring of I/Os on devices designed for more core-intensive applications and two-ring versions for applications with greater I/O needs.

In recent years, a major innovation has begun to change the tradeoff space between I/O and core logic.  Flip-chip technology rewrites the rules of IC design when it comes to I/O-versus-core tradeoffs.  Instead of relying on traditional wire bonding to get signals from I/O pads to package pins, flip-chip assembly places solder balls directly on the chip pads during the final stages of wafer fabrication.  The very cool part is that the die is inverted into the package, bringing the solder balls into contact with connectors in the package (or sometimes even on a circuit board).  The solder is melted and the connection is made entirely without wire bonding.

The flip-chip approach has numerous advantages.  First, I/O pads can be placed essentially anywhere on the device.  This completely eliminates the core-versus-I/O tradeoff described above.  It also drastically reduces the capacitance, delay, and signal integrity issues associated with bonding wires.  Additionally, on the die, the I/O pads can be located much closer to the internal circuitry that connects to them, so the extra routing area, capacitance, and delay usually incurred getting from the middle of the device (or even the opposite side) to the I/O ring are significantly reduced.  Also, silicon IP can now be developed in an entirely different way.  IP that requires both core logic and dedicated I/O can theoretically be designed as a physically contiguous unit.  In the past, the I/O portions were on the periphery and the core portions in the middle, and placement and routing software was charged with connecting the two in an acceptable fashion.

In FPGAs, these benefits translate into smoother tradeoff curves between I/O and core resources, more scalable fabric design (because specific types of fabric can be directly paired with required I/O types), and the ability to mix-and-match (at the FPGA vendor level) re-usable sections of devices – allowing for more application-specific diversification of feature mixes.  The technology also brings benefits like better signal integrity for SerDes, more rigid construction for hostile environments, and (theory has it) eventually lower manufacturing costs.  Today, however, the cost is still higher than for wire-bond packaging – leading to low-cost FPGAs remaining wire-bond while the high-end devices migrate to flip-chip.

Speaking of feature mixes, in the world of FPGAs, figuring out how many I/Os you need relative to core logic – and where to put them – is only the first part of the battle.  In normal custom IC design, you pick each I/O device based on the requirements of that specific connection.  Drive currents, voltages, fanout/fanin capability, tri-stating, clock recovery, timing performance, and advanced capabilities like LVDS and SerDes can be judiciously added to the design only where and when they’re required.  In FPGAs, however, you have to select the mix of features without knowing in advance what the design requires.

Every FPGA architect has to examine large numbers of “typical” customer designs and then strike a balance in the number and types of I/O features included.  Initially, it might seem that you would want to construct some kind of super-I/O that is capable of meeting most needs from a single piece of hardware.  However, such a “Swiss Army Knife” I/O buffer becomes prohibitively expensive and complex and, like a Swiss Army Knife, does many things, but none very well.  Thus designers are faced not only with sprinkling the right mix of I/O features into the base FPGA design, but also with placing them in the locations most likely to give good results relative to other IP such as multipliers, memories, processor cores, and other core logic that might need direct outside connections.
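The architect’s balancing act can be caricatured as a coverage check: given a candidate mix of I/O buffer types, can the device actually serve a representative customer design?  A toy sketch – every buffer type, feature name, and count below is invented purely for illustration:

```python
# Toy model: each I/O buffer type supports a set of features, a candidate
# device carries a count of each type, and a design is "covered" if every
# I/O requirement can be assigned to a matching buffer.
from collections import Counter

BUFFER_FEATURES = {
    "basic":  {"3v3", "tristate"},
    "lvds":   {"3v3", "tristate", "lvds"},
    "serdes": {"serdes"},
}

def covers(device_mix, design_requirements):
    """Greedy check: satisfy each required feature set with the simplest
    buffer type (fewest features) that still supports it."""
    available = Counter(device_mix)
    for req in design_requirements:
        for buf in sorted(available, key=lambda b: len(BUFFER_FEATURES[b])):
            if available[buf] > 0 and req <= BUFFER_FEATURES[buf]:
                available[buf] -= 1
                break
        else:
            return False  # some requirement has no matching buffer left
    return True

device = {"basic": 2, "lvds": 1, "serdes": 1}
design = [{"tristate"}, {"lvds"}, {"serdes"}]
print(covers(device, design))  # → True
```

Run against hundreds of “typical” designs, a scoring loop like this is one crude way to compare candidate feature mixes – the real problem also involves placement, which this sketch ignores entirely.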

SerDes makes the I/O quandary even more difficult for our friends designing FPGAs.  With standards like PCIe and gigabit Ethernet becoming more and more pervasive, high-speed serial transceivers are likely to become required components in most FPGAs.  Already, we have seen those capabilities migrate down from exotic, highest-of-the-high-end FPGA families like Altera’s Stratix “GX” and Xilinx’s Virtex “FX” lines into low-cost offerings.  Lattice Semiconductor actually led the charge of SerDes into low-cost FPGAs about a year ago, and other vendors are now following suit.

When adding SerDes to an FPGA, there are still crucial decisions that can have a dramatic impact on the success or failure of the family.  When SerDes first hit the FPGA scene, the trend was to build mega-transceivers that could handle every standard under the sun and – most importantly – supported higher maximum data rates than the competitors’. 

Ultimately, this strategy backfires.  Designing transceivers that can handle every conceivable standard and speed once again makes you fall victim to Swiss Army Knife Syndrome – Jack of all trades, master of none.  This particularly becomes a problem if you have, say, a transceiver that works perfectly for all the mainstream standards, but fails to test and operate properly at some more obscure and less-used data rate that it claims to support.  In this case, you have a device that fails testing – even though it would work perfectly for most mainstream applications.  Your effective yield goes down and takes profits and customer goodwill along with it.
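The yield arithmetic behind that claim is easy to illustrate: if a die ships only when it passes test at every claimed mode, the effective yield is (roughly, assuming independent failures) the product of the per-mode pass rates, so one marginal, rarely-used mode drags down every die.  A back-of-envelope sketch with invented numbers:

```python
from math import prod

def effective_yield(pass_rates):
    """A die ships only if it passes test at every claimed mode, so
    (assuming independent failures) the per-mode yields multiply."""
    return prod(pass_rates)

mainstream = [0.98, 0.97, 0.99]  # hypothetical pass rates, popular modes
obscure_mode = 0.80              # a marginal, rarely used data rate

with_obscure = effective_yield(mainstream + [obscure_mode])
without = effective_yield(mainstream)
print(f"{with_obscure:.3f} vs {without:.3f}")  # → 0.753 vs 0.941
```

With these made-up numbers, claiming the obscure mode throws away nearly one die in five that would have served mainstream customers perfectly.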

Now, vendors are much more judicious about what standards they attempt to support with FPGA-based SerDes transceivers.  In some cases, hard-wired I/O units support specific standards like PCI Express.  In other cases, vendors keep their claims tame – setting their sights only on standards that hit the majority of design needs without requiring the extreme performance that drives up complexity, reduces yields, and increases testing requirements.  By segmenting their transceiver offerings, vendors can build and sell only what customers are demanding – with less focus on the press-release one-upmanship that comes from claims about total bandwidth and maximum data rates from a single transceiver.

In the big picture of FPGA feature set and value, it is surprising how much is determined by I/O capabilities instead of core capacity and performance.  As we move to smaller geometries, this effect is likely to only increase – gates will get closer and closer to “free” while the pricing curve will be tightly coupled to a device’s I/O capabilities.
