feature article

Semi-Programmable

New Architectures Optimize the Mix

It stands to reason.

Some components of system-on-chip design are static. You’re not going back and re-engineering them every two weeks. The multiplier was designed long ago and doesn’t really need to be designed again every time the moon changes phase. Neither does the PCI core, for that matter. They’re both stable and well-debugged. It’s unlikely that you’re ever going to need to modify or reconfigure them.

Why, then, does it make sense for these common functions to be built out of programmable logic, subject to the performance, area, and power penalties of LUT-based implementations, and at risk for the random timing and layout problems that can creep into large FPGA designs with soft macros? Of course, it does not.

The major FPGA vendors figured this out some time ago and began putting very common functions (like multipliers) on their FPGAs as hard blocks, implemented much as they would be in a cell-based ASIC. For a small silicon investment, stable functions could be accelerated to ASIC performance and power, leaving more LUTs for the logic that needed them. This architectural addition makes good sense and has become an accepted feature of most high-end (and now even some low-end) FPGAs.

Multipliers are all fine and good, but wouldn’t the same reasoning lead one to hard implementations of larger, more complex, stable functions? It turns out that this line of logic leads us down a slippery slope. Every time an FPGA vendor builds another hard macro onto their FPGA, they make life better for the people who use it, but worse for the people who don’t. When Xilinx first introduced Virtex-II Pro, many (maybe the majority of) early users didn’t take advantage of the PowerPC cores lovingly laid out amidst the LUT fabric. For them, the processors were just so much wasted cost and chip area that could have been put to better use in their application.

The more goodies a vendor packs on, the narrower they make the optimal audience for the device. Xilinx has fought back against this problem with the ASMBL (Advanced Silicon Modular BLock) architecture of their newly announced Virtex-4 series. The new architecture allows designers to choose from four different mixtures of “special” features on their FPGA according to their needs. The PowerPC is included in only one of those mixes. This moves things in the right direction, but choosing the “vegetarian,” “meat-lover’s,” or “4-cheese” pizza won’t let you create exactly the “half-pepperoni, half-mushroom, sauce-on-the-side, mozzarella-and-cheddar” concoction that you sometimes actually crave.

Alternatively, Altera has gone the structured ASIC route with its HardCopy solution. HardCopy allows an FPGA design to be re-spun into a completely mask-programmed implementation for higher performance, lower power, and substantially lower cost. The penalty is a small NRE, a few weeks of cycle time, and loss of re-programmability.

What if you could have your cake and eat it too? What if you could implement any function you wanted in mask-programmable logic, and save the inefficiencies of programmability for the portions of your design that really need it?

Leopard Logic is offering just that with their newly announced “Gladiator” series. While using Gladiator does entail the dreaded “handoff” and “NRE” components that FPGA designers disdain, Leopard Logic has worked to make these feared phases as easy and inexpensive as possible. What Gladiator does offer is both high-performance mask-programmed and LUT-based programmable fabrics on the same chip. For the portions of your design that are stable, unchanging, and fixed, you can leverage the mask-programmed portions of the fabric for performance, power, and area efficiency. For the pieces of your design that vary, you have LUT-based logic that can be configured and re-configured on-the-fly. With a little forethought, you can build your own, super-customized FPGA-like platform with your legacy IP nicely tucked away and working in the mask-customized section, then use the programmable fabric to create variants for multiple products, or to update as standards and protocols change.
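The partitioning decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything from Leopard Logic’s tools: the `Block` type, the `stable` flag, and the block names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    stable: bool  # hypothetical flag: True if the block is not expected to change


def partition(blocks):
    """Send stable blocks to the mask-programmed fabric, the rest to LUT fabric."""
    plan = {"mask_programmed": [], "field_programmable": []}
    for block in blocks:
        target = "mask_programmed" if block.stable else "field_programmable"
        plan[target].append(block.name)
    return plan


# Hypothetical design: legacy IP is stable, the protocol adapter may change.
design = [
    Block("pci_core", stable=True),
    Block("multiplier", stable=True),
    Block("protocol_adapter", stable=False),
]
plan = partition(design)
print(plan)
```

In a real project the stable-versus-volatile call is a judgment made per block, which is where the “little forethought” comes in.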

Gladiator’s core logic comes in two flavors, HyperBlox MP (metal-programmed with a single via layer) and HyperBlox FP (field-programmable using FPGA-style SRAM structures). Leopard Logic has also gone for full cell-based implementations of Multiply-Accumulate (MAC) blocks and RAM for maximum efficiency in these commonly used functions. The hybrid mixture is what Leopard Logic calls a “CLD” or “Configurable Logic Device”. CLDs differ from structured ASICs in that they offer FPGA-like programmable logic fabric along with the mask-customized portion. They truly represent a middle ground between ASIC and FPGA.

Gladiator aims at a spot near the high end of the FPGA market, with substantially lower unit costs and improved power and performance. The top end of the Gladiator family, the CLD25000, will boast over 25 million “System Gates” with 256K mask-programmed cells, 16K FPGA cells, 256 each of the 36K DPRAM and 18x18 MAC blocks, and 16 PLL/DLLs. The smallest member of the family, the CLD1600, is rated at about 1.6M system gates with proportionally fewer of each feature.
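As a rough illustration, the quoted CLD25000 capacities can be used to check whether a design’s resource estimate fits. The capacities below come from the figures above, with the assumption that “K” means 1024. The `required` numbers and the `overflows` helper are hypothetical.

```python
# CLD25000 capacities as quoted in the article (assuming "K" means 1024).
CLD25000 = {
    "mp_cells": 256 * 1024,  # mask-programmed cells
    "fp_cells": 16 * 1024,   # FPGA (field-programmable) cells
    "dpram_blocks": 256,
    "mac_blocks": 256,
    "pll_dll": 16,
}


def overflows(required, capacity):
    """Return each over-subscribed resource and the amount it exceeds capacity by."""
    return {
        name: need - capacity.get(name, 0)
        for name, need in required.items()
        if need > capacity.get(name, 0)
    }


# Hypothetical resource estimate for a design.
required = {"mp_cells": 200_000, "fp_cells": 20_000, "mac_blocks": 64}
over = overflows(required, CLD25000)
print(over)  # {'fp_cells': 3616}
```

Here the hypothetical design fits the mask-programmed fabric comfortably but asks for more programmable cells than the device offers, so some logic would have to move to the mask-programmed side.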

Leopard Logic claims that the NRE has been brought down to around $50K (USD) for the single mask layer required for the via-based metal customization, with a snappy 4-week lead time. While this may be a slight inconvenience compared to an FPGA-based implementation, the benefits are substantial and many design teams may be happy with the trade-off.
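To get a feel for the trade-off, a simple break-even calculation helps. The $50K NRE is the figure claimed above. The unit prices are purely hypothetical, since the article quotes none, and the `break_even_units` helper is an illustration only.

```python
import math

NRE_USD = 50_000  # NRE figure claimed by Leopard Logic


def break_even_units(nre, fpga_unit_cost, cld_unit_cost):
    """Smallest volume at which NRE plus CLD units costs no more than FPGA units."""
    saving_per_unit = fpga_unit_cost - cld_unit_cost
    if saving_per_unit <= 0:
        return None  # no per-unit saving, so the NRE is never recovered
    return math.ceil(nre / saving_per_unit)


# Hypothetical unit prices, for illustration only.
units = break_even_units(NRE_USD, fpga_unit_cost=300.0, cld_unit_cost=100.0)
print(units)  # 250
```

With these made-up prices the NRE is recovered after 250 units. A smaller per-unit saving pushes the break-even volume up proportionally.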

The Gladiator design flow is built around a simple-to-use “ToolBlox” cockpit integrating industry-standard tools like those from Synopsys and Mentor Graphics. Leopard Logic has also worked with IP vendors to offer a library of ready-to-use logic blocks for systems designers starting out with Gladiator-based designs.

As a proof-of-concept application, Leopard Logic created a control plane PowerPC bridge for network, storage, and wireless applications. The design used IP from multiple vendors implemented in the mask-programmed fabric, and used the programmable fabric for IP-to-IP interfaces (where most problems usually occur) as well as for the Ethernet MAC bus interface and other newly designed blocks that might require changes. They claim that the resulting solution gives superior performance and price with minimal design effort, risk, and NRE overhead.

While the popularity of hybrid solutions such as Gladiator remains to be seen, there is clearly a vast, underserved gulf between cell-based ASIC and programmable logic that should become a dynamic and lucrative market in the next few years. The innovation that will eventually fill that gap is probably just beginning.
