
Semi-Programmable

New Architectures Optimize the Mix

It stands to reason.

Some components of system-on-chip design are static. You’re not going back and re-engineering them every two weeks. The multiplier was designed long ago and doesn’t really need to be designed again every time the moon changes phase. Neither does the PCI core, for that matter. They’re both stable and well-debugged. It’s unlikely that you’re ever going to need to modify or reconfigure them.

Why, then, would it make sense to build these common functions out of programmable logic, subject to the performance, area, and power penalties of LUT-based implementations, and at risk of the random timing and layout problems that can creep into large FPGA designs with soft macros? Of course, it doesn’t.

The major FPGA vendors figured this out some time ago and began putting the very common functions (like multipliers) in hard, ASIC-style cell-based implementations on their FPGAs. For a small silicon investment, stable functions could be brought up to ASIC-class performance and power efficiency, leaving more LUTs for the logic that needed them. This architectural addition makes good sense and has become an accepted feature of most high-end (and now even some low-end) FPGAs.

Multipliers are all fine and good, but wouldn’t the same reasoning lead one to hard implementations of larger, more complex, stable functions? It turns out that this line of logic leads us down a slippery slope. Every time an FPGA vendor builds another hard macro onto their FPGA, they make life better for the people who use it, but worse for the people who don’t. When Xilinx first introduced Virtex-II Pro, many (maybe the majority of) early users didn’t take advantage of the PowerPC cores lovingly laid out amidst the LUT fabric. For them, the processors were just so much wasted cost and chip area that could have been put to better use in their application.

The more goodies a vendor packs on, the narrower they make the optimal audience for the device. Xilinx has fought back against this problem with the ASMBL (Advanced Silicon Modular BLock) architecture of their newly announced Virtex-4 series. The new architecture allows designers to choose from four different mixtures of “special” features on their FPGA according to their needs. The PowerPC is included in only one of those mixes. This moves things in the right direction, but choosing the “vegetarian,” “meat-lover’s,” or “4-cheese” pizza won’t let you create exactly the “half-pepperoni, half-mushroom, sauce-on-the-side, mozzarella-and-cheddar” concoction that you sometimes actually crave.

Alternatively, Altera has gone the structured ASIC route with its HardCopy solution. HardCopy allows an FPGA design to be re-spun into a completely mask-programmed implementation for higher performance, lower power, and substantially lower cost. The penalty is a small NRE, a few weeks of cycle time, and loss of re-programmability.

What if you could have your cake and eat it too? What if you could implement any function you wanted in mask-programmable logic, and save the inefficiencies of programmability for the portions of your design that really need it?

Leopard Logic is offering just that with their newly announced “Gladiator” series. While using Gladiator does entail the dreaded “handoff” and “NRE” components that FPGA designers disdain, Leopard Logic has worked to make these feared phases as easy and inexpensive as possible. What Gladiator does offer is both high-performance mask-programmed and LUT-based programmable fabrics on the same chip. For the portions of your design that are stable, unchanging, and fixed, you can leverage the mask-programmed portions of the fabric for performance, power, and area efficiency. For the pieces of your design that vary, you have LUT-based logic that can be configured and re-configured on-the-fly. With a little forethought, you can build your own, super-customized FPGA-like platform with your legacy IP nicely tucked away and working in the mask-customized section, then use the programmable fabric to create variants for multiple products, or to update as standards and protocols change.

Gladiator’s core logic comes in two flavors: HyperBlox MP (metal-programmed with a single via layer) and HyperBlox FP (field-programmable using FPGA-style SRAM structures). Leopard Logic has also gone for full cell-based implementations of Multiply-Accumulate (MAC) blocks and RAM for maximum efficiency in these commonly used functions. The hybrid mixture is what Leopard Logic calls a “CLD” or “Configurable Logic Device”. CLDs differ from structured ASICs in that they offer FPGA-like programmable logic fabric along with the mask-customized portion. They truly represent a middle ground between ASIC and FPGA.

Gladiator aims at a spot near the high end of the FPGA market, with substantially lower unit costs and improved power and performance. The top end of the Gladiator family, the CLD25000, will boast over 25 million “System Gates” with 256K mask-programmed cells, 16K FPGA cells, 256 each of the 36K DPRAM and 18x18 MAC blocks, and 16 PLL/DLLs. The smallest member of the family, the CLD1600, is rated at about 1.6M system gates with proportionally fewer of each feature.

Leopard Logic claims that the NRE has been brought down to around $50K (USD) for the single mask layer required for the via-based metal customization, with a snappy 4-week lead time. While this may be a slight inconvenience compared to an FPGA-based implementation, the benefits are substantial and many design teams may be happy with the trade-off.
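To make that trade-off concrete, here is a rough break-even sketch in Python. Only the $50K NRE figure comes from Leopard Logic's claim above; the per-unit prices and the function name `break_even_units` are invented purely for illustration and are not vendor pricing.

```python
import math


def break_even_units(nre_usd: float, fpga_unit_usd: float, cld_unit_usd: float) -> int:
    """Smallest volume at which the per-unit savings have paid back the NRE."""
    saving_per_unit = fpga_unit_usd - cld_unit_usd
    if saving_per_unit <= 0:
        raise ValueError("no per-unit saving, so the NRE is never recovered")
    return math.ceil(nre_usd / saving_per_unit)


# $50K NRE is the claimed figure; the $100 and $60 unit prices are hypothetical.
print(break_even_units(50_000, fpga_unit_usd=100.0, cld_unit_usd=60.0))  # 1250
```

With these made-up prices the NRE is recovered after 1,250 units. A smaller per-unit saving pushes the break-even volume up, which is why the trade-off will suit some design teams and not others.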

The Gladiator design flow is built around a simple-to-use “ToolBlox” cockpit integrating industry-standard tools like those from Synopsys and Mentor Graphics. Leopard Logic also has worked with IP vendors to offer a library of ready-to-use logic blocks for systems designers starting out with Gladiator-based designs.

As a proof-of-concept application, Leopard Logic created a control plane PowerPC bridge for network, storage, and wireless applications. The design used IP from multiple vendors implemented in the mask-programmed fabric, and used the programmable fabric for IP-IP interfaces (where most problems usually occur) as well as the Ethernet MAC bus interface and other newly designed blocks that might require changes. They claim that the resulting solution gives superior performance and price with minimal design effort, risk, and NRE overhead.
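The partitioning rule behind that proof-of-concept can be sketched in a few lines of Python: blocks expected to stay fixed go to the mask-programmed fabric, and blocks that might change go to the field-programmable fabric. This is only an illustration of the idea, not Leopard Logic's ToolBlox flow; the `Block` type, the `partition` function, and the block names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    likely_to_change: bool  # designer's judgment, not something a tool infers


def partition(blocks):
    """Split block names into (mask_programmed, field_programmable) lists."""
    mask_programmed = [b.name for b in blocks if not b.likely_to_change]
    field_programmable = [b.name for b in blocks if b.likely_to_change]
    return mask_programmed, field_programmable


# Hypothetical block list loosely modeled on the bridge design described above.
design = [
    Block("vendor_ip_a", likely_to_change=False),
    Block("vendor_ip_b", likely_to_change=False),
    Block("ip_to_ip_interface", likely_to_change=True),
    Block("ethernet_mac_bus_interface", likely_to_change=True),
]

mp, fp = partition(design)
print("HyperBlox MP:", mp)
print("HyperBlox FP:", fp)
```

A real flow would also have to check each group against the cell counts available in the chosen device, which this sketch leaves out.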

While the popularity of hybrid solutions such as Gladiator remains to be seen, there is clearly a vast, underserved gulf between cell-based ASIC and programmable logic that should become a dynamic and lucrative market in the next few years. The innovation that will eventually fill that gap is probably just beginning.
