
Flex Logix Fires Second Salvo

Challenging FPGAs on AI Applications

For decades now, FPGA companies have struggled to overcome their de facto positioning as “ASIC alternatives.” Of course, FPGAs are great for prototyping your design, or for getting something into production much earlier than you could with a custom chip design. But, eventually, for designs that go into volume production, there comes a time when it’s worth designing an ASIC or ASSP to do the same thing, yielding better performance, lower power consumption, smaller area, and lower unit cost. This is bad for FPGA companies because just when a design win should turn into higher volume and long-term revenue, the FPGA gets dropped off the board and replaced with a custom device.

For FPGA providers, the solution for this sad situation involves finding applications where reprogrammability itself is a core requirement. Software-defined radio, for example, is such a “killer app.” The application requires programmable fabric so modems can be created, loaded, and dispatched on the fly. It doesn’t make sense to replace FPGA-based logic with hardened gates, because in-system reprogrammability is a fundamental part of the application. The result? The FPGA company gets to keep the socket even when production volume rises, without fear of being yanked in favor of a custom chip.

Recently, neural network inferencing has emerged as another of these FPGA “killer apps.” FPGA LUT fabric (along with the fixed-point DSP resources) delivers spectacular performance/power characteristics on neural network inferencing, and in-system reprogrammability is a must. On top of that, there is an enormous number of applications that can take advantage of AI/neural network technology. For FPGA companies, it could be the proverbial bird’s nest on the ground – a high-volume, high-value key role in a wide variety of new applications. You can almost hear the champagne corks popping at FPGA headquarters…

“Not quite so fast, there, cowboys,” says Flex Logix.

As we’ve discussed before, Flex Logix provides IP that allows designers to put FPGA LUT fabric on custom chips. Recently, they announced that their latest-generation EFLX cores allow embedded FPGA arrays of up to 122.5K LUTs to be built on TSMC 16FF+ and 16FFC processes. This means that you can bring the benefits of FPGA-like reprogrammability to custom chips for applications such as neural network inferencing. Flex Logix fabric comes in 2.5K LUT blocks, which can be arrayed to build the desired size, anywhere from 2.5K up to 122.5K LUTs.
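To make the sizing arithmetic concrete, here is a minimal Python sketch of how a designer might estimate how many 2.5K-LUT blocks a target capacity needs. The block size and the 122.5K maximum come from the announcement; the function names and the idea of enumerating rectangular layouts are our own illustrative assumptions, not part of any Flex Logix tool.

```python
import math

LUTS_PER_BLOCK = 2_500   # from the announcement: fabric comes in 2.5K LUT blocks
MAX_ARRAY_LUTS = 122_500  # from the announcement: largest supported array


def blocks_needed(target_luts: int) -> int:
    """Smallest number of 2.5K-LUT blocks that covers target_luts (hypothetical helper)."""
    if target_luts <= 0:
        raise ValueError("target_luts must be positive")
    if target_luts > MAX_ARRAY_LUTS:
        raise ValueError("target exceeds the largest announced array")
    # Integer ceiling division avoids floating-point rounding.
    return (target_luts + LUTS_PER_BLOCK - 1) // LUTS_PER_BLOCK


def rectangular_layouts(blocks: int) -> list[tuple[int, int]]:
    """All (rows, cols) pairs with rows <= cols and rows * cols == blocks.

    Assumption: blocks are tiled in a plain rectangle; the article does not
    say which array shapes are actually supported.
    """
    return [(r, blocks // r)
            for r in range(1, math.isqrt(blocks) + 1)
            if blocks % r == 0]


if __name__ == "__main__":
    for luts in (2_500, 30_000, MAX_ARRAY_LUTS):
        n = blocks_needed(luts)
        print(luts, "LUTs ->", n, "blocks, layouts:", rectangular_layouts(n))
```

Note that the 122.5K maximum works out to exactly 49 blocks, which would be consistent with a 7x7 array, although the announcement as reported here does not spell out the array shape.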

What does this mean for the FPGA companies looking for that long-term socket? It means that programmability is no longer a competitive moat against ASIC/ASSP incursion. Design teams can build custom chips with all the benefits of reprogrammability rather than committing to off-the-shelf FPGAs for long-term, high-volume production. This doesn’t cut conventional FPGAs out of the picture, but it does put a damper on their aspirations to become the long-term, unchallenged solution for applications that want to cost-reduce for high volumes.

This is the second generation of the Flex Logix IP cores. These blocks are based on 6-input LUTs, which can also be configured as dual 5-input LUTs. These are similar to the logic cells used by mainstream FPGAs. Each block can be either a “logic” block or a “DSP” block, where the DSP version replaces some LUTs with 40 22x22-bit MACs. Each IP block uses CMOS I/Os to talk to the rest of the chip, so you gain considerable bandwidth and power efficiency compared with using a separate, stand-alone FPGA along with your custom device. Flex Logix uses a proprietary, high-density routing architecture, which they claim has been further improved with this second generation, giving a very respectable 1.0 mm² footprint with only six routing layers for a 2.5K LUT block on TSMC 16FF+/16FFC.
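For readers less familiar with LUT-based logic, the following Python sketch models what “a 6-input LUT that can also act as dual 5-input LUTs” can mean. It is a behavioral illustration only. It assumes the two 5-input functions share the same five inputs and occupy the lower and upper halves of a 64-entry truth table, as in some mainstream FPGA architectures; the article does not describe how EFLX implements this internally, and all function names here are hypothetical.

```python
def build_table(func, n_inputs: int) -> int:
    """Pack a boolean function of n_inputs bits into an integer truth table."""
    table = 0
    for index in range(1 << n_inputs):
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        if func(bits):
            table |= 1 << index
    return table


def input_index(inputs) -> int:
    """Turn a sequence of 0/1 inputs into a truth-table address (input 0 is the LSB)."""
    return sum(bit << i for i, bit in enumerate(inputs))


def lut6(table: int, inputs) -> int:
    """Evaluate a 64-entry table as one 6-input function."""
    assert len(inputs) == 6
    return (table >> input_index(inputs)) & 1


def dual_lut5(table: int, inputs) -> tuple[int, int]:
    """Evaluate the same 64-entry table as two 5-input functions.

    Assumption: both functions see the same five inputs; the low 32 entries
    hold the first function and the high 32 entries hold the second.
    """
    assert len(inputs) == 5
    idx = input_index(inputs)
    return (table >> idx) & 1, (table >> (32 + idx)) & 1


if __name__ == "__main__":
    parity5 = build_table(lambda b: sum(b) % 2, 5)  # XOR of five inputs
    and5 = build_table(lambda b: all(b), 5)         # AND of five inputs
    combined = parity5 | (and5 << 32)               # two functions, one LUT
    print(dual_lut5(combined, [1, 0, 0, 0, 0]))
    print(dual_lut5(combined, [1, 1, 1, 1, 1]))
```

The point of the sketch is that the same 64 configuration bits serve either mode: the sixth input simply selects between the two halves when the cell is used as a single 6-input function.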

Flex Logix also says the new architecture has further improved on its novel interconnect to give higher performance for larger arrays. They have also structured the MAC blocks to be pipelined 10 in a row, which allows local interconnect to be used for highly chained datapaths such as FIR filters, thus improving performance and reducing the requirement for external routing resources. A new test mode has been added for faster test times, as well as additional miscellaneous DFT enhancements. For high-reliability (particularly aerospace) designs, a “readback” feature has been added that allows the configuration to be checked and scrubbed periodically in case radiation-induced single-event upsets or other environmental “soft” errors have damaged the configuration, allowing a quick reconfigure when an error is detected.
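To see why a row of chained MACs suits FIR filters, consider this behavioral Python sketch. Each tap is one multiply-accumulate stage that hands its partial sum to the next stage, so the only data leaving the chain is the final sum. This is not cycle-accurate and does not model the EFLX hardware or its 22-bit operand widths; the chain length of 10 comes from the article, while the function names and the direct-form structure are our own assumptions for illustration.

```python
CHAIN_LENGTH = 10  # from the article: MACs are pipelined 10 in a row


def mac(acc: int, a: int, b: int) -> int:
    """One multiply-accumulate stage: add a * b to the incoming partial sum."""
    return acc + a * b


def fir_chained(samples, coeffs):
    """Direct-form FIR filter expressed as a chain of MAC stages.

    Hypothetical model: the partial sum passes from stage to stage, mimicking
    neighbor-to-neighbor (local) connections rather than general routing.
    """
    taps = len(coeffs)
    delay_line = [0] * taps          # most recent sample first
    outputs = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]
        acc = 0
        for stage in range(taps):    # each iteration stands for one MAC in the row
            acc = mac(acc, delay_line[stage], coeffs[stage])
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    coeffs = list(range(1, CHAIN_LENGTH + 1))   # a 10-tap filter fills one row
    impulse = [1] + [0] * (CHAIN_LENGTH + 1)
    # The impulse response of an FIR filter reproduces its coefficients.
    print(fir_chained(impulse, coeffs))
```

In this model, a filter with up to 10 taps maps onto a single row of MACs. How longer filters would be split across rows is a tool and architecture question the article does not address.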

One of the key barriers to embedded FPGAs has always been design tools for the FPGA fabric. While plopping down a bunch of LUTs is a fairly straightforward task, providing and supporting the complex set of design tools for synthesis, place-and-route, and bitstream generation is a much more demanding undertaking. The Flex Logix EFLX compiler addresses this need, and it is available for a no-cost evaluation. Speaking of evaluation, the new cores are being fabricated now, and evaluation boards will be available under NDA to customers. Flex Logix proves all new cores in silicon itself to ensure that the customer experience with integration is smooth.

The key question is, will Flex Logix get traction with the new cores in strategic applications like AI? They’re off to a good start. This week, the company announced that its embedded array is part of a next-generation deep learning chip being developed at Harvard by the research group of Professors David Brooks and Gu-Yeon Wei at Harvard’s John A. Paulson School of Engineering and Applied Sciences. The device has already gone to tape-out and is going into fabrication – giving an early view into the practicality of EFLX integration for AI applications.

Flex Logix is only a couple of years old, but the company has already taken the embedded FPGA idea further than any previous attempt we are aware of. Their tile-based structure gives chip designers a lot of flexibility in balancing the FPGA fabric with the other resources on the chip for their particular application. We have not yet used the tool suite ourselves, nor talked to customers who have, and the tools are likely the critical make-or-break factor for widespread success of the technology. It will be interesting to watch.
