feature article
Subscribe Now

Flexras Makes a Finer Cut

FPGA Partitioning for the Modern Era

If you’ve worked with large designs that need to be partitioned into multiple FPGAs, you’ve probably often thought how awesome automatic partitioning would be. You just throw your big’ol design at a fancy EDA tool, push the big green “GO” button, and BAM! Your whole design is sliced up into pieces – just like in one of those martial arts movies where the ninja slices the bad guy into about seven pieces so cleanly that he doesn’t even start to fall apart right away.

Your design would be cleanly ninja-sliced into perfect partitions that fit easily into your target FPGAs with the minimal number of inter-FPGA connections. You’d have no timing problems whatsoever, and you’d barely notice that your design wasn’t running on one big super-FPGA. Absolutely no manual intervention was required.

Then, of course, you woke up from that dream. 

You may have tried some of the more infamous automatic partitioning tools, which, by all accounts, for many years, were pretty useless for any real work. They’d cut your design up all right, but getting a partitioning job that you could actually work with was somewhere between a big chore and impossible. The partitions had to fit in all of the target devices, synthesize and place-and-route correctly on their own, map correctly to the pinout of the prototyping board you were using, meet your timing constraints for whatever performance you were trying to get your prototype to achieve, fit within the clock domain limitations of each target device, and still provide some facsimile of your original design so you could have a clue where to begin when you found a bug. After all, that was the reason you were building the prototype in the first place. 

It usually ended up being a lot easier to simply partition your design manually. You’d start with your big IP blocks and put the ones that didn’t communicate much with each other into different partitions. You’d clump together the subsystems that seemed to have a lot of natural affinity. You’d end up fudging a little bit to fill leftover space in some of the FPGAs and to avoid overloading others. There was no rigorous method for manual partitioning, but once you got the hang of it, things just seemed to intuitively fall into place. 

But, as your designs got bigger and bigger, that intuition wasn’t quite so strong anymore. The number of clocks went up, the number of IP blocks that didn’t like to participate in your VHDL vivisection increased, and the number of brain cells available to hold the whole thing didn’t rise a single bit. With each passing node of Moore’s Law and each step up the integration tree, the task of manually partitioning a huge (and getting huger) design into multiple FPGAs in order to build a prototype got more complex.

Still, there wasn’t much attention given to automatic design partitioning. That dog already had his day, right? Like many technologies that come out of the gate with a lot of hype and fanfare – automatic FPGA partitioning had gotten a bit of a black eye. Years of folklore haven’t done a lot to heal that wound, so you just don’t hear too much about new automatic FPGA partitioners these days.

Until now.

Flexras (if you haven’t already Googled them) is a French company. As you might expect, those crazy French don’t know any better than to go off solving problems that the rest of the industry has basically given up on. As a result, at the Design Automation Conference last week – they were telling everyone about their new Wasga Compiler – the “only-and-first timing-driven partitioning tool for SoC rapid prototyping.” The company says Wasga is fast, has huge capacity, and generates high-performance partitioned designs.

Making the partitioning process timing-driven tackles one of the biggest issues with legacy partitioners. Managing timing across partitions manually is a nightmare, and trying to use other partitioning metrics with the hopes that timing will come out OK is purely betting on luck. Usually, our luck is not so good. Flexras has a number of algorithmic innovations that they claim make their tool far faster than previous-generation partitioners. Faster generally leads to more capacity (and Flexras has some impressive size design examples) and having the whole thing timing-driven means you are much less likely to end up with an incomprehensible rat’s nest of wires with negative slack that nobody can debug.

Flexras says that their partitioner works with both Xilinx and Altera FPGAs and with any commercially available or custom-designed prototyping board. The company says that the Wasga Compiler can handle designs over a billion gates equivalent. It comes with a GUI to help you set up your project (configuration of prototyping board, etc). It allows automatic or manual placement and routing, and it takes advantage of proprietary high-speed multiplexing IP to speed up inter-FPGA communication.

Timing constraints are provided via SDC, and the tool also provides a system-level static timing analysis. You can also set up the tool to automatically run and control your back-end flow, and it can automatically handle iterative runs and verification of the results. The source design can be RTL or gate level, or a combination of both. The partitioning process itself can be run automatically or step by step with manual intervention. The overall flow is incremental, so you can record scripts to reproduce sequences of steps that are working to assure consistent results while making small changes. This addresses one of the biggest issues with older partitioners, where a small change could propagate massive changes in the design – resulting in severe and widespread timing problems. With the Flexras approach, only the changed modules are re-synthesized during each iteration – preserving the timing behavior of finished blocks.

Wasga does a global placement, and it tries to preserve the original design hierarchy as much as possible. This means that you’ll probably see some semblance of your original design in your multi-FPGA prototype – a convenience that was not always a given with other automatic partitioners. A global router then analyzes inter-FPGA connections, working to avoid partitioning-induced timing issues before they occur. The inter-FPGA timing is achieved by constraining the FPGA tools that are operating on each individual FPGA to conform to a global timing budget established by the partitioner.

Wasga technology is likely to end up being OEMed by selected prototyping board vendors, as well as being sold stand-alone. We also wouldn’t be surprised to see Wasga algorithms hiding inside some of your favorite emulators (for those of you with the budget to actually have a favorite emulator.)

It’s exciting to see this long-quiet segment of multi-FPGA design getting some new life, and with the incredible capability of some of today’s FPGAs, we may soon see some truly massive systems prototyped using Wasga. It will be fun to watch!

One thought on “Flexras Makes a Finer Cut”

  1. Flexras should bring some new energy into the automatic partitioning market. It’s about time, too. With the size of today’s FPGAs and the immense designs people are prototyping, manual partitioning is getting to be quite a pain. Would you try automatic partitioning?

Leave a Reply

featured blogs
Mar 5, 2021
The combination of the figure and the moving sky in this diorama -- accompanied by the music -- is really rather tasty. Our cats and I could watch this for hours....
Mar 5, 2021
In February, we continued to build out the content on the website, released a new hierarchy for RF products, and added ways to find Samtec “Reserve” products. Here are the major web updates to Samtec.com for February 2021. Edge Card Content Page Samtec offers a fu...
Mar 5, 2021
Massive machine type communications (mMTC) along with enhanced Mobile Broadband (eMBB) and Ultra Reliable Low Latency Communications (URLLC) represent the three pillars of the 5G initiative defined... [[ Click on the title to access the full blog on the Cadence Community sit...
Mar 5, 2021
Explore what's next in automotive sensors, such as the roles of edge computing & sensor fusion and impact of sensor degradation & software lifecycle management. The post How Sensor Fusion Technology Is Driving Autonomous Cars appeared first on From Silicon To Softw...

featured paper

Using the DS28E18, The Basics

Sponsored by Maxim Integrated

This application note goes over the basics of using the DS28E18 1-Wire® to I2C/SPI Bridge with Command Sequencer and discusses the steps to get it up and running quickly. It then shows how to use the device with two different devices. The first device is an I2C humidity/temperature sensor and the second one is an SPI temperature sensor device. It concludes with detailed logs of each command.

Click here to download the whitepaper

Featured Chalk Talk

0 to 112 (Gbps PAM4) in 5 Seconds

Sponsored by Samtec

With serial connections hitting 112Gbps, we can’t mess around with our interconnect. We need engineered solutions that will keep those eyes open and deliver the signal integrity we need in our high-speed designs. In this episode of Chalk Talk, Amelia Dalton talks with Matt Burns of Samtec about the interconnect options for speeds up to 112Gbs, and Samtec’s Flyover interconnect technology.

Click here to download the Silicon-to-Silicon Solutions Guide