For decades, FPGA companies have been bragging about how their devices can be used to accelerate all sorts of compute-intensive functions, and they have built a large market for themselves convincing design teams that they can handle the toughest problems with aplomb – speeding up execution and reducing power demands.
“Hey!” said this one guy in the back. “How come you don’t accelerate your own design tools, then?”
This always started a bout of hand-wringing and doubletalk, and then a marketing person would step up and say some words about target market generality and solution portability and user platform preferences and therefore… What was the question again?
Now, FPGAs are a big deal for accelerating AI inference, and we all know that AI is useful for… just about everything, right?
Uh, oh. Here comes “guy in the back” again. “Uh, if AI is so great, why don’t you use it to make your design tools work better?”
“Hey,” said Xilinx. “Let’s do just that!” (Editor’s note: These may not have been their actual words.)
Nonetheless, Xilinx has now announced Vivado ML Editions, and it looks like they’ve achieved some pretty impressive results using machine learning to improve their FPGA design tools. The tl;dr of this is that Xilinx has used the magic of AI along with some not-quite-magic hierarchical design flow optimization to improve quality of results (QoR for those in the know) in the range of 10%-50% over a wide range of design styles. And, at the same time, they have achieved a “5x-17x compile time reduction.” (Yeah, we know. “5x reduction” isn’t a thing. What they really mean to say is an 80% to 94% reduction in compile time.)
So, those numbers sound like a big deal, because they are. QoR gains are historically hard-fought. Engineers working on algorithms for tasks such as synthesis and place-and-route are overjoyed if they can make an algorithm change that will improve results by an average of 1-2%, and even then they will typically have to show that over a large range of design types, with some designs improving and some getting worse, but a net improvement on average. It appears that Xilinx’s AI additions only make things better.
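For anyone who wants to check our grumbling about “5x reduction,” here is a minimal sketch of the arithmetic. The function name is ours, and it assumes “Nx” means the compile now takes 1/N of the original time.

```python
def speedup_to_reduction(factor: float) -> float:
    """Convert an 'N times faster' factor into the fraction of time saved.

    Assumes 'Nx' means the new run takes 1/N of the original time.
    """
    if factor <= 0:
        raise ValueError("speedup factor must be positive")
    return 1.0 - 1.0 / factor


# The two endpoints quoted in the announcement.
for factor in (5, 17):
    saved = speedup_to_reduction(factor)
    print(f"{factor}x faster -> {saved:.0%} less compile time")
```

Run as written, this reports roughly 80% and 94%, which is where the numbers in our parenthetical come from.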
Compiling FPGA designs, even with the best tools, is somewhat of an art. Expert users of design tools can wave their magic wands, adjust optimization options and constraints, spend enormous amounts of CPU time, and achieve results that are significantly better than what the average user will get firing up the tool and pressing the big green button.
Part of what Xilinx is doing is throwing AI models in as the new expert. They have trained their tools over an enormous number of designs, and the AI can come much closer than in the past to achieving optimal settings on the first try – ergo – better results faster. Different parts of the design flow benefit most from different ML approaches, so Xilinx has thrown a wide gamut of trained models at the problem, with impressive results.
Xilinx says average design size has been growing at a whopping 38% CAGR, and with the current devices packing something like 92 billion transistors, that makes the EDA problem rapidly approach “untenable.” Legacy tools that were already struggling to keep up with the trends needed some help fast, and throwing a little AI at the problem may be just the right step at the right time.
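To get a feel for what a 38% CAGR means, here is a small sketch that compounds it. Only the 38% figure comes from Xilinx; the helper names and the choice of time horizons are ours.

```python
import math

CAGR = 0.38  # average design-size growth rate quoted by Xilinx


def growth_multiple(years: float, rate: float = CAGR) -> float:
    """How many times larger the average design is after `years` of compounding."""
    return (1.0 + rate) ** years


def doubling_time(rate: float = CAGR) -> float:
    """Years needed for the average design size to double at this rate."""
    return math.log(2.0) / math.log(1.0 + rate)


for years in (1, 3, 5):
    print(f"after {years} year(s): {growth_multiple(years):.2f}x the design size")
print(f"doubling time: {doubling_time():.2f} years")
```

At that rate, the average design doubles in a little over two years and is about five times larger after five, which is the treadmill the tools have to keep up with.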
Looking at the timing closure problem, for example, Xilinx has created what they call “Intelligent Design Runs,” or “IDR,” which combine three ML-based techniques: congestion estimation, which predicts router behavior to see where the routing hot spots will be and works to avoid them; delay estimation, which finds the timing-closure hot spots using better delay predictions for complex routing; and strategy prediction, which picks from 60+ custom strategies based on 100,000+ designs’ worth of training data. IDR can dramatically reduce the number of design iterations required to achieve timing closure, taking a big bite out of an unpredictable chunk of the project schedule.
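Xilinx has not said how strategy prediction works under the hood, so the following is only a toy illustration of the general idea: look at which strategy succeeded on the most similar past designs and try that first. Everything here is hypothetical, including the feature choices, the strategy names, and the use of a simple nearest-neighbor vote in place of whatever models Vivado actually uses.

```python
import math
from collections import Counter

# Hypothetical training records: (design features, strategy that closed timing).
# The made-up features are (LUT utilization, average net fanout, clock domains).
# A real system would normalize features; this toy leaves them unscaled.
HISTORY = [
    ((0.85, 4.0, 2.0), "congestion_spread"),
    ((0.80, 4.5, 3.0), "congestion_spread"),
    ((0.40, 2.0, 1.0), "default"),
    ((0.45, 2.5, 1.0), "default"),
    ((0.60, 8.0, 6.0), "extra_timing_opt"),
]


def predict_strategy(features, history=HISTORY, k=3):
    """Return the most common strategy among the k most similar past designs."""
    # Rank past designs by Euclidean distance to the new design's features.
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], features))[:k]
    # Majority vote over the strategies that worked for those neighbors.
    votes = Counter(strategy for _, strategy in nearest)
    return votes.most_common(1)[0][0]


# A new, fairly full design with moderate fanout and two clock domains.
print(predict_strategy((0.82, 4.2, 2.0)))
```

With five records, this is obviously a cartoon. The point is only that, with enough history, "what worked on designs like this one" becomes a good first guess, and a good first guess is what saves iterations.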
The key here is that there are definite patterns in electronic designs, particularly with the re-use of large IP blocks and with hierarchical design techniques. This plays well into the hands of AI training, taking advantage of the complex pattern recognition abilities of AI/ML to choose strategies that have succeeded in similar designs.
Of course, modern designs aren’t made in a one-shot vacuum. Xilinx has taken advantage of the incremental, hierarchical, team-based nature of most projects to grab another huge helping of performance with hierarchical module-based compilation and what they are calling the “Abstract Shell.” In hierarchical module-based compilation, the user defines reconfigurable modules (RMs), and the tool flow will compile and close timing only on what’s contained in a defined RM. This hierarchical compilation approach gives a significant speedup, as engineers can work in parallel on different parts of the design, and it also provides secure handoff, as the active design files do not contain the rest of the design. The Abstract Shell decouples the compilation of the platform’s interfaces from the processing kernels. This also allows design teams to easily create and re-use their own dynamic IP, which improves collaborative design by promoting a team-based methodology, allowing for a divide-and-conquer strategy and enabling multisite cooperation for large designs.
This block-based approach extends to dynamic reconfiguration using what Xilinx calls Dynamic Function eXchange (DFX), which saves silicon real estate by allowing custom hardware accelerators to be dynamically loaded and swapped at runtime. DFX accelerates complex applications and facilitates remote hardware upgrades and over-the-air updates while the device is in operation.
These upgrades to Vivado are a welcome improvement in the face of the continual rapid growth of FPGA design and complexity and should be just a first round showing the potential of applying AI/ML techniques to FPGA EDA. One could see a future where FPGAs are used to accelerate the AI/ML portion of the FPGA design flow, finally completing the circle.