
Pumping up Premier

Synplicity Boosts Flagship Tool

With excellent tools available almost for free from FPGA companies, you might wonder why top-notch design teams still pay for high-end FPGA tools from companies like Synplicity.  This week, Synplicity helped us out with that question with new improvements to their top-of-the-line synthesis offering, Synplify Premier.  Premier is now updated with new capabilities and, probably more important, support for the industry's latest, biggest, baddest FPGAs, including Xilinx's Virtex-5 family, and beta support for Altera's Stratix III, Stratix IIGX, and Stratix II families.

Synplicity's Synplify Premier has a technology called "graph-based physical synthesis" that is designed to help you reach timing closure faster, or, on more complex designs, to help you get there at all.  As we've discussed before, the distribution of delay in FPGAs has shifted over the past several years and process generations.  Back at the larger geometries, most of the delay was in the logic elements themselves.  Because of that, logic synthesis could make a pretty good guess at path delay just by adding up the known delays of all the logic elements in a path.  Now, however, the majority of the delay comes from the routing paths between the logic elements.
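To make the shift concrete, here is a minimal sketch of the two estimation styles. All names and delay numbers are illustrative assumptions, not anything from Synplicity's tools:

```python
# Hypothetical sketch (illustrative names and numbers, not a real tool's API).
# At older process nodes, path delay could be estimated by summing per-cell
# logic delays; at newer nodes, routing between cells dominates and must be
# accounted for per connection -- but those net delays are unknown pre-layout.

# Illustrative per-cell logic delays in nanoseconds
LOGIC_DELAY_NS = {"LUT4": 0.40, "CARRY": 0.15, "FF": 0.30}

def old_style_estimate(path_cells):
    """Pre-layout estimate: just add up the known cell delays."""
    return sum(LOGIC_DELAY_NS[c] for c in path_cells)

def routing_aware_estimate(path_cells, net_delays_ns):
    """Modern estimate: cell delays plus the routing delay of each net
    connecting consecutive cells (known only after placement/routing)."""
    return old_style_estimate(path_cells) + sum(net_delays_ns)

path = ["LUT4", "CARRY", "LUT4", "FF"]
print(old_style_estimate(path))                       # logic delay only
print(routing_aware_estimate(path, [0.6, 0.9, 0.5]))  # logic plus routing
```

With routing contributing more than the logic itself, the second estimate can be several times the first, which is exactly why a purely additive logic model no longer predicts whether a path will close.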

If you're, say, a logic synthesis tool, this makes it very difficult to predict path delays, and correspondingly difficult to generate the right logic modifications to make those delays fit between the registers and their corresponding clock edges.  The routing delays, you see, aren't known until after place-and-route has a go at the synthesized netlist.  From the synthesis tool's point of view, that's way too late.

Physical synthesis puts the horse before the cart – or at least even with it.  Placement and logic synthesis are done together, so the delays can be estimated very accurately.  Using that information, the tool has maximum latitude to change the design so that it meets timing.  At a simple level, critical paths can be placed so the routing runs will be shorter or so that higher-performance routing resources can be used.  Getting fancier – logic can be replicated, re-structured, or replaced with faster implementations, based on full knowledge of the timing picture. 

Without layout information, timing estimates come only within 20-30% of the eventual numbers, which is not close enough to make critical decisions like which paths need logic restructuring.  Also, replication is most effective when based on placement, putting drivers in close proximity to the loads they'll be driving.  Without accurate placement information, replication is a shot in the dark at improving timing, while the size (and sometimes the power consumption) of the design definitely increases.  Finally, without accurate timing information, the synthesis tool will often optimize the wrong logic paths, making unnecessary changes to paths that would have been OK while ignoring paths that will actually have a real problem.
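The placement-driven replication idea can be sketched in a few lines. This is a deliberately simplified one-dimensional clustering, an assumption for illustration; real tools cluster in two dimensions using full placement data:

```python
# Hypothetical sketch of placement-aware replication (illustrative only):
# a driver with far-flung loads is replicated once per spatial cluster, and
# each replica is placed at the centroid of the loads it will drive, which
# shortens the longest routes. Cluster assignment here is a simple split on
# the x coordinate; a real tool would use true 2-D placement information.

def replicate_driver(loads, max_span=10.0):
    """Group load coordinates into clusters no wider than max_span on x,
    returning one (replica_position, cluster_loads) pair per cluster."""
    pts = sorted(loads)
    clusters = []
    current = [pts[0]]
    for x, y in pts[1:]:
        if x - current[0][0] <= max_span:
            current.append((x, y))
        else:
            clusters.append(current)
            current = [(x, y)]
    clusters.append(current)
    # Place each replica at the centroid of the loads it drives
    return [
        ((sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)), c)
        for c in clusters
    ]

loads = [(1, 1), (2, 3), (40, 5), (42, 6)]
for pos, group in replicate_driver(loads):
    print(pos, group)
```

Run on the loads above, the driver splits into two replicas, one near each group, instead of one copy straddling the whole die. Without placement data, there is no principled way to choose either the number of replicas or where to put them.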

The synthesis and place-and-route problem is truly a technology issue that stems from the organizational chart.  Back in the early 1980s, the engineers working on placement and routing algorithms sat together, maybe in a different building from the ones working on logic optimization and synthesis.  The result?  The logic synthesis and place-and-route parts of the process were implemented as different programs, with different internal data structures, connected only by a structural netlist file.  With the advent of physical synthesis tools, we’re trying to retroactively re-unite those domains.  Synthesis and place-and-route engineers now even attend some of the same barbecues.

Physical synthesis technology originated in the ASIC world and migrated to FPGAs.  As a result, some idio-ASICracies followed it over.  For example, in ASIC design, routing delay varies pretty much linearly with Manhattan distance.  If you know the X and Y delta between the two ends of a route, you can make a pretty good guess at the routing delay.  In FPGAs, however, routing delays are anything but distance-linear.  We have "Manhattan subway" routes with super-low impedance that zip long distances across the chip in one straight line, and we have "slow" routes that meander around, piling on the delay in pretty short order.  The net result is that distance is a very poor indicator of delay.  This is the reason for the "graph-based" designation added by Synplicity.

Graph-based physical synthesis doesn't use simple distance as the currency for estimating routing delay.  Instead, a graph is constructed based on the actual speed of the routing resources available, and the tool does a placement and a global route in order to understand exactly what the physical delay attributes of a logic path will be.  The detailed placement information is passed on to the FPGA vendor's place-and-route software, and the routing is left to a (small) leap of faith that the vendor's tool will perform detailed routing the way the physical synthesis tool predicted.  In practice, that prediction is usually right.  "About 90% of the routes are accurate within plus or minus 10%," says Jeff Garrison, director of marketing for implementation products at Synplicity.  "The global route assures that routing resources exist for the placement we're generating, and that gives us much more accurate timing estimates."
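The core of the graph-based idea can be illustrated with a shortest-path search over a toy routing-resource graph. The fabric, edge delays, and node names below are invented for illustration; this is not Synplicity's algorithm, just the general technique of estimating delay over a resource graph rather than from distance:

```python
# Hypothetical sketch of graph-based delay estimation (not Synplicity's code):
# routing delay is found by a shortest-delay search (Dijkstra) over a graph
# of actual routing resources, rather than by scaling Manhattan distance.
# A fast long-line ("Manhattan subway") edge can make a physically distant
# sink cheaper to reach than a nearby sink served only by slow local hops.

import heapq

def graph_delay(graph, src, dst):
    """Dijkstra over a routing-resource graph: graph[node] is a list of
    (neighbor, delay_ns) edges; returns the minimum total delay src -> dst."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, node = heapq.heappop(pq)
        if node == dst:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, edge in graph[node]:
            nd = d + edge
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return float("inf")

# Toy fabric: a fast long line spans A -> E directly; local hops are slow.
fabric = {
    "A": [("B", 0.9), ("E", 0.5)],   # A -> E is the "Manhattan subway"
    "B": [("C", 0.9)],
    "C": [("D", 0.9)],
    "D": [("E", 0.9)],
    "E": [],
}
print(graph_delay(fabric, "A", "E"))  # prints 0.5: the long line wins
```

Note that E is four local hops away but only 0.5 ns by long line, while the "closer" node D costs 2.7 ns of slow routing: distance and delay have come completely apart, which is the whole point of the graph.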

In practice, physical synthesis reduces the number of iterations required to reach timing closure, shortening the design cycle considerably.  Some iterations, however, are the result of actual design changes rather than of trying to clip those last few nanoseconds from a pesky path.  Making changes to a design is another historic problem for the interaction between synthesis and place-and-route.  In the past, making one tiny tweak to your HDL meant re-synthesizing your entire design from scratch.  It was entirely possible (likely, in fact) that the logic synthesis tool would generate a completely different netlist based on your small change, and then the place-and-route tool would make a completely different placement and… guess what?  Yep.  All those timing paths you worked so hard to close are now gone, replaced with a whole new crop.  Welcome to design iteration hell.

As Bryon Moyer discussed in his incrementality article last week, incremental design solutions are now available that address this issue – minimizing the change in the result from small changes in the source.  “Synplify Premier addresses the two main reasons that people do incremental design,” says Angela Sutton, Senior Product Marketing Manager, FPGA Implementation at Synplicity. “It offers a quick debug iteration to give you very accurate timing without having to run place-and-route, and it allows you to make small changes to your design without having the results change dramatically.  We use path groups – automatic partitions that can be separately optimized — to automatically localize changes to regions impacted by your RTL change.”
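The change-localization idea can be sketched as partition fingerprinting. The partition names, RTL snippets, and caching scheme here are illustrative assumptions; Synplify Premier's actual path-group mechanism is not public, but the general technique looks like this:

```python
# Hypothetical sketch of incremental recompilation (illustrative names only):
# the design is split into partitions ("path groups" in the article's terms);
# each partition's RTL is fingerprinted, and on recompile only partitions
# whose fingerprint changed are re-synthesized, leaving the other partitions'
# netlists, placement, and closed timing untouched.

import hashlib

def fingerprint(rtl_text):
    """Stable fingerprint of a partition's RTL source."""
    return hashlib.sha256(rtl_text.encode()).hexdigest()

def stale_partitions(partitions, cache):
    """Return names of partitions whose RTL changed since the cached run;
    everything else can reuse its previous synthesis and placement result."""
    return [
        name
        for name, rtl in partitions.items()
        if cache.get(name) != fingerprint(rtl)
    ]

old = {"ctrl": "always @(posedge clk) q <= d;", "dp": "assign y = a + b;"}
cache = {name: fingerprint(rtl) for name, rtl in old.items()}

new = dict(old, dp="assign y = a + b + c;")  # a small edit to one partition
print(stale_partitions(new, cache))  # only 'dp' needs re-synthesis
```

The payoff is exactly the behavior Sutton describes: a one-line RTL edit perturbs only the partition that contains it, so the timing paths you already closed elsewhere stay closed.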

The new release of Synplify Premier (9.0) also includes additional enhancements such as a new user interface and additional SystemVerilog constructs.  It also includes the usual litany of unspecified performance and quality-of-results improvements we expect with any new rev of a synthesis tool. 

We asked about the obvious delay in the release of support for physical synthesis of 65nm devices compared with what we’re used to seeing (immediate support on announcement).  Garrison explained, “For this release, we were still getting the general algorithms of the tool up to speed and adding first-time support for many devices.  In the future, we expect to have support for new device families much closer to announcement.”  Also remember that many device families announced by FPGA vendors over a year ago are not yet (or just now) going into full production.  The time from announcement to volume use is still quite long in the FPGA world.

With significant investment by third-party EDA companies like Synplicity in technologies like graph-based physical synthesis, it’s likely that we’ll continue to see a large number of design teams opt for the vendor-independent third-party tools approach.  The same economy of scale and concentration of expertise that made EDA a viable industry for ASIC design apparently still applies for FPGAs as well.
