feature article

ISE Storm

Xilinx 9.1i Packs New Capabilities

For system designers, Moore’s Law is a gravy train.  Every couple of years, you get more gates, more speed, less power consumption, and lower cost.  For digital designers and tool developers, however, that gravy train is headed through the tunnel right at you.  Every couple of years, you have more gates to design in less time, more complexity to overcome, and tougher verification problems.  Your design tools are heavily impacted, too.  The synthesis and place-and-route runs that once took a few minutes on a 200MHz Windows 98 laptop now run for 24 hours on the latest multi-core, memory-laden, tricked-out machines.

Xilinx’s latest software release goes straight at that problem, acknowledging that in this day of platform-based design, IP re-use, hardware/software verification, and high-speed serial I/O, the toughest FPGA design challenge for most people is still basic timing closure from RTL to bitstream.  Xilinx’s new ISE 9.1i includes two major enhancements: “SmartCompile,” to address timing closure on large designs, and some new power optimization capabilities to address the growing sensitivity to power consumption in today’s more FPGA-centric systems.

Xilinx tackled the runtime and productivity issue both in evolutionary progress on runtimes and algorithm efficiency (boosted by faster computing platforms, of course) and in more revolutionary change in the form of incremental design capability. 

Before tackling incrementality, Xilinx claims to have achieved a 2.5X average improvement in runtime.  We here in Journal land always hate “2.5X faster” as a way of talking about runtime improvements, so here is what that means, according to our super-secret “execution speed” decoder ring:  the runtime would be divided by 2.5, giving a 60% runtime reduction.  Xilinx measures runtime over a suite of 100 “typical” customer designs on the same machine running the old and new versions of the software, then averages the deltas.  Voila!  60% runtime reduction on average – “2.5X faster.”  (Ain’t marketing wonderful?)  Actually, a 60% reduction is monumental in software performance tuning… particularly on a product whose release number is in the 9.X range.  Normally, the easy speed gains are back in releases 1.x, 2.x, etc., when you’ve got plenty of stupid n-squared loop issues to clean up.  Mature software has much less low-hanging fruit.
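For anyone who wants to run the decoder ring themselves, here is a minimal sketch of the arithmetic.  The function name is ours, purely for illustration, and has nothing to do with any Xilinx tool.

```python
def runtime_reduction(speedup: float) -> float:
    """Fraction of runtime saved when a job runs `speedup` times faster.

    A speedup of S divides the runtime by S, so the new runtime is 1/S
    of the old one and the saving is 1 - 1/S.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# The "2.5X faster" claim from the article, expressed as a reduction.
print(f"{runtime_reduction(2.5):.0%}")  # 60%
```

Note how quickly the curve flattens: going from 2.5X to 5X only moves the saving from 60% to 80%.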

Since most design (particularly the timing closure phase) involves iterative running of steps like synthesis and place-and-route, efficient, intelligent incremental design tools can effect a dramatic improvement in average iteration time.  If you go into your design and change only one small section, you don’t want to wait around while all the other parts of your design are re-compiled exactly as they were before.  You’d like for just the new and changed sections to require recompilation.
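The basic idea can be sketched in a few lines.  This is only a toy illustration of change detection, not Xilinx's implementation: it assumes each design block is available as a string of source text, fingerprints each block with a hash, and recompiles only the blocks whose fingerprint differs from the previous run.  All names here are hypothetical.

```python
import hashlib


def fingerprint(source):
    """Stable fingerprint of one design block's source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def snapshot(sources):
    """Record the fingerprints of every block after a successful run."""
    return {name: fingerprint(text) for name, text in sources.items()}


def blocks_to_recompile(sources, previous):
    """Names of blocks that are new or changed since the `previous` snapshot."""
    return sorted(
        name
        for name, text in sources.items()
        if previous.get(name) != fingerprint(text)
    )


# Hypothetical two-block design: the first run compiles everything,
# a later run compiles only the block that was edited.
design = {"uart": "module uart; endmodule", "fifo": "module fifo; endmodule"}
first_run = blocks_to_recompile(design, {})
cache = snapshot(design)
design["fifo"] = "module fifo; wire full; endmodule"
second_run = blocks_to_recompile(design, cache)
```

A real flow has to do much more than this, since an edited block can disturb the timing of its untouched neighbors.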

All this incrementality sounds great in concept, of course.  It’s in the real-world implementation that problems crop up.  That’s where Xilinx has had to focus their energy in providing practical incremental design.  The classic difficulties in incremental compilation include things like sub-optimal timing results caused by modified parts of the design introducing new critical timing paths, some of which could benefit from a re-placement or re-synthesis of untouched design blocks.  Additionally, sometimes you have to rip up or move existing sections of a design to make way for the new, larger, or otherwise different modified sections.  Managing this squishy situation is one of the core challenges of incremental design tools.    Another challenge is overhead management.  Often, the compute and storage overhead required to provide incremental design capability can cause slowdowns and inefficiencies that eat up the speed gains that incrementality is intended to provide.

Xilinx claims to have addressed these issues in developing their new “SmartCompile” technology.  When you’ve already run your design once, you can make minor changes without requiring the software to do a complete re-implementation of the design from scratch.  Besides improving runtimes, this locks down the timing on parts of the design where you’ve already completed timing closure – the old non-incremental process could sometimes blow the results from one section of the design while processing changes in another.  Preserving the old results as much as possible between incremental iterations helps speed convergence.  Overall, Xilinx claims another “2.5X speedup” from incrementality on subsequent runs.  Stacked on top of the first 2.5X, that multiplies out to 2.5 × 2.5 = 6.25, so by our decoder-ring math again, you might save an average of 84% runtime on an incremental run with the new release versus a full run with the old release.  Xilinx calls this a “6.25X faster” runtime.
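The compounding is easy to check.  Here is a minimal sketch, assuming the two claimed speedups are independent and simply multiply; the helper names are ours, for illustration only.

```python
import math


def combined_speedup(*factors):
    """Independent speedups multiply: 2.5X on top of 2.5X is 6.25X."""
    return math.prod(factors)


def runtime_reduction(speedup):
    """Fraction of runtime saved when a job runs `speedup` times faster."""
    return 1.0 - 1.0 / speedup


total = combined_speedup(2.5, 2.5)
print(total, f"{runtime_reduction(total):.0%}")  # 6.25 84%
```

In other words, a run that took 24 hours would, on these averages, come in a little under four hours.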

Xilinx has also added a feature called “SmartPreview” that allows place-and-route to “pause” and “resume” – this allows you to view intermediate results without waiting for the whole run – a big time saver if you discover something that’s wrong early on instead of waiting for an overnighter to complete.  The SmartPreview allows you to create a bitstream to take into a part immediately for debug, preserve your latest results as a snapshot, abort the place-and-route process entirely, or move to the next run of a multi-pass place-and-route process. 

Finally, the new “SmartCompile” boasts a feature called “SmartGuide” that attempts to minimize the change between iterations, reducing the timing perturbation and runtimes for small design changes of the type usually encountered late in the design cycle.  SmartGuide is a pushbutton algorithm that compares the new and old versions of a design, uses the original design as a guide, and incrementally places the new or changed elements and critical elements.  It then identifies critical timing paths and incrementally routes new and critical paths to meet timing in order to reach a final implementation. Furthermore, you can manually identify partitions if you want to exercise more control, which is particularly useful for situations like team-based design where multiple engineers may be working on a single FPGA and be at different phases of their own implementation.
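To make the guide idea concrete, here is a toy sketch of the first half of that flow.  It is not Xilinx's algorithm: it assumes a design is just a mapping from element name to a definition string, and a placement is a mapping from element name to a site.  Unchanged, non-critical elements keep their old site, and everything else is queued for incremental placement.  Every name in it is hypothetical.

```python
def guided_placement(old_design, old_placement, new_design, critical=()):
    """Split a new design into elements that reuse the old placement
    and elements that must be placed again.

    An element is re-placed if it is new, if its definition changed,
    if it had no site in the old run, or if it is flagged as critical.
    """
    critical = set(critical)
    kept = {}
    to_place = []
    for name, definition in new_design.items():
        unchanged = old_design.get(name) == definition
        if unchanged and name in old_placement and name not in critical:
            kept[name] = old_placement[name]  # reuse the guide's site
        else:
            to_place.append(name)
    return kept, sorted(to_place)


# Hypothetical four-element design: "c" was edited, "d" is new,
# and "b" sits on a critical path.
old = {"a": "LUT4:0x8000", "b": "FF", "c": "LUT4:0x0001"}
sites = {"a": (0, 0), "b": (0, 1), "c": (1, 0)}
new = {"a": "LUT4:0x8000", "b": "FF", "c": "LUT4:0x0002", "d": "FF"}
kept, to_place = guided_placement(old, sites, new, critical={"b"})
```

Here only element "a" keeps its old site; the other three go back to the placer, which is the point: the smaller the change, the less work gets redone.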

Xilinx has thrown a few more convenience and productivity features into the release, including a Tcl console to allow scripting, the hooks necessary to integrate a variety of source code management systems into your design flow with ISE, and an expanded timing closure environment that brings together the various timing closure tools into one user interface.

The second major challenge tackled by the new ISE is power optimization.  Power in FPGA designs has only recently become a first-class concern.  Old FPGA users just took whatever power consumption they got, plugged in bigger power supplies and fans if needed, and took it all as an excuse for the occasional marshmallow roast over their development boards.  Today, however, many designers actually care how much power their FPGA design will burn.  FPGAs are becoming more central to the system, larger, and faster.  All of those factors push them higher on the list of components scrutinized for power mongering.

First on Xilinx’s list of chores was to improve the accuracy and timeliness of power estimation.  Many design projects have ended up far down the implementation trail only to discover that they were impossibly far over their power budget.  Early estimation is the only way to get confidence that you’re headed toward a workable solution from a power perspective.  Xilinx has included new power estimation spreadsheets in ISE that help you get a rough idea of the power picture early on.

Once you get into your design process, you want to cut power consumption as much as possible.  Xilinx has added new power optimization in both synthesis and place-and-route that they claim automatically reduces dynamic power by an average of 10%.  Power consumption is highly design- and stimulus-dependent, however, so don’t be surprised if you see a wide variation in your results.

9.1i is available immediately from Xilinx, and additional packages such as the ChipScope 9.1i version have already been announced.
