feature article
Subscribe Now

Physical Synthesis Flows for FPGA Designs


Most FPGA designs today rely on an HDL based description of their design. HDL synthesis is probably the single most important software flow step when it comes to defining the performance of a design.  Synthesis links the conceptual description of the logic functions needed for the design to their actual physical architecture elements in the underlying device.  This step cannot be underestimated.  Synthesis is performed prior to chip placement as an entirely separate step, hence these technology dependent optimizations are computed without knowledge of actual chip placement.  As a result, design performance can be far from optimal, impacted by choices made too early.  This is where physical synthesis comes into play, bringing physical information to the synthesis engine.

Traditional Flow versus Physical Synthesis Flow:

The most common design flows use synthesis and place & route as two consecutive disjointed steps.  Synthesis generates an EDIF netlist that is then passed on to the backend for implementation.  The netlist contains basic elements such as LUTs, flip-flops, etc., but does not control how these elements will be packaged together in the FPGA clusters (referred to as “slices” in Xilinx® FPGAs) during the packing phase.  Synthesis also has no control on placement and often does not have access to the entire design, if cores are used as black boxes.

With physical synthesis, it’s different. Physical synthesis yields a better result because it provides information about the actual critical paths, the ones that placement is actually seeing. This is a key feature as it closes the loop between synthesis and place & route.

Figure 1 compares the two flows. The traditional flow is shown on the left and the physical synthesis flow using Xilinx® ISE™ 9.1i is shown on the right.  All options in blue are explained in detail in the next section.


Fig 1: Traditional flow and ISE 9.1i physical synthesis flow

Another key advantage of physical synthesis is that it guarantees a better level of consistency for both the synthesis and implementation constraints. By having an integrated environment for synthesis, packing and placement, it guarantees that synthesis and place & route are working on the same problem.

An important silicon architecture consideration: The trend in modern FPGA silicon architecture is to offer more and more capable clusters (or slices).  This permits more possibilities for physical synthesis flows since the traditional ISE software flow places already pre-packed slices.  In effect, the traditional flow does not place LUTs and flip-flops, it actually places slices. The –timing option in ISE software enables placement at the most basic element level (non only LUTs and flip-flops but also logic fragments in the slice like dedicated arithmetic and multiplexer circuitry).

To respond to the challenge of physically aware synthesis, several approaches exist today with tools that enable synthesis optimizations aware of placement and capable of modifying technology mapping, altering clustering (packing) and enhancing placement, using information from the initial placement.

The following paragraphs provide an overview of solutions provided by Xilinx ISE 9.1i software and Synplicity® Synplify® Premier.

Physical Synthesis Optimizations in ISE Software:

ISE 9.1i software provides several physical synthesis options to improve results beyond the default compiles. These optimizations are applied on the same base netlist used in the traditional, non-physical flow.  This enables ISE software to use any incoming netlist without having to rely on a particular synthesis tool. Users can also use Xilinx synthesis tool (XST) as a design entry tool for this flow.

All options are part of the MAP step of the ISE implementation flow.  To enable the flow, the following options are used:

MAP command line option Description
-global_opt on|off Optimization routines that operate on the fully assembled netlist after initial packing. These optimizations include logic remapping and trimming, logic and register replication.  This option can optimize black-boxed portions of the design.
-logic_opt on|off Post-placement logic restructuring.
Operates on a placed netlist to optimize timing critical connections through restructuring and re-synthesis, followed by incremental placement and incremental timing analysis. Option is enabled in conjunction with “–timing.”
-register_duplication on|off The option is only available when running timing-driven packing and placement with the –timing option. The option duplicates registers to improve timing when running timing-driven packing (“–timing”).
-retiming on|off When this option is on, registers are moved through the logic to balance out the delays in a timing path to increase the overall clock frequency. By default, this option is off.  It requires global_opt “on” to operate.
(Note: this option is always active for Virtex-5 FPGAs)
Enables packing and placement interaction based on timing goals. When activated, placement is done during the MAP phase, therefore the –ol option should be used along with it.


Table 1: MAP Physical Synthesis properties description

Alternatively, all these options (shown in red in the picture on the right) are accessible from Project Navigator, the main ISE GUI via the “Process Properties” window. Property display level must be set to “Advanced.”

Figure  2: MAP properties for Physical Synthesis

The effectiveness of the options discussed above depends on a number of factors. These MAP options will have more opportunities to make improvements in the following situations:

  • Under-constraining in synthesis prevents it from generating the best optimizations. To avoid this situation, it is recommended to tightly constrain synthesis until the tool reports negative slack.
  • Inconsistent constraining between synthesis and implementation is a fairly common situation in which synthesis is not driven to optimize paths that are later constrained during implementation.  Physical synthesis can likely re-build the fast logic needed to meet timing. To remedy this situation in the traditional flow, carefully examine constraints between synthesis and implementation and make sure similar paths are covered in both.
  • In a bottom-up or partition flow, synthesis may not optimize between blocks or partitions.
  • Design reuse netlist used as “black-boxes” in synthesis may limit the amount of possible optimization.  Note that synthesis has the capability in the traditional flow to “read” netlists from black-boxes.  This helps the tools analyze paths going to and coming from the black boxes. But sometimes these black boxes are not added to the synthesis project and this is where physical synthesis options can have a great impact.
  • Designs with high LUT to flip-flop ratio (few registers for a lot of logic) are more likely to benefit from the retiming option.  Note that retiming (called register balancing in XST) is also available in synthesis and can be used as part of the traditional flow.

Even if care is taken during the synthesis step and constraints are consistent between synthesis and implementation, physical synthesis can improve performance.  Following are some of the optimizations used in the algorithms:

  • Logic Duplication: If a LUT or flip-flop drives multiple loads, and the placement of one or more of those loads is too far away from the source to meet timing requirements, the LUT or flip-flop can be replicated and placed close to that group of loads, thus reducing routing delays.
  • Logic Recombination: If the critical path traverses through multiple LUTs and through multiple slices, the logic can be reassembled utilizing fewer slices by using a more timing efficient combination of LUTs and MUXes to reduce the routing resources needed for that path.
  • Basic Element Switching: If a function is built with LUTs and MUXes within a slice, physical synthesis and optimization can rearrange the function to give the fastest path (usually through the MUX select pin) to the most critical signal as shown in Figure 3:
“sig” is timing critical, it crosses a LUT and a MUX…
“sig” has been repositioned and does not pass through a LUT anymore…

Figure  3: Basic Element Switching Example

  • Pin Swapping: Each input pin of a LUT may have a different delay.  MAP has the ability to swap pins (and change the LUT equation accordingly) so that the most critical signal is assigned to the fastest pin.  This is particularly effective with the Xilinx® Virtex™-5 FPGAs since its 6-input LUTs have distributed delays, with pins 1 through 6 being increasingly faster (pin 6 being the fastest). This pin swapping capability in MAP helps predict timing with more accuracy. It should be noted that during routing, pins can also be swapped.  In the traditional flow only the routing phase will operate pin swapping.

In conclusion, Xilinx ISE 9.1i software provides several options to enable physical optimizations in a one pass flow. Choosing the right one (or the right ones) can prove to be difficult.  To make it easier, Xilinx provides the Xplorer utility to run the design with these optimizations and to select the best one. The Xplorer utility is available at the command line and also from the GUI with Project Navigator.

Physical Synthesis with Synplify Premier:

Synplicity offers a physical synthesis tool known as Synplify Premier. The Synplify Premier product is a graph-based physical synthesis tool that enables single-pass physical synthesis. The essence of the graph-based approach is that pre-existing wires, switches and placement sites used for routing an FPGA are represented as a detailed routing resource graph. The notion of what is a “good routing choice” then changes from delay estimation only to a measure of actual delay and availability of interconnect wires. 

Synplify Premier merges optimization, packing, placement and routing to ensure available, fast routes along critical paths and generates a fully placed and physically optimized netlist as output ready for final routing in ISE software. The main benefits of this approach are the output from synthesis is routable and timing is known after synthesis because it correlates with the timing that the user will see after ISE routes the design. This approach reduces the number of synthesis runs (ISE backend iterations) involved in meeting timing goals.

Synplify Premier provides an encapsulated flow which enables the completion of a physical synthesis design without leaving the Synplify Premier graphical interface. After entering all the design files including black boxes and carefully setting up the constraints, Synplify Premier performs the steps necessary to deliver a physically optimized design:

  • The tool performs an initial synthesis (or compile) and runs the ISE software flow through placement to initialize its optimizations. See figure 4 below.
  • Synplify Premier will then read back the results to evaluate critical paths with much better accuracy compared to the traditional synthesis flow.
  • Based on this first placement, Synplify Premier keeps the I/O placement and performs a global full-chip placement.
  • Synplify Premier also performs detailed placement taking into account very specific routing characteristics and resources of the target FPGA. As explained earlier, Synplify Premier integrates the fact that proximity alone in placement does not always lead to optimal performance because routing timing delays are not always dependant on distance alone. To account for the timing differences for the various routing structures, Synplify Premier uses the graph of pre-existing wire availability when doing placement.
  • At the end of the process, Synplify Premier generates a netlist, a legal, routable placement plus a constraint file (.ncf) and then spawns the Xilinx Xflow command to finalize the design routing.  Xflow will check the packing, placement and will route the circuit based on the forwarded constraint file.

Figure 4: Synplify Premier Flow


Physical synthesis enables better results by bridging synthesis and place & route.  Xilinx provides the technology as part of ISE 9.1i using re-synthesis algorithms that can be applied to any incoming netlists.

Synplify Premier from Synplicity provides a different implementation of this technology using its own full chip placement. An initial placement will considerably improve timing predictions due to highly accurate correlation between what Synplify Premier uses and the final post-route timing results.  It ultimately provides a routing-aware placement to the ISE software that meets timing after ISE software routes the design.  

Frédéric Rivoallon is manager of Systems Methodology in the Design Software Division at Xilinx, he oversees the development of software methodologies, benchmarking, and design optimization techniques for new FPGA architectures.

Rivoallon joined Xilinx in 1996. Prior to joining Xilinx, he held FPGA and ASIC design engineering positions with Thomson Multimedia.

Rivoallon holds a masters degree in electrical engineering from the Institut des Science Appliquées, Reinnes, France.

9 thoughts on “Physical Synthesis Flows for FPGA Designs”

  1. Pingback: DMPK Studies
  2. Pingback: free slots
  3. Pingback: next

Leave a Reply

featured blogs
Jun 6, 2023
On June 1, Cadence president and CEO Anirudh Devgan rang the Nasdaq Stock Market opening bell in New York City to celebrate our 35th anniversary and our many accomplishments. Here are a few thoughts from KT Moore, vice president of Corporate Marketing, on this significant mil...
Jun 2, 2023
I just heard something that really gave me pause for thought -- the fact that everyone experiences two forms of death (given a choice, I'd rather not experience even one)....
Jun 2, 2023
Explore the importance of big data analytics in the semiconductor manufacturing process, as chip designers pull insights from throughout the silicon lifecycle. The post Demanding Chip Complexity and Manufacturing Requirements Call for Data Analytics appeared first on New Hor...

featured video

Shift-left with Power Emulation Using Real Workloads

Sponsored by Synopsys

Increasing software content and larger chips are demanding pre-silicon power for real-life workloads. Synopsys profile, analyze, and signoff emulation power steps to identify and analyze interesting stimulus from seconds of silicon runtime are discussed.

Learn more about Synopsys’ Energy-Efficient SoCs Solutions

featured paper

EC Solver Tech Brief

Sponsored by Cadence Design Systems

The Cadence® Celsius™ EC Solver supports electronics system designers in managing the most challenging thermal/electronic cooling problems quickly and accurately. By utilizing a powerful computational engine and meshing technology, designers can model and analyze the fluid flow and heat transfer of even the most complex electronic system and ensure the electronic cooling system is reliable.

Click to read more

featured chalk talk

Battery Management System Overview
Sponsored by Infineon
Effective battery management for electric vehicles is a critical design element faced by engineers today. In this episode of Chalk Talk, Amelia Dalton chats with Marco Castellanos from Infineon about the key functions of battery management for electric vehicles, the role that cell balancing, voltage measurement and temperature measurement play in battery management ICs, and how wireless battery management using bluetooth low energy can help you tackle a variety of battery management challenges for your next design.
Jun 28, 2022