
Duct Tape, FPGAs, and the Art of Making Great Multi-Purpose Tools

Most engineers will agree that duct tape is an excellent multi-purpose tool.  This wonder product has been used for everything from giving tennis balls the feel of a cricket ball, to saving the Apollo 13 mission from certain disaster.  Engineers love good multi-purpose tools because of the sheer versatility that they offer; a good multi-purpose tool can help a creative engineer get themselves out of a real bind.

To hardware designers, FPGAs are also excellent multi-purpose tools.  No other “off-the-shelf” semiconductor can become so many different things to different people.  The super-versatile logic cell architecture of the typical FPGA allows it to be used for everything from image enhancement on the Mars rover, to a life-saving patient heart monitor.  But FPGAs have been changing.

When FPGA technology first emerged, the concept was pretty simple: build an array of general-purpose logic cells that can be programmed to produce any possible logic configuration. This approach worked well for simple designs but struggled with more complicated ones. For example, many designs require large amounts of memory, and building memory arrays out of general-purpose cells alone is very inefficient. Designers with large memory requirements were forced to use off-chip memories, increasing BOM cost and PCB footprint.

Programmable logic device vendors responded to these changing customer requirements by introducing the special resource stripes that appear in most modern FPGA architectures. Putting a column of purpose-built RAMs among the logic columns made these programmable devices practical for a much larger set of designs, allowing memory needs to be met on-chip. Likewise, customers complained about poor Fmax performance when attempting to synthesize their multiplication-heavy DSP designs to an FPGA. Many architectures now include purpose-built multipliers or DSP blocks that, besides being area efficient, can operate at a much higher frequency than a corresponding circuit built with fabric logic. Given the rapid adoption of dedicated resource stripes in current device offerings, future devices are likely to contain even more dedicated resource IP.

One of the essential tools in any FPGA designer’s belt is the synthesis tool. On the whole, the heuristics used by synthesis tools do a great job of managing the trade-offs inherent in resource allocation. The tool can look for the mapping that achieves the best area savings, for example by mapping larger RAM constructs within a design to the available block RAMs and smaller RAM structures into fabric. Alternatively, timing can be given priority. In certain cases, logic on the critical path of a design is best implemented with the programmable logic cells; in other situations, only the use of the dedicated resources will allow the design to meet performance goals.
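As a rough illustration of the area-driven heuristic described above, here is a minimal Python sketch. It is not how any particular synthesis tool works: the `Memory` class, the `map_memories` function, the 1024-bit threshold, and the simplification that each memory occupies exactly one block RAM are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    """A memory inferred from the design (hypothetical model)."""
    name: str
    depth: int  # number of words
    width: int  # bits per word

    @property
    def bits(self) -> int:
        return self.depth * self.width


def map_memories(memories, block_rams_available, min_block_bits=1024):
    """Greedy area heuristic: the largest memories claim block RAMs first.

    Assumes, for simplicity, that every memory fits in one block RAM and
    that anything smaller than min_block_bits is cheaper in fabric.
    """
    mapping = {}
    remaining = block_rams_available
    for mem in sorted(memories, key=lambda m: m.bits, reverse=True):
        if mem.bits >= min_block_bits and remaining > 0:
            mapping[mem.name] = "block_ram"
            remaining -= 1
        else:
            mapping[mem.name] = "fabric"
    return mapping


memories = [
    Memory("frame_buf", 4096, 16),
    Memory("coeff_rom", 64, 8),
    Memory("fifo", 512, 32),
    Memory("line_buf", 1024, 8),
]
result = map_memories(memories, block_rams_available=2)
```

With only two block RAMs available, the two largest memories take them, and `line_buf` falls back to fabric even though it is above the size threshold. That is exactly the kind of scarcity decision a designer may want to review.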

But while synthesis tools can use heuristics to determine the best implementation for most designs, they will never have all of the knowledge the design engineer possesses. In certain cases the designer can obtain superior results by guiding the synthesis tool with a specific resource assignment, using these dedicated resources as multi-purpose tools. The problem has been that, until recently, these dedicated resource IP blocks have not been terribly good multi-purpose tools, although with a bit of creativity they have had great potential to become so. For instance, a RAM block can sometimes be used to implement a shift register, and a DSP block can implement a counter or even a multiplexer. While these might not be ideal uses for a dedicated resource, at least according to traditional optimization heuristics, they could be advantageous when you’re in a pinch. But the analysis and mapping control required to elicit good multi-purpose use of dedicated resources was lacking, until now.
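To see why a RAM block can stand in for a shift register, consider this small behavioral model in Python. It is only a sketch of the general idea (a memory plus one rotating address pointer, read before write), not a description of any vendor's implementation; the class name `RamShiftRegister` is made up for the example.

```python
class RamShiftRegister:
    """Fixed-depth delay line modeled as a RAM with a single address pointer."""

    def __init__(self, depth, fill=0):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.ram = [fill] * depth  # stands in for the RAM block contents
        self.addr = 0              # shared read/write address

    def shift(self, value):
        """Clock in one value and return the value written `depth` shifts ago."""
        oldest = self.ram[self.addr]      # read the old word first
        self.ram[self.addr] = value       # then overwrite it with the new one
        self.addr = (self.addr + 1) % len(self.ram)
        return oldest


sr = RamShiftRegister(depth=3)
outputs = [sr.shift(v) for v in [1, 2, 3, 4, 5, 6]]
```

Each input reappears at the output three shifts later, just as it would after passing through three chained registers, yet no data actually moves inside the memory. Only the address pointer advances.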

Recent advances in programmable logic synthesis technology make it possible for designers to actually use these dedicated resources as effective multi-purpose tools. Here’s how it works: when you bring your design into the synthesis environment, part of the compile process is a step in which the synthesis tool examines your design looking for arithmetic and datapath “operators” such as multipliers, counters, multiplexers, shift registers, and memories.

Before performing actual synthesis to the target technology, the designer can use a type of resource manager to examine what dedicated resources are available on the target device, and which operators are recognized by the synthesis tool as mappable to those dedicated resources.  For instance, when targeting a device with on-chip RAMs and DSP blocks, you would be given two views—one of all available RAM blocks on the chip along with all operators that could map to those RAM blocks, and another of all available DSP blocks on the chip along with all operators that could map to those DSP blocks.
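The per-resource views described above amount to grouping the recognized operators by the dedicated resource they could map to. The Python sketch below shows that grouping under stated assumptions: the `MAPPABLE` table, the operator names, and the `resource_views` function are hypothetical and do not reflect any tool's actual data model.

```python
from collections import defaultdict

# Hypothetical table of which operator kinds can map to which resource type.
MAPPABLE = {
    "multiplier": {"dsp"},
    "counter": {"dsp"},
    "multiplexer": {"dsp"},
    "memory": {"ram"},
    "shift_register": {"ram"},
}


def resource_views(operators):
    """Group operator names by the dedicated resource they could map to.

    operators is a list of (name, kind) pairs; kinds with no dedicated
    resource in MAPPABLE simply do not appear in any view.
    """
    views = defaultdict(list)
    for name, kind in operators:
        for resource in sorted(MAPPABLE.get(kind, ())):
            views[resource].append(name)
    return dict(views)


operators = [
    ("mult_a", "multiplier"),
    ("delay_line", "shift_register"),
    ("pkt_buf", "memory"),
    ("tick_cnt", "counter"),
    ("parity", "xor_tree"),  # no dedicated resource: stays in fabric
]
views = resource_views(operators)
```

Here the DSP view lists `mult_a` and `tick_cnt`, and the RAM view lists `delay_line` and `pkt_buf`.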

Next, the designer can browse the various operators listed for each resource type, cross-probing to either the HDL source code or an RTL schematic, to familiarize themselves with where each operator sits within the overall design, the size of the operation (e.g. an 8-bit or 18-bit multiplier), and the clock period constraining that operator. The designer can also see a predictive summary count of all the dedicated resources forecast to be used by the current resource assignment. So, for example, if the synthesis tool’s auto-assignment would result in 8 DSP blocks and 2 RAM blocks being used, the designer would know this up front, before actually performing a full synthesis of the design. Having advance knowledge of any potential resource scarcity alerts the designer that it may be worth reviewing all assignments to that scarce resource, to ensure that the most appropriate operators are allocated to the dedicated resources.

Once the designer has reviewed operator assignments, they can assign some or all of their operators to a particular dedicated resource and then let the synthesis tool perform a heuristic-based auto-assignment on the remaining operators during synthesis of the design to the target technology. Voilà! At long last you have easy control over turning once “use-it-or-lose-it” IP into great multi-purpose tools.
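The flow just described, designer-specified assignments first and heuristic auto-assignment for everything else, can be sketched in a few lines of Python. This is an illustrative model only: the `assign` function, the "widest operator first" heuristic, and the one-block-per-operator simplification are assumptions, not the behavior of Precision RTL Plus or any other tool.

```python
def assign(operators, capacity, pinned=None):
    """Honor the designer's pinned assignments, then auto-assign the rest.

    operators: {name: (candidate_resource, width_in_bits)}
    capacity:  {resource: number_of_blocks_on_the_device}
    pinned:    {name: resource or "fabric"} chosen by the designer
    Returns (assignment, used), where used is the predicted block count.
    """
    pinned = pinned or {}
    free = dict(capacity)
    assignment = {}

    # Explicit designer choices always win.
    for name, target in pinned.items():
        if target != "fabric":
            if free.get(target, 0) == 0:
                raise ValueError(f"no free {target} block for {name}")
            free[target] -= 1
        assignment[name] = target

    # Assumed heuristic for the remainder: widest operators first.
    rest = [(n, r, w) for n, (r, w) in operators.items() if n not in pinned]
    for name, resource, _width in sorted(rest, key=lambda t: t[2], reverse=True):
        if free.get(resource, 0) > 0:
            assignment[name] = resource
            free[resource] -= 1
        else:
            assignment[name] = "fabric"

    used = {r: capacity[r] - free[r] for r in capacity}
    return assignment, used


operators = {
    "mult_18": ("dsp", 18),
    "mult_8": ("dsp", 8),
    "tick_cnt": ("dsp", 24),
    "delay_line": ("ram", 16),
}
capacity = {"dsp": 2, "ram": 1}

auto, auto_used = assign(operators, capacity)
guided, guided_used = assign(operators, capacity, pinned={"mult_8": "dsp"})
```

Left to the heuristic, the small `mult_8` ends up in fabric. If the designer knows it sits on the critical path, pinning it to a DSP block pushes `mult_18` into fabric instead, while the predicted usage stays at two DSP blocks and one RAM block in both cases.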

Most synthesis tools do a great job of finding the optimal use of the available device resources in mapping a design.  While you might not need to use this sort of resource management every day, it is certainly a valuable tool to have in your arsenal.  The creative freedom given to hardware designers by this sort of capability is quite impressive.  Now if only someone could find a way to make an actual FPGA out of duct tape…

Patent-applied-for resource management technology in Precision RTL Plus identifies available architectural blocks and assists in re-mapping implementations for the best performance and device utilization.
