feature article
Subscribe Now

Duct Tape, FPGAs, and the Art of Making Great Multi-Purpose Tools

Most engineers will agree that duct tape is an excellent multi-purpose tool.  This wonder product has been used for everything from giving tennis balls the feel of a cricket ball, to saving the Apollo 13 mission from certain disaster.  Engineers love good multi-purpose tools because of the sheer versatility that they offer; a good multi-purpose tool can help a creative engineer get themselves out of a real bind.

To hardware designers, FPGAs are also excellent multi-purpose tools.  No other “off-the-shelf” semiconductor can become so many different things to different people.  The super-versatile logic cell architecture of the typical FPGA allows it to be used for everything from image enhancement on the Mars rover, to a life-saving patient heart monitor.  But FPGAs have been changing.

When FPGA technology first emerged, the concept was pretty simple: build an array of general-purpose logic cells that can be programmed to produce any possible logic configuration. This approach worked well for simple designs but was limited in its ability to handle more complicated designs. For example, many designs require large amounts of memory, and using only these general-purpose cells to create memory arrays is very inefficient. The designer was forced to use off-chip memories when using FPGAs in a design with large memory requirements, increasing BOM cost and PCB footprint.  Programmable logic device vendors responded to these changing customer requirements by introducing the special resource stripes that appear in most modern FPGA architectures.  Putting a column of purpose-built RAMs among the logic columns made these programmable devices practical for a much larger set of designs, allowing memory needs to be met on-chip.  Likewise, customers complained about poor Fmax performance when attempting to synthesize their multiplication-heavy DSP designs to an FPGA.  Many architectures now include purpose-built multipliers or DSP blocks that, besides being area efficient, can operate at a much faster frequency than a corresponding circuit built with fabric logic.  Given the rapid adoption of the dedicated resource stripes in current device offerings, future devices are sure to contain even more dedicated resource IP.

One of the essential tools in any FPGA designer’s belt is their FPGA synthesis tool.  On the whole, the heuristics used by synthesis tools do a great job of managing the trade-offs inherent in resource allocation. The tool can look for a mapping to achieve the best area savings.  For example, mapping larger RAM constructs within a design to the available block RAMs, and smaller RAM structures into fabric. Alternatively, timing can also be given consideration. In certain cases, logic on the critical path of a design is best implemented with the programmable logic cells; in other situations, only the use of the dedicated resources will allow the design to meet performance goals.

But while synthesis tools can use heuristics to determine the best implementation for most designs, they will never have all of the knowledge the design engineer possesses. In certain cases it could be possible for the designer to obtain superior results by guiding the synthesis tool with a specific resource assignment, using these dedicated resources as multi-purpose tools.  The problem has been that until recently, these dedicated resource IP blocks have not been terribly good multi-purpose tools.  They have had great potential to become so, given a bit of creativity.  For instance, a RAM block can sometimes be used to implement a shift-register, a DSP block can implement a counter or even a multiplexer.  While these might not be the most ideal uses for a dedicated resource, at least, according to traditional optimization heuristics, they could be advantageous when you’re in a pinch.  But the analysis and mapping control required to elicit good multi-purpose use of dedicated resources was lacking—until now.

Recent advances in programmable logic synthesis technology make it possible for designers to actually use these dedicated resources as effective multi-purpose tools.  Here’s how it works: when you bring your design into the synthesis environment, part of the compile process is a step whereby the synthesis tool examines your design looking for arithmetic and datapath “operators”, such as multipliers, counters, multiplexers, shift registers, memories, etc.

Before performing actual synthesis to the target technology, the designer can use a type of resource manager to examine what dedicated resources are available on the target device, and which operators are recognized by the synthesis tool as mappable to those dedicated resources.  For instance, when targeting a device with on-chip RAMs and DSP blocks, you would be given two views—one of all available RAM blocks on the chip along with all operators that could map to those RAM blocks, and another of all available DSP blocks on the chip along with all operators that could map to those DSP blocks.

Next, the designer can browse the various operators listed for each resource type, cross-probing to either the HDL source code or an RTL schematic, to familiarize themselves with where each operator is situated within the overall design, what is the size of operation (e.g. 8-bit or 18-bit multiplier) and what is the clock period constraining that operator.  The designer can also see a predictive summary count of all the dedicated resources forecast to be used by the current resource assignment.  So for example if the synthesis tool’s auto-assignment would result in 8 DSP blocks and 2 RAM blocks being used, the designer would have up-front knowledge of this before actually performing a full synthesis of the design.  Having advanced knowledge of any potential resource scarcity can alert the designer that they may find advantage in reviewing all assignments to that scarce resource, to ensure that the most appropriate operators are allocated to the dedicated resources.

Once the designer has reviewed operator assignments, they can make specific assignments of some or all of their operators to a particular dedicated resource, proceeding to have the synthesis tool perform a heuristic-based auto-assignment on the remaining operators during synthesis of the design to the target technology.  Voila!  At long last you have easy control over making once “use-it-or-lose-it” IP into great multi-purpose tools.

Most synthesis tools do a great job of finding the optimal use of the available device resources in mapping a design.  While you might not need to use this sort of resource management every day, it is certainly a valuable tool to have in your arsenal.  The creative freedom given to hardware designers by this sort of capability is quite impressive.  Now if only someone could find a way to make an actual FPGA out of duct tape…

20071030_mentor_fig1.jpg
Patent-applied-for resource management technology in Precision RTL Plus identifies available architectural blocks and assists in re-mapping implementations for the best performance and device utilization.
 

Leave a Reply

featured blogs
Apr 20, 2018
https://youtu.be/h5Hs6zxcALc Coming from RSA Conference, Moscone West, San Francisco (camera Jessamine McLellan) Monday: TSMC Technology Symposium Preview Tuesday: Linley: Training in the Datacenter, Inference at the Edge Wednesday: RSA Cryptographers' Panel Thursday: Qu...
Apr 19, 2018
COBO, Silicon Photonics Kevin Burt, Application Engineer at the Samtec Optical Group, walks us through a demonstration using silicon photonics and the COBO (Consortium of Onboard Optics) form factor.  This was displayed at OFC 2018 in San Diego, California. Samtec is a fable...