Duct Tape, FPGAs, and the Art of Making Great Multi-Purpose Tools

Most engineers will agree that duct tape is an excellent multi-purpose tool.  This wonder product has been used for everything from giving tennis balls the feel of a cricket ball, to saving the Apollo 13 mission from certain disaster.  Engineers love good multi-purpose tools because of the sheer versatility that they offer; a good multi-purpose tool can help a creative engineer get themselves out of a real bind.

To hardware designers, FPGAs are also excellent multi-purpose tools.  No other “off-the-shelf” semiconductor can become so many different things to different people.  The super-versatile logic cell architecture of the typical FPGA allows it to be used for everything from image enhancement on the Mars rover, to a life-saving patient heart monitor.  But FPGAs have been changing.

When FPGA technology first emerged, the concept was pretty simple: build an array of general-purpose logic cells that can be programmed to produce any possible logic configuration. This approach worked well for simple designs but was limited in its ability to handle more complicated designs. For example, many designs require large amounts of memory, and using only these general-purpose cells to create memory arrays is very inefficient. The designer was forced to use off-chip memories when using FPGAs in a design with large memory requirements, increasing BOM cost and PCB footprint.

Programmable logic device vendors responded to these changing customer requirements by introducing the special resource stripes that appear in most modern FPGA architectures.  Putting a column of purpose-built RAMs among the logic columns made these programmable devices practical for a much larger set of designs, allowing memory needs to be met on-chip.  Likewise, customers complained about poor Fmax performance when attempting to synthesize their multiplication-heavy DSP designs to an FPGA.  Many architectures now include purpose-built multipliers or DSP blocks that, besides being area efficient, can operate at a much higher frequency than a corresponding circuit built with fabric logic.  Given the rapid adoption of the dedicated resource stripes in current device offerings, future devices are sure to contain even more dedicated resource IP.

One of the essential tools in any FPGA designer’s belt is the FPGA synthesis tool.  On the whole, the heuristics used by synthesis tools do a great job of managing the trade-offs inherent in resource allocation. The tool can look for a mapping that achieves the best area savings, for example by mapping the larger RAM constructs within a design to the available block RAMs and the smaller RAM structures into fabric. Alternatively, timing can be given priority. In certain cases, logic on the critical path of a design is best implemented with the programmable logic cells; in other situations, only the use of the dedicated resources will allow the design to meet performance goals.
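The area-driven trade-off described above can be sketched in a few lines of Python. This is only an illustration, not the heuristic any particular synthesis tool uses: the `Memory` class, the `map_memories` function, and the 1024-bit threshold are all hypothetical, and the sketch assumes each memory fits in a single block RAM.

```python
from dataclasses import dataclass

# Hypothetical cutoff: memories at or above this size are candidates for block RAM.
BLOCK_RAM_THRESHOLD_BITS = 1024


@dataclass
class Memory:
    name: str
    depth: int  # number of words
    width: int  # bits per word

    @property
    def bits(self) -> int:
        return self.depth * self.width


def map_memories(memories, block_rams_available):
    """Give block RAMs to the largest memories first; everything else goes to fabric.

    Simplifying assumption: one memory consumes exactly one block RAM.
    """
    assignment = {}
    for mem in sorted(memories, key=lambda m: m.bits, reverse=True):
        if mem.bits >= BLOCK_RAM_THRESHOLD_BITS and block_rams_available > 0:
            assignment[mem.name] = "block_ram"
            block_rams_available -= 1
        else:
            assignment[mem.name] = "fabric"
    return assignment


memories = [
    Memory("frame_buffer", depth=4096, width=16),
    Memory("coeff_table", depth=256, width=18),
    Memory("tiny_fifo", depth=16, width=8),
]
result = map_memories(memories, block_rams_available=2)
print(result)
# {'frame_buffer': 'block_ram', 'coeff_table': 'block_ram', 'tiny_fifo': 'fabric'}
```

With only one block RAM available, the same call would push `coeff_table` into fabric, which is the kind of scarcity-driven decision the rest of this article is concerned with.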

But while synthesis tools can use heuristics to determine the best implementation for most designs, they will never have all of the knowledge the design engineer possesses. In certain cases the designer may be able to obtain superior results by guiding the synthesis tool with a specific resource assignment, using these dedicated resources as multi-purpose tools.  The problem has been that, until recently, these dedicated resource IP blocks have not been terribly good multi-purpose tools.  They have had great potential to become so, given a bit of creativity.  For instance, a RAM block can sometimes be used to implement a shift register, and a DSP block can implement a counter or even a multiplexer.  While these might not be ideal uses for a dedicated resource, at least according to traditional optimization heuristics, they could be advantageous when you’re in a pinch.  But the analysis and mapping control required to elicit good multi-purpose use of dedicated resources was lacking, until now.
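To see why a RAM block can stand in for a shift register, consider the following behavioral model in Python. It is a sketch of the general idea rather than of any vendor's implementation, and the class name `RamShiftRegister` is made up for this example. Instead of moving every element on each clock, a single address counter walks around the memory: the oldest word is read out, then overwritten with the newest.

```python
class RamShiftRegister:
    """Behavioral model of a fixed-length shift register built on a RAM."""

    def __init__(self, length: int, fill: int = 0):
        if length < 1:
            raise ValueError("length must be at least 1")
        self.ram = [fill] * length  # stands in for the RAM block contents
        self.addr = 0               # shared read/write address counter

    def clock(self, data_in: int) -> int:
        data_out = self.ram[self.addr]  # word written `length` clocks ago
        self.ram[self.addr] = data_in   # overwrite it with the new sample
        self.addr = (self.addr + 1) % len(self.ram)
        return data_out


sr = RamShiftRegister(length=3)
outputs = [sr.clock(x) for x in [10, 20, 30, 40, 50]]
print(outputs)  # [0, 0, 0, 10, 20]
```

Each input reappears at the output three clocks later, exactly as it would from a three-stage shift register, yet only one memory location changes per clock.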

Recent advances in programmable logic synthesis technology make it possible for designers to actually use these dedicated resources as effective multi-purpose tools.  Here’s how it works: when you bring your design into the synthesis environment, part of the compile process is a step in which the synthesis tool examines your design looking for arithmetic and datapath “operators” such as multipliers, counters, multiplexers, shift registers, and memories.

Before performing actual synthesis to the target technology, the designer can use a type of resource manager to examine what dedicated resources are available on the target device, and which operators are recognized by the synthesis tool as mappable to those dedicated resources.  For instance, when targeting a device with on-chip RAMs and DSP blocks, you would be given two views—one of all available RAM blocks on the chip along with all operators that could map to those RAM blocks, and another of all available DSP blocks on the chip along with all operators that could map to those DSP blocks.
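The two views can be pictured as a simple grouping of recognized operators by the resource type they could map to. The Python below is a rough sketch of that idea only; the `MAPPABLE` table, the instance names, and the `build_views` function are invented for illustration and do not reflect any tool's actual data model.

```python
from collections import defaultdict

# Hypothetical table of which operator kinds each dedicated resource can implement.
MAPPABLE = {
    "RAM": {"memory", "shift_register"},
    "DSP": {"multiplier", "counter", "multiplexer"},
}

# Hypothetical operators recognized in a design: (instance name, operator kind).
operators = [
    ("u_fir/mult0", "multiplier"),
    ("u_fir/delay_line", "shift_register"),
    ("u_ctrl/tick_cnt", "counter"),
    ("u_buf/line_mem", "memory"),
]


def build_views(operators, mappable):
    """Return one list of candidate operators per dedicated resource type."""
    views = defaultdict(list)
    for name, kind in operators:
        for resource, kinds in mappable.items():
            if kind in kinds:
                views[resource].append(name)
    return dict(views)


views = build_views(operators, MAPPABLE)
for resource in sorted(views):
    print(resource, views[resource])
# DSP ['u_fir/mult0', 'u_ctrl/tick_cnt']
# RAM ['u_fir/delay_line', 'u_buf/line_mem']
```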

Next, the designer can browse the various operators listed for each resource type, cross-probing to either the HDL source code or an RTL schematic, to see where each operator sits within the overall design, the size of the operation (e.g. an 8-bit or 18-bit multiplier), and the clock period constraining that operator.  The designer can also see a predictive summary count of all the dedicated resources forecast to be used by the current resource assignment.  For example, if the synthesis tool’s auto-assignment would result in 8 DSP blocks and 2 RAM blocks being used, the designer would know this up front, before actually performing a full synthesis of the design.  Advance knowledge of any potential resource scarcity alerts the designer that it may be worth reviewing all assignments to that scarce resource, to ensure that the most appropriate operators are allocated to the dedicated resources.
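A predictive summary of this kind amounts to counting the proposed assignments per resource type and comparing them with what the device offers. The following Python sketch shows that bookkeeping under a simplifying assumption that each operator consumes exactly one block; the `forecast` function, the operator names, and the device capacities are hypothetical.

```python
from collections import Counter

# Hypothetical device capacity and proposed operator-to-resource assignment.
available = {"DSP": 2, "RAM": 4}
proposed = {
    "mult_a": "DSP",
    "mult_b": "DSP",
    "tick_counter": "DSP",
    "line_buffer": "RAM",
    "small_lut": "fabric",  # fabric is not a counted dedicated resource
}


def forecast(proposed, available):
    """Summarize forecast usage per dedicated resource and flag any shortfall."""
    used = Counter(res for res in proposed.values() if res in available)
    return {
        res: {
            "used": used[res],
            "available": cap,
            "shortfall": max(0, used[res] - cap),
        }
        for res, cap in available.items()
    }


summary = forecast(proposed, available)
for res, row in summary.items():
    print(res, row)
# DSP {'used': 3, 'available': 2, 'shortfall': 1}
# RAM {'used': 1, 'available': 4, 'shortfall': 0}
```

A non-zero shortfall is the cue to revisit which operators really deserve the scarce blocks.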

Once the designer has reviewed the operator assignments, they can assign some or all of their operators to a particular dedicated resource, then let the synthesis tool perform a heuristic-based auto-assignment of the remaining operators during synthesis of the design to the target technology.  Voilà!  At long last you have easy control over turning once “use-it-or-lose-it” IP into great multi-purpose tools.
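The combination of designer-specified assignments and automatic assignment of the remainder can be sketched as a two-pass allocation. This Python example is illustrative only: the `assign` function, its "widest operator first" rule, and the one-block-per-operator assumption are stand-ins chosen for the sketch, not the heuristic of any real synthesis tool.

```python
def assign(operators, pinned, capacity):
    """Honor the designer's pinned choices, then auto-assign what is left.

    operators: list of (name, candidate_resource, bit_width)
    pinned:    dict of name -> resource (or "fabric") chosen by the designer
    capacity:  dict of resource -> number of blocks on the device
    """
    remaining = dict(capacity)
    result = {}

    # Pass 1: the designer's explicit assignments always win.
    for name, target in pinned.items():
        if target != "fabric":
            if remaining.get(target, 0) <= 0:
                raise ValueError(f"no {target} blocks left for {name}")
            remaining[target] -= 1
        result[name] = target

    # Pass 2: stand-in heuristic, widest operators get dedicated blocks first.
    for name, resource, _width in sorted(operators, key=lambda op: op[2], reverse=True):
        if name in result:
            continue
        if remaining.get(resource, 0) > 0:
            result[name] = resource
            remaining[resource] -= 1
        else:
            result[name] = "fabric"
    return result


operators = [
    ("mult_wide", "DSP", 18),
    ("tick_counter", "DSP", 12),
    ("mult_narrow", "DSP", 8),
]
capacity = {"DSP": 2}

auto_only = assign(operators, pinned={}, capacity=capacity)
guided = assign(operators, pinned={"mult_narrow": "DSP"}, capacity=capacity)
print(auto_only)  # {'mult_wide': 'DSP', 'tick_counter': 'DSP', 'mult_narrow': 'fabric'}
print(guided)     # {'mult_narrow': 'DSP', 'mult_wide': 'DSP', 'tick_counter': 'fabric'}
```

In the guided run the designer knows something the heuristic does not, for instance that the narrow multiplier sits on the critical path, so it claims a DSP block and the counter falls back to fabric.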

Most synthesis tools do a great job of finding the optimal use of the available device resources in mapping a design.  While you might not need to use this sort of resource management every day, it is certainly a valuable tool to have in your arsenal.  The creative freedom given to hardware designers by this sort of capability is quite impressive.  Now if only someone could find a way to make an actual FPGA out of duct tape…

Patent-applied-for resource management technology in Precision RTL Plus identifies available architectural blocks and assists in re-mapping implementations for the best performance and device utilization.
