Digital signal processing has traditionally been the domain of DSP processors and ASICs. Since the late 1990s, FPGAs have emerged as alternative options for DSP designers. FPGAs are a good fit for applications that demand higher performance than what DSP processors can offer, yet do not meet the criteria to justify ASIC economics.
FPGAs Make Their Mark on Signal Processing
FPGAs have evolved from those where DSP structures were built using logic-only cells to those having dedicated embedded DSP structures, such as dynamically reconfigurable XtremeDSP slices in Xilinx Virtex-4 FPGAs. Such FPGAs incorporate tremendous parallel processing capability. For example, new Xilinx Virtex-4 FPGAs support as many as 512 MAC engines, each capable of providing up to 500MHz throughput. Figure 1 shows how FPGAs for signal processing have evolved over the years.
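To put the figures above in perspective, the short calculation below works out the theoretical peak multiply-accumulate rate they imply. It is a back-of-the-envelope sketch only: it assumes every MAC engine completes one operation per clock cycle and that all engines are busy on every cycle, which a real design will rarely achieve. The function name is illustrative and not part of any vendor tool.

```python
# Peak multiply-accumulate (MAC) throughput implied by the figures in the text.
# Assumption: each engine completes one MAC per clock cycle and all engines
# run in parallel on every cycle. This is an idealized upper bound.

def peak_macs_per_second(num_engines: int, clock_hz: float) -> float:
    """Return the theoretical peak MAC rate for fully parallel engines."""
    return num_engines * clock_hz


peak = peak_macs_per_second(512, 500e6)  # 512 engines at 500 MHz
print(f"Peak throughput: {peak / 1e9:.0f} GMAC/s")
```

Under these assumptions the product is 256 billion MACs per second; sustained throughput depends on how well the algorithm keeps the engines fed.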
With such horsepower available in today’s FPGAs, the appeal of using them for demanding algorithms goes beyond traditional DSP programmers. System engineers, hardware engineers, and algorithm developers all crave the performance benefits that FPGAs provide, yet have different design needs to be addressed by design tools. In addition, for some tasks it may be more economical and productive to develop solutions that combine both FPGAs and DSP processors to leverage legacy C algorithms for processors and higher performance through parallelism, as offered in FPGAs. Before analyzing how design tools for FPGAs have evolved and will evolve, it is worth understanding the audience for which such tools are intended.
Figure 1. The Evolution of FPGAs for Signal Processing
Pivotal Roles in Today’s Design Teams
For large system development targeted for FPGAs, there will be at least four pivotal roles: System Architect, Algorithm Developer, Software Developer and Hardware Designer. Each position plays a special role on the team, and while their tasks may overlap, they all have different requirements.
System Architect – The system architect or system designer is involved in a product from its genesis, analyzing or developing the requirements and determining to what degree the project will be based on legacy design or new architecture. Some of the most critical design decisions regarding system performance, size and form factor are:
–Hardware/software tradeoffs – which parts of the design must be implemented in embedded code and which parts are best implemented in the FPGA fabric?
–Processor selection – On-chip choices include general-purpose cores like the Xilinx PowerPC, and off-chip choices include general-purpose processors or digital signal processors such as those from TI.
–Hardware blocks – What are the major blocks of the design? Which blocks should be developed in-house, and which should be acquired as IP?
System architects require tools that allow them to work at a high level and also facilitate the integration of the different portions of the design developed by other members of the team. As a result, textual languages such as MATLAB and C have traditionally been popular choices for system architects. More recently, the graphically-oriented Simulink product has emerged as a popular choice.
Algorithm Developer – The algorithm developer is responsible for developing the mathematical foundation for the design. This is the person who typically understands the issues of the algorithms and determines how to deliver sufficient fidelity to meet specifications without creating inefficiencies. The algorithm developer must be able to quickly evaluate different forms of the algorithms and explore the solution space to see which alternative is most able to meet system requirements like throughput and latency. In complex multi-rate designs, the algorithm developer will typically design each portion of the algorithm and work with the system architect to integrate them together on the FPGA.
The algorithm developer will normally choose the tool that matches his or her preferred style and provides an array of mathematical building blocks and visualization tools. The tool must provide floating-point and fixed-point mathematics and should provide a mechanism for converting from floating-point to fixed-point to meet the developers’ needs. For concisely describing DSP algorithms, domain-specific languages like the MATLAB language provide the best productivity.
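The floating-point to fixed-point conversion mentioned above can be illustrated with a minimal sketch. It is not how any particular tool performs the conversion; it simply quantizes a value to a signed two's-complement word with saturation. The 10-bit word with 8 fractional bits is a hypothetical format, and the function names are illustrative.

```python
# Minimal float-to-fixed-point quantization sketch.
# Assumptions: signed two's-complement words, saturation on overflow, and
# Python's built-in round() (which rounds exact ties to the nearest even value).

def to_fixed(value: float, frac_bits: int, word_bits: int) -> int:
    """Quantize a float to a signed fixed-point integer, saturating on overflow."""
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (word_bits - 1))
    highest = (1 << (word_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def to_float(fixed: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to the real value it represents."""
    return fixed / (1 << frac_bits)


FRAC_BITS, WORD_BITS = 8, 10  # hypothetical word format for illustration

for coeff in [0.25, -0.5, 0.8125, 0.1]:
    q = to_fixed(coeff, FRAC_BITS, WORD_BITS)
    error = abs(coeff - to_float(q, FRAC_BITS))
    print(f"{coeff:+.4f} -> {q:5d}  (quantization error {error:.6f})")
```

Values such as 0.25 map exactly, while 0.1 picks up a small quantization error. Trading word length against this kind of error is the fidelity-versus-efficiency decision the algorithm developer has to make.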
Software Developer – The software developer or DSP programmer is the team member most concerned with issues of coding DSPs and general-purpose processors. They most commonly use C and may also use assembly language to exploit additional performance. The software developer is well-versed in how to efficiently code in C to avoid bottlenecks (such as unnecessary memory accesses) and knows how to make effective use of the processor’s internal registers. The programmer’s job is most heavily impacted by the choice of processor and memory architecture, and these choices generally drive the selection of the tools he uses.
Hardware Engineer – Today the role of the FPGA designer has expanded to include qualifying externally acquired IP, verifying the IP within the rest of the circuit, evaluating reuse of existing blocks and developing new blocks based on requirements. Often the requirements come in the form of simulations run in tools like MATLAB and Simulink. However, these requirements must be put into RTL to ultimately verify them for bit-correctness and timing accuracy. The hardware designer has the knowledge to take the idealized concepts of the algorithm developer and system architect and translate them into high-performance hardware. The hardware engineer will naturally want a complete RTL design and verification environment.
A number of design methodologies have emerged over the years to allow designers to more efficiently and rapidly target FPGAs for their signal processing tasks. We have chosen to group these by language type.
HDL to FPGA tools – FPGA implementation tools such as Xilinx ISE and Quicklogic’s QuickWorks were originally developed to help designers effectively develop logic circuits, since that was the best fit for FPGAs. Intellectual property in the form of IP cores has enabled FPGAs to be used for networking and signal processing tasks, yet traditional FPGA implementation tools have evolved only to the point where they allow effective “stitching together” of such cores at a relatively low level of abstraction. As the main purpose of such design tools was to enable hardware engineers to achieve optimal speed/area tradeoffs and implementations, these tools did not typically offer powerful system modeling capabilities for signal processing system-level design.
MATLAB to FPGA tools – MATLAB has been the preferred language for algorithm development for many DSP engineers. While it served as a nice language for modeling, designers were not able to target FPGA hardware until a few years ago, with the advent of products from AccelChip. The AccelChip DSP Synthesis tool enables developers to synthesize RTL from MATLAB source, and is integrated with the AccelWare IP toolkits of DSP cores.
AccelChip’s tools provide broad language support for MATLAB, including its strengths in vector-, array- and matrix-based operations. AccelChip also helps automate conversion from floating-point to fixed-point and provides tools for optimizing QoR. AccelChip’s support is limited to MATLAB entry, however, so it can’t synthesize hardware from other tools, including Simulink, and AccelChip does not generate C as an output for use with general-purpose processors or DSPs.
Simulink to FPGA tools – Simulink was created as a system-level design tool that enabled designers to exploit visual data flows. Simulink encourages a modular design style, with emphasis on easy design creation and powerful data analysis features. Its capability for allowing designers to create testbenches and visualize data flows helped increase its popularity rapidly.
With program extensions such as Real-Time Workshop, Simulink also offers a path to generating C code for TI DSP processors. More recently, designers can also automatically generate HDL for FPGAs through third-party tools such as Xilinx System Generator for DSP.
C to FPGA tools – Two of the leading tools that enable designers to target FPGAs using C are Mentor Graphics® Catapult™ C and Celoxica’s C to FPGA product. Celoxica introduced the notion of hardware and software development using a common methodology and language – its proprietary Handel-C is a super-set of ANSI-C and includes extensions for exploiting FPGA architectural features such as parallelism, bit manipulation, clocks and RAM/ROM.
Mentor’s Catapult C Synthesis product uses pure, untimed C++ to automatically generate optimized RTL descriptions. It allows hardware designers to fully and interactively explore the micro-architecture and interface design space and reduce RTL implementation time.
While certainly appealing to the C-knowledgeable software developer, C-based design tools for FPGAs have had their limitations for signal processing tasks. C-based tools typically include limited support for many common DSP algorithms and data visualization and impose tool-specific coding styles for synthesis. In addition, just because a software programmer has C at his fingertips does not necessarily make him a hardware engineer that knows how to fully exploit the FPGA architecture. FPGA constructs such as look up tables, shift-register logic, and pipelining all need to be understood for implementing designs that deliver high performance economically. Cost has also been a limiting factor to the wide acceptance of C-based tools for FPGAs.
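As an illustration of why hardware structure matters, the sketch below models a direct-form FIR filter the way it is commonly laid out in hardware: a shift register holds the most recent samples, and every tap is multiplied and summed on each clock. This is a behavioral model written for this discussion, not output from any of the tools named above, and the function name is hypothetical.

```python
from collections import deque

# Behavioral model of a direct-form FIR filter built around a shift register.
# Assumption: one input sample arrives per clock cycle. In hardware the
# per-tap products would be formed in parallel; here they are summed in a loop.

def fir_delay_line(samples, coeffs):
    """Filter `samples` with `coeffs` using a tapped delay line."""
    taps = deque([0] * len(coeffs), maxlen=len(coeffs))
    outputs = []
    for x in samples:
        taps.appendleft(x)  # shift: newest sample enters, oldest drops off
        outputs.append(sum(c * t for c, t in zip(coeffs, taps)))
    return outputs


# An impulse input reproduces the coefficients at the output.
print(fir_delay_line([1, 0, 0, 0, 0], [1, 2, 3]))
```

A sequential C loop computes the same result one tap at a time. The hardware view exposes the delay line and the parallel multipliers, which is where an FPGA implementation gets its throughput.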
Moving Towards Plug-and-Play Design Methods
With such diverse needs in today’s typical design team, perhaps the ideal tool is one that brings together multiple capabilities and abstractions, and in doing so essentially lets the designer work in the language of the problem. Such a tool would need to allow algorithmic modeling, provide efficient implementation, enable simple debug and verification and demonstrably show productivity enhancements for the team.
To date there have not been any original attempts at developing such integrated tool suites. Perhaps the nearest attempt was made by Cadence with its Signal Processing Worksystem tool, now licensed by CoWare. SPW was traditionally targeted at ASIC designers, and priced well above the level of FPGA tools. Its open architecture allowed simulation and integration of MATLAB, C/C++, SystemC, Verilog and VHDL, with capability for mixed-language simulations when combined with Cadence’s NC-Sim tool. Yet SPW lacked some key requirements for FPGA design, such as compiler technology, FPGA IP integration with FPGA flows and, until recently, support for the Windows platform.
More recently the FPGA industry has experienced a trend towards plug-and-play design methodologies. These allow designers to purchase only the tools that are needed for the problem at hand, and allow design teams to use their preferred methodology based on the team’s available skill set, without paying for unnecessary capabilities.
Xilinx’s System Generator for DSP tool was the first tool to bridge the gap between the world of the system engineer and the hardware designer. By seamlessly integrating with Simulink, System Generator leveraged a powerful visual data flow environment ideally suited for modeling digital algorithm signal flow graphs, and allowed the designer to generate bit- and cycle-accurate hardware implementation from the system model automatically. The success of this design flow was quickly followed by other tools that also interfaced to Simulink and targeted FPGAs, such as SynplifyDSP from Synplicity.
The System Generator/Simulink design flow provides an “easy on-ramp” for system architects to develop FPGA-based signal processing systems. Although System Generator has evolved considerably to enable powerful co-simulation (hardware and HDL simulation) and debug interfaces, it and other Simulink-based tools still fall short in serving some of the needs of the algorithm developers described earlier – in particular, the need to handle matrix operations and vector-based processing that are handled better in languages like MATLAB. Examples of such algorithms include linear algebra, which involves matrix inverse and factorization operations, and complex number operations such as calculating magnitude/angle and normalizing complex numbers.
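The complex-number operations mentioned above are compact to state in a textual language. The sketch below uses Python's standard `cmath` module as a stand-in for that style of description; it is not MATLAB and not tied to any tool discussed here, and the helper names are illustrative.

```python
import cmath

# Magnitude/angle extraction and normalization of complex samples.
# Assumption: floating-point arithmetic; a hardware implementation would
# typically use fixed-point and an iterative method instead.

def magnitude_and_angle(z: complex):
    """Return (magnitude, angle in radians) of a complex number."""
    return cmath.polar(z)


def normalize(z: complex) -> complex:
    """Scale a non-zero complex number to unit magnitude."""
    mag = abs(z)
    if mag == 0.0:
        raise ValueError("cannot normalize zero")
    return z / mag


for sample in [3 + 4j, -1 + 1j, 0 - 2j]:
    mag, angle = magnitude_and_angle(sample)
    unit = normalize(sample)
    print(f"{sample}: magnitude={mag:.4f}, angle={angle:.4f} rad, unit={unit:.4f}")
```

Each operation is a single line in a textual language, whereas a block-diagram description must assemble it from multipliers, adders and a square-root or rotation block.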
The need to serve the different members of the design team has encouraged the development of another interface, this time between the AccelChip tool and Xilinx System Generator for DSP (see Figure 2). This interface now allows the hardware designer, system engineer, and algorithm developer to work together within one framework. It allows the algorithm developer to begin with a MATLAB M-file; use the AccelChip tools to optimize the design; and export the design to System Generator, where the system engineer can integrate the design. From System Generator the complete design can be verified and synthesized to FPGAs. This integration lets algorithm developers keep their MATLAB M-files as their golden source, and opens up the ability of using MATLAB and AccelChip to generate new IP blocks.
Figure 2. System Generator/AccelChip Interface
Although RTL verification is a key step in the design flow, it is universally true that designers want to see algorithms running in hardware. System Generator’s hardware-in-the-loop co-simulation interfaces make this a push-button flow, and allow you to bring the full power of MATLAB and Simulink analysis functions to hardware verification.
The next logical step for tools integration seems to be the incorporation of C-based programming tools into existing frameworks such as Simulink/AccelChip/System Generator. In doing so, different members of today’s design team will truly be able to work in the language of the problem, yet design within the constraints of a unified design framework. Such a framework would enable seamless modeling between MATLAB, Simulink and C, and also allow the generation of efficient RTL through MATLAB, Simulink, HDL or C.