
Just What is Algorithmic Synthesis?

In a traditional FPGA design flow, crafting the hardware architecture and writing VHDL or Verilog for RTL synthesis requires considerable effort. The code must follow a synthesis standard, meet timing, implement the interface specification, and function correctly. Given enough time, a design team can meet all of these constraints. However, time is one thing that is always in short supply. Deadlines imposed by time-to-market pressures often force designers to compromise, settling for ‘good enough’ by re-using blocks and IP that are over-designed for their application.

In the past few years, tools and methodologies that support algorithmic synthesis have risen to help designers build and verify hardware more efficiently, giving them better control over optimization of their design architecture. The starting point of this flow is a subset of pure C++ that includes a bit-accurate class library. The code is analyzed, architecturally constrained, and scheduled to create synthesizable HDL.
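As a rough illustration of what such a bit-accurate class library provides, the hand-rolled sketch below (not the API of any actual library) makes arithmetic wrap at an arbitrary bit width, just as the eventual hardware register would:

```cpp
#include <cassert>
#include <cstdint>

// Hand-rolled sketch of a bit-accurate unsigned integer (not a real
// library API): arithmetic wraps at an arbitrary width W, exactly as the
// synthesized hardware register of that width would.
template <unsigned W>
struct UIntW {
    static_assert(W >= 1 && W <= 32, "sketch supports 1 to 32 bits");
    uint32_t v;
    explicit UIntW(uint32_t x = 0) : v(x & mask()) {}
    static constexpr uint32_t mask() {
        return (W == 32) ? 0xFFFFFFFFu : ((1u << W) - 1u);
    }
    UIntW operator+(UIntW o) const { return UIntW(v + o.v); }  // wraps at W bits
    UIntW operator*(UIntW o) const { return UIntW(v * o.v); }  // wraps at W bits
};
```

With such a class, an 8-bit addition of 200 and 100 yields 44 (300 mod 256), matching the hardware, where native C++ `int` arithmetic would silently keep the full 32-bit result.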

For algorithm-intensive designs, this approach to creating verified RTL is an order of magnitude faster than manual methods. So if a designer currently builds and verifies 1,000 gates per day, the same designer using algorithmic synthesis can build and verify 10,000 gates per day.

Beyond Behavioral Synthesis

If all this sounds familiar, that is because behavioral synthesis—introduced with significant fanfare a few years ago—promised the same productivity gains. Reality soon caught up with the hype, however, as designers discovered that behavioral synthesis tools were significantly limited in what they actually did. Essentially, the tools incorporated a source language that required some timing as well as design hierarchy and interface information. As a result, designers had to be intimately familiar with the capabilities of the synthesis tool to know how much and what kind of information to put into the source language. Too much information limited the synthesis tool and resulted in poor quality designs. Too little information led to a design that didn’t work as expected. Either way, designers did not obtain the productivity and flexibility they were hoping to gain.

Removing timing and parallelism from the source language is what separates first-generation (behavioral) high-level synthesis from second-generation (algorithmic) high-level synthesis. Algorithmic synthesis tools decouple complex IO timing from the functionality of the source. This allows the functionality and design timing to be developed and verified independently.

There are a growing number of algorithmic synthesis tools on the market today, making it difficult to sort through their competing claims. One of the primary differentiators is the language the tool requires the designer to use to describe the algorithms. Some rely on languages that include hardware constructs, such as Handel-C, SystemC and SystemVerilog. Unfortunately, these languages are difficult to write, and they are closer to the RTL abstraction level than higher-level languages used to describe system behavior, such as C++.

C-Based Algorithmic Synthesis Basics

The most productive algorithmic synthesis tools are based on pure ANSI C/C++, making it possible to develop functional hardware in the form of C algorithms and synthesize process-specific, optimized RTL code. ANSI C++ is one of the most widely used algorithmic modeling languages in the world, and it incorporates all the elements needed to model algorithms concisely, clearly and efficiently. A non-proprietary class library can then be used to model bit-accurate behavior. And the many software design and debug tools available for C++ can now be re-used for hardware design.
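For example, a small FIR filter written in this style is just sequential ANSI C++, with no clocks, ports or concurrency. The filter and its coefficients below are illustrative, not drawn from any particular design:

```cpp
#include <cassert>
#include <cstddef>

// A 4-tap FIR filter as plain sequential ANSI C++: no clocks, no ports,
// no concurrency. This untimed function is the kind of source an
// algorithmic synthesis tool accepts. Coefficients are illustrative only.
const int TAPS = 4;
const int coeff[TAPS] = {1, 2, 2, 1};

// n is the index of the newest sample; the last TAPS samples must exist.
int fir(const int sample[], std::size_t n) {
    int acc = 0;
    for (int t = 0; t < TAPS; ++t)
        acc += coeff[t] * sample[n - t];  // multiply-accumulate over the taps
    return acc;
}
```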

With algorithmic synthesis based on pure ANSI C++, the source code doesn’t embed constraints such as clock cycles, concurrency, modules and ports, which would result in a rigid description – verbose and bound to a specific technology. Instead the user can apply synthesis directives to specify the target ASIC or FPGA technology, describe the interface properties, control the amount of parallelism in the design, trade-off area for speed, and more.

Synthesis constraints for the architecture can be applied based on the design analysis. These constraints can be broken into hierarchy, interface, memory, loop and low-level timing constraints.

• Hierarchy constraints allow the sequential design to be separated into a set of hierarchical blocks and define how those blocks run in parallel.
• The interface constraints define the transaction-level communication, pin-level timing and flow control in the design.
• Memory constraints allow the selection of different memory architectures both within blocks and in the communication between blocks.
• Loop constraints are used to add parallelism to each block in the design, including pipelining.
• Low-level timing constraints are available if needed.
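To make the effect of a loop constraint concrete, the hypothetical example below shows by hand what an “unroll by two” directive asks the tool to do. Both functions compute the same dot product, but the second exposes two independent multiply-accumulates per iteration that can become parallel hardware:

```cpp
#include <cassert>

// Hand-unrolled illustration of what an "unroll by two" loop constraint
// asks the scheduler to do. Both functions compute the same dot product.
int dot_serial(const int a[], const int b[], int n) {
    int acc = 0;
    for (int i = 0; i < n; ++i)
        acc += a[i] * b[i];  // one multiply-accumulate per iteration
    return acc;
}

// Assumes n is even (a tool would otherwise add cleanup iterations).
int dot_unrolled2(const int a[], const int b[], int n) {
    int acc0 = 0, acc1 = 0;
    for (int i = 0; i < n; i += 2) {
        acc0 += a[i] * b[i];          // these two multiply-accumulates are
        acc1 += a[i + 1] * b[i + 1];  // independent, so they can run in parallel
    }
    return acc0 + acc1;
}
```

In an actual flow the designer never writes the unrolled version; the constraint tells the tool to produce that structure from the serial source.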

Once the design is constrained, it can be scheduled. The designer can then quickly apply the amount of parallelism necessary to meet the performance requirements, allowing the same sequential C++ source to be used to create anything from highly compact (serial) designs to very fast (parallel) designs.

At the core of every high-level synthesis tool is a scheduler. Once all the architectural constraints are selected, the scheduler applies the constraints to create a fully timed design. The scheduler is responsible for meeting all the timing constraints, including the clock period. One of the biggest conceptual changes between RTL synthesis and algorithmic synthesis is that the design is not written to run at a specific clock speed; rather, the high-level synthesis tool builds a design based on the clock-speed constraint. Many tools claim to be high-level synthesis tools, but without a scheduler they are merely translators, and much less powerful than high-level synthesis tools.
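To give a feel for what a scheduler does, the toy sketch below implements ASAP (as-soon-as-possible) scheduling, one of the simplest scheduling algorithms: each operation lands in the earliest cycle after all of its predecessors. A production scheduler also honors resource limits and the clock-period constraint, which this sketch ignores:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy ASAP (as-soon-as-possible) scheduler: each operation is placed in
// the earliest clock cycle after all of its predecessors. Real schedulers
// also honor resource limits and the clock-period constraint.
struct Op {
    std::vector<int> preds;  // indices of operations this op depends on
};

// ops must be listed in topological order (predecessors before users).
std::vector<int> asap_schedule(const std::vector<Op>& ops) {
    std::vector<int> cycle(ops.size(), 0);
    for (std::size_t i = 0; i < ops.size(); ++i)
        for (int p : ops[i].preds)
            cycle[i] = std::max(cycle[i], cycle[p] + 1);
    return cycle;
}
```

For a dataflow graph where two independent operations feed a third, ASAP places the first two in cycle 0 and the dependent operation in cycle 1.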

Once the scheduler has added time to the design, the RTL output can be generated. The RTL generation involves extracting the datapath, control and FSM for the design. Now the design needs to be verified.
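A miniature, hand-written example can suggest what the extracted control and datapath look like: below, a small FSM (the control) sequences a three-step accumulation carried out by the datapath, with one call to `tick()` standing in for one clock cycle. The design itself is invented for illustration:

```cpp
#include <cassert>

// Invented miniature design showing the control/datapath split that RTL
// generation produces: an FSM (the control) sequences a three-step
// accumulation carried out by the datapath (the arithmetic on acc).
struct Mac3 {
    enum State { LOAD, ACCUM, DONE };
    State st = LOAD;
    int acc = 0;
    int idx = 0;
    // One call = one clock cycle; returns true once acc holds the result.
    bool tick(const int x[3]) {
        switch (st) {
            case LOAD:  acc = x[0]; idx = 1; st = ACCUM; break;  // cycle 1
            case ACCUM: acc += x[idx];                           // cycles 2..3
                        if (++idx == 3) st = DONE;
                        break;
            case DONE:  break;  // hold the result
        }
        return st == DONE;
    }
};
```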

An algorithmic synthesis tool knows the full details of the timing and structure added to a design. This means the original untimed C++ testbench can be re-used for every architecture the tool creates from a C++ design. Once the design is verified, it can be run through a standard RTL synthesis flow.
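In spirit, the reuse looks like this hypothetical sketch: one untimed C++ testbench checks any generated architecture against the golden model, so the same stimulus verifies every scheduled variant:

```cpp
#include <cassert>

// Hypothetical sketch of untimed-testbench reuse: the same stimulus checks
// any generated architecture against the golden untimed model.
int golden_sum3(int a, int b, int c) { return a + b + c; }

// Stands in for one generated architecture (say, a two-cycle serial adder).
int arch_sum3(int a, int b, int c) { int t = a + b; return t + c; }

// The untimed testbench: exhaustive over a small stimulus range.
bool run_testbench(int (*dut)(int, int, int)) {
    for (int a = -2; a <= 2; ++a)
        for (int b = -2; b <= 2; ++b)
            for (int c = -2; c <= 2; ++c)
                if (dut(a, b, c) != golden_sum3(a, b, c))
                    return false;
    return true;
}
```

In a real flow the second function would be the generated RTL running in simulation; the testbench itself never changes.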

Achieving Optimal, Robust Designs

By reducing the effort needed to generate code, algorithmic synthesis gives designers more time for architectural exploration. They can efficiently evaluate alternative implementations, modifying and re-verifying C++ to perform a series of “what-if” evaluations of alternative algorithms. Users need only change the constraints within an algorithmic synthesis environment to optimize a design for size, performance, or other variables. After exploring a range of possible scenarios in a relatively short time, designers can quickly determine an optimal implementation within a reasonable schedule.

Equally important, the quality of the source code is greatly enhanced. Since the lower level code is automatically generated, there are fewer bugs introduced into the design—up to 60% fewer. Automatic synthesis eliminates errors that invariably crop up during manual RTL generation, which is key to improving the overall design cycle.

Higher-level synthesis tools are even capable of automatic interface creation. Advanced algorithmic synthesis methodologies can directly support interface synthesis because the source code is sequential. Intelligent algorithmic synthesis tools can create all of the parallelism, concurrency and structure in a hardware design, and generate interface protocols that closely match the dataflow needs of the design. The result is a more efficient design with fewer errors. In addition, the resulting interfaces are synthesized to meet, but not dramatically exceed, the performance requirements of the chip, saving valuable silicon real estate.

To enhance the verification effort, the same high-level description is used to automatically create a consistent verification environment from C to RTL, including high-speed transaction-level models, ensuring that the intent specified by the system engineer is preserved. This replaces the slow, manual process of creating models and incrementally adding structure, concurrency and parallelism to develop transaction-level, behavioral and RTL models. In that process, called manual progressive refinement, misinterpretations and syntax errors are introduced as the various models are hand-coded, creating a verification and model-maintenance nightmare.

Algorithmic synthesis solves the fundamental problem of creating transaction-level models for new signal processing hardware. Algorithmic synthesis tools automatically generate transaction-level models from a pure ANSI C++ description, adding structure, parallelism and concurrency to create models at various levels of abstraction. These SystemC and SystemVerilog models, with their hierarchy and parallelism, provide design teams with powerful options for system-level verification. Although transaction-level modeling challenges remain for CPUs, third-party IP and embedded memory, algorithmic synthesis frees the designer from manually creating these models for signal processing hardware, resolving one part of the transaction-level model creation problem.


Algorithmic synthesis is the first approach that truly supports practical hardware synthesis from sequential languages to timed RTL. Starting with C++, the same language can now be used for software, hardware and system modeling. By allowing more optimization options, hardware designers consistently achieve better results than hand-coded designs in days rather than months for many datapath intensive designs.

Intelligent algorithmic synthesis tools can create all of the parallelism, concurrency and structure in a hardware design, and can even generate interface protocols that closely match the dataflow needs of the design. The result is a more efficient design with fewer errors, helping design teams produce higher quality designs on the most aggressive development schedules.

About the Author: Bryan Bowyer is a technical marketing engineer in Mentor Graphics’ High-Level Synthesis Division. He joined Mentor as a developer in 1999 and has been responsible for advances in interface synthesis technology and design analysis. Bowyer has a B.S. in Computer Engineering from Oregon State University.
