
Catapult C

Mentor Announces Architectural Synthesis

Electronic design automation has its own secret little cold fusion: an innovation that everyone quietly hopes is possible but publicly disavows, a development that would make life beautiful, dogs and cats live happily together, and money grow on trees. This missing link is “behavioral synthesis,” the direct compilation of untimed algorithmic descriptions into practical hardware architectures. Once this is possible, digital hardware designers, the micro-architectural mavens who create much of the magic in today’s ASIC and FPGA designs, will no longer be necessary. All of their relevant expertise, tricks, and techniques will be encapsulated in a powerful software application that will dash out optimized datapaths on Monday morning without a sip of caffeine, and crank out perfectly designed hardware all week without asking for a raise or a better 401(k) package. Any competent software engineer will be able to fish a function out of their latest C++ application and recompile it for hardware implementation with a 1000X performance boost.

While this prospect may sound both thrilling and terrifying to experts in VHDL- and Verilog-based hardware design, there has never been reason for serious concern. The technical challenges posed by this problem have created in EDA something akin to Fermat’s Last Theorem, with a similar number of false announcements of success. There have been, in fact, so many premature, false, and exaggerated claims that the term “behavioral synthesis” has become too maligned for marketers to touch. New products that use some behavioral synthesis technology are simply described as “automatically creating RTL” or “from algorithm to architecture”.

The standard for success in behavioral synthesis has remained the same for the past fifteen years. A practical system will be able to automatically generate hardware, from an algorithmic description, that rivals the quality of hand-written RTL created by a competent designer. Much like the problem posed by Fermat, the answer is not a single, simple, elegant solution, but rather a complex compilation of work in many areas attempting to emulate the best creative tactics of leading hardware architects.

The Catapult C system announced by Mentor Graphics this week may move us one more step toward that goal. According to Mentor, Catapult C creates “optimized ASIC/FPGA hardware from untimed C++.” The two keywords here that could represent significant progress are “optimized” and “untimed”. Most approaches to C and C++ hardware generation to date have relied on pseudo-timed input, using specialized libraries that adapt C and C++ to hardware design by adding scheduling constraints and other hardware-specific information to the source description. Mentor’s approach, by working from completely untimed algorithmic descriptions, gives the compiler maximum flexibility in creating a hardware architecture that is optimized for the design goals of the project. It also means that C or C++ targeted at hardware looks more like the generic code that a software developer would normally write.
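To make “untimed” concrete, here is a minimal sketch of the kind of plain C++ that an algorithmic synthesis flow is meant to accept. The 4-tap FIR filter, the name `fir_step`, and the constant `kTaps` are illustrative assumptions; they are not taken from Mentor’s documentation, and a real tool may impose its own coding guidelines.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical example: a 4-tap FIR filter written as ordinary, untimed C++.
// Nothing here mentions clocks, cycles, or registers; how many multipliers
// to build and how to schedule them is left entirely to the tool.
constexpr std::size_t kTaps = 4;

std::int32_t fir_step(std::array<std::int16_t, kTaps>& delay_line,
                      const std::array<std::int16_t, kTaps>& coeffs,
                      std::int16_t sample) {
    // Shift the delay line by one position and insert the new sample.
    for (std::size_t i = kTaps - 1; i > 0; --i) {
        delay_line[i] = delay_line[i - 1];
    }
    delay_line[0] = sample;

    // Multiply-accumulate over all taps.
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < kTaps; ++i) {
        acc += static_cast<std::int32_t>(delay_line[i]) * coeffs[i];
    }
    return acc;
}
```

The same function compiles and runs as normal software, which is the point: the source carries the algorithm, while timing and resource decisions are made later.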

According to early adopters of Catapult C, the product is capable of creating results that rival and sometimes beat hand-coded RTL. “Our ability to achieve a 31 percent reduction in gate count, which correlates closely to silicon real estate and power consumption, speaks for itself,” said Peter Nord, Project Leader EDA and Methodology Coordination, Ericsson Mobile Platforms.

“We were impressed by the results. The fact that we could synthesize our untimed, system-level C/C++ source code with minimal modification played an important role in the success of this project. It provided a precise path from our system-level models all the way to RTL, which allowed us to meet our required design goals in significantly less time,” said Rudolf Krumenacker, vice president, system-on-chip design, Siemens ICN.

While quality of results is always important in a synthesis tool, algorithmic or behavioral synthesis compounds the problem. A bad result from an RTL synthesis tool might be 20-50% from optimum, but in behavioral synthesis it is not uncommon to get results that miss the mark by 10-100X. The impact on design size and performance of the architectural-level decisions made during behavioral synthesis is far greater than the narrow range of possibility offered by register-level optimization methods. For this reason, it is important to be able to control results to meet the area or performance needs of a particular application. Catapult C provides the designer with a convenient interface that shows the impact of architectural decisions on both performance and chip area and plots various solutions on a graph for easy comparison.
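A toy cost model shows why architectural choices can swing results by such large factors. The sketch below assumes a loop of independent multiply-accumulate operations, one cycle per operation, and an unroll factor that directly sets the number of multipliers. The names `DesignPoint`, `estimate`, and `sweep` are hypothetical, and this is not the cost model Catapult C uses.

```cpp
#include <cassert>
#include <vector>

// Hypothetical design point: one candidate architecture for a loop of
// `operations` multiply-accumulates.
struct DesignPoint {
    int multipliers;  // hardware multipliers instantiated (area proxy)
    int cycles;       // cycles to finish all operations (latency proxy)
};

// Toy model: with `unroll` multipliers working in parallel, the loop takes
// ceil(operations / unroll) cycles. Real tools also account for memory
// ports, adder trees, control logic, and clock period.
DesignPoint estimate(int operations, int unroll) {
    assert(operations > 0 && unroll > 0);
    return DesignPoint{unroll, (operations + unroll - 1) / unroll};
}

// Enumerate power-of-two unroll factors up to full parallelism, the kind of
// area-versus-latency sweep a designer would compare on a plot.
std::vector<DesignPoint> sweep(int operations) {
    std::vector<DesignPoint> points;
    for (int unroll = 1; unroll < operations; unroll *= 2) {
        points.push_back(estimate(operations, unroll));
    }
    points.push_back(estimate(operations, operations));  // fully parallel
    return points;
}
```

For a 64-operation loop, `sweep(64)` returns seven points ranging from 1 multiplier and 64 cycles to 64 multipliers and 1 cycle. Even in this simplified model, the extremes differ by 64X in both area and latency, which is why the choice has to be steered by the designer’s goals.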

One of the key problems in algorithmic synthesis is creation of the interface to the behavioral module. The I/O interface to the module imposes constraints that limit the architectural options available to the tool. For this reason, many academic attempts at behavioral-level design that seemed promising fell short in practical use. While a tool could often produce near-optimal hardware architectures for a particular algorithm, the process broke down when real-world I/O constraints were applied at the boundaries. Mentor has attacked this problem with a patent-pending interface synthesis technology that creates a wrapper around the algorithmic code, bridging the gap between the algorithm and any external hardware. On the outside, the wrapper manages the interface with many popular standards such as the AMBA bus, and on the inside, the wrapper constrains the Catapult C synthesis system to generate hardware architectures that are optimized for the limitations and timing of the chosen interface. Interface synthesis also allows the designer to switch from one external interface to another and generate optimized hardware for each without modifying the algorithmic C source code.
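The idea of keeping the algorithm fixed while the interface changes has a loose software analogy, sketched below. This is only an analogy under stated assumptions: the names `accumulate_block`, `StreamSource`, and `MemorySource` are hypothetical, and nothing here reflects how Mentor’s interface synthesis is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Algorithm core (hypothetical): sums `count` samples. It only knows that
// `Source` offers a read() method, not how the data physically arrives.
template <typename Source>
std::int32_t accumulate_block(Source& source, std::size_t count) {
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += source.read();
    }
    return sum;
}

// Stand-in for a streaming (FIFO-style) interface: samples arrive in order.
class StreamSource {
public:
    void push(std::int16_t v) { fifo_.push(v); }
    std::int16_t read() {
        assert(!fifo_.empty());
        std::int16_t v = fifo_.front();
        fifo_.pop();
        return v;
    }
private:
    std::queue<std::int16_t> fifo_;
};

// Stand-in for a memory-mapped interface: samples are fetched by address.
class MemorySource {
public:
    explicit MemorySource(std::vector<std::int16_t> mem) : mem_(std::move(mem)) {}
    std::int16_t read() {
        assert(next_ < mem_.size());
        return mem_[next_++];
    }
private:
    std::vector<std::int16_t> mem_;
    std::size_t next_ = 0;
};
```

Calling `accumulate_block` with either source gives the same sum for the same samples, and the algorithm’s source text never changes. In hardware, the chosen interface also limits which architectures are worth building, which is the part the wrapper is described as handling.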

Catapult C uses a library builder tool to collect characterization data for the target silicon implementation fabric and the chosen RTL synthesis tool. This allows Catapult C to create RTL code that is highly optimized for the particular targets, reducing the need for end-of-cycle timing debug.

While Catapult C may represent a significant step forward in algorithmic compilation technology, hardware engineers can still rest easy in their jobs. Catapult C presents an interface that is easily accessible to the hardware architect but is probably still somewhat confusing to the software developer. Concepts such as loop pipelining, dataflow dependencies, parallelism, latency, and throughput may seem foreign to most sequential-thinking C programmers, but RTL designers will find themselves at home. For them, adopting the tool is analogous to switching from an axe to a chainsaw for RTL development. Given the interactive feedback, automation, and control facilities of Catapult C, it is easy to see that the company’s claims of significant productivity boosts for RTL designers are well founded.
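For readers less familiar with loop pipelining, the standard textbook relationship between latency and throughput can be sketched in a few lines. The function names and the simple cycle model (a fixed iteration depth and a fixed initiation interval) are assumptions for illustration, not output from Catapult C.

```cpp
#include <cassert>

// Cycles for a loop executed one iteration at a time: each of the
// `iterations` passes takes the full `depth` cycles before the next starts.
long sequential_cycles(long iterations, long depth) {
    assert(iterations > 0 && depth > 0);
    return iterations * depth;
}

// Cycles for the same loop when pipelined: a new iteration starts every
// `interval` cycles (the initiation interval), and the last one still needs
// `depth` cycles to drain.
long pipelined_cycles(long iterations, long depth, long interval) {
    assert(iterations > 0 && depth > 0 && interval > 0);
    return depth + (iterations - 1) * interval;
}
```

With 100 iterations of 4 cycles each, the sequential loop takes 400 cycles, while the pipelined version with an initiation interval of 1 takes 103. The latency of a single iteration is unchanged; only the throughput improves, which is the distinction a sequential-thinking programmer tends to miss.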

The road to a proof of Fermat’s Last Theorem was long. Rather than one breakthrough development, it was like the gradual construction of a bridge where the completion of each span brought the ends closer together. So it is likely to be with behavioral synthesis. Catapult C represents one more significant span in the bridge that will someday link the design automation process seamlessly from algorithm to hardware and facilitate levels of productivity never before seen in hardware design.
