Not For Software Engineers

HLS Keeps Hardware Engineers in Business

Wolfgang froze when he saw it. He had heard rumors that such things might be possible, that his job could one day be at risk, but he had more or less blown it off. Too many other things to worry about. And then… there it was. An automatic translation facility on the web.

He made his living painstakingly translating documents to and from German. He had always been convinced that computers could never do what humans did. But now he wasn’t so sure. Computers had completely trashed other careers that used to require experts: graphic art, music, animation… anyone with a few bucks could get a program and do a mediocre job. And lots of people were satisfied with mediocre because it’s cheap. And so the skilled people found themselves competing with fresh-out-of-college minimum-wage newbies by the dozens. It wasn’t a career anymore, it was just mass labor. Pushing a mouse. And now it looked like foreign language translation was going in the same direction.

Dejected, he decided to test it out a bit. At a loss for something meaningful to translate, he grabbed that familiar pan-literal English sentence, “The quick brown fox jumped over the lazy dog.” He used the automatic translator to render it into German: “Der schnelle braune Fuchs ist über dem faulen Hund gesprungen.” Hmmm… maybe not exactly how he would have put it but also not as horrible as he was hoping for. Then he had another idea: he translated that German translation back to English: “The fast brown fox is go bad jumped over that dog.” And he started to relax, even treating himself to a tentative self-satisfied chuckle.

Feeling like he was on a roll, he took the German, translated that to French, then Arabic and European Spanish, in turn, before returning to English. A global game of telephone. And things got curiouser and curiouser:

“Le renard brun rapide a saut? au-dessous du chien paresseux.”

He decided to be a nice guy and help out a bit, changing it manually to:

“Le renard brun rapide a sauté au-dessous du chien paresseux.”

From which he then proceeded, without any manual intervention, to:

“?? ????? ???? ?????? ??? ??? ?? ??? ??? ???????.”

“El programa acelerado que recorrer aceleradamente a U sus.”

“The rapid program that to travel through rapidly to OR its.”

He closed his browser. His job was safe. Maybe this would be useful for saving some time, but he was far from being replaced.*

The dream of high-level synthesis (HLS) has always been that a software engineer could write some code, push a button, and have it automatically converted into a hardware implementation that worked. No layout, no DRC checks, no FPGA constraints, no RTL. OK, all those things would be there, but they would be buried safely under a layer of reliable tools that took care of all the grotty bits. And not only would the code just work, it would be rendered into an efficient implementation, not using ridiculous amounts of hardware, taking advantage of any available parallelism so that execution would be faster.

Of course, if such a tool ever comes around, it may never be purchased, since typically that buying decision would come as a result of the hardware team validating that the results of the tool were worthy of eliminating their jobs. That’s not likely to happen. Even so, as the HLS market heats up, hardware engineers appear to have as much work ahead of them as ever.

This year’s DAC featured a day-long tutorial on HLS. If there was one glaring conclusion to be drawn, it was that this is not a market to be entered lightly; you’d better have teams of grad students working on algorithms, a long history of acquiring benchmark designs, and a fat wallet, because there’s a lot of sweat going into efforts to eke the most megahertz out of the fewest square microns of silicon, and there will be gallons of blood on the benchmark floors. But the other clear conclusion, if we can allow ourselves two, was succinctly articulated by ST’s Nitin Chawla: “HLS is not for software engineers.”

Despite years of effort (HLS is the shiny new moniker for what used to be called C-to-RTL and frees the technology from the demons of past marketing attempts), not only must C code be written with hardware considerations in mind to yield reasonable synthesis results, but even the target technology must be considered, and then hints and scripts and sweet whispers must be provided to the tools to cajole them into giving up their coveted best secrets.

The result is a world that would appear completely foreign to a software engineer. And the case is made abundantly clear by Mentor’s recent improvements to Catapult C, wherein control logic can now be synthesized. Say that to a software engineer and you’ll get some serious head-scratching. How could you not synthesize control logic??

To a software engineer, control logic would presumably be those control flow bits in a program: if/then/else blocks, while and for loops, switch statements. Everything else is just blocks of sequential code. And you can’t synthesize only the linear stuff; that would make no sense.

To a hardware engineer, things look different. Control flow elements can be synthesized, but with greater or lesser success. It’s well known that code with too many decision forks in it can’t be parallelized well if those decisions have to be made in a particular order. There are too many dependencies, and progress stalls while waiting for all those approvals. Likewise, loops may or may not be parallelizable. It depends on whether each loop iteration is independent of the others and on whether the loop termination can be determined at compile time. So the control flow elements in and of themselves aren’t what keep the hardware engineers awake at night.
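
To make the loop distinction concrete, here is a minimal C++ sketch. The function names and the fixed size N are illustrative assumptions, not taken from any tool's documentation. In the first loop, each iteration touches only its own element and the trip count is a compile-time constant, so a synthesis tool would in principle be free to unroll it into N side-by-side multipliers. In the second, each iteration needs the result of the one before it, so the work forms a chain no matter how much hardware is available.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 4;  // fixed trip count, known at compile time (assumed)

// Independent iterations: out[i] depends only on in[i].
// Nothing forces these multiplies to happen one after another.
std::array<int, N> scale(const std::array<int, N>& in, int gain) {
    std::array<int, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = in[i] * gain;
    }
    return out;
}

// Loop-carried dependency: acc from iteration i feeds iteration i + 1.
// The multiply-adds have to be evaluated in order.
int horner(const std::array<int, N>& coeff, int x) {
    int acc = 0;
    for (std::size_t i = 0; i < N; ++i) {
        acc = acc * x + coeff[i];
    }
    return acc;
}
```

Both functions are ordinary C++ and look equally innocent to a software engineer; the difference only shows up when someone asks how many of those iterations can run in the same clock cycle.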

What hardware engineers do is take a system that may be defined partly or wholly in C (very few, if any, systems are written completely in C) and split out the algorithmic parts, stitching them together with hardware. It’s your standard data/control bifurcation. Synthesize the datapath and design the control to make it all work.

Mentor actually describes three different kinds of control, two of which are implicit in the code. The first is what they call “intra-block control.” This is the low-level detail needed to make an algorithmic data block run. The datapath might use a pipeline, or there may be shared resources that have to be handled correctly within the block. Logic has to be built to guard against resource deadlock or race conditions or to initialize the pipeline or any other elements whose start-up semantics aren’t explicit.

The second is what they call “multi-block dataflow control.” The example they give here is a ping-pong memory manager, and this is a classic hardware-oriented element. How a memory is managed will not be part of the software algorithm; that’s left for the operating system and the underlying compute platform. A hardware engineer, however, might improve throughput by splitting a memory in two and alternately accessing one and then the other – ping-ponging back and forth. It takes control logic to do that.
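
As a rough illustration of the ping-pong idea, the following C++ sketch models two memory banks whose roles swap. This is a hypothetical model written for this article's example, not Mentor's implementation and not how Catapult C expresses it; the class and method names are made up.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical ping-pong memory: while the producer fills one bank,
// the consumer drains the other. swap() is the "control" part.
template <typename T, std::size_t Depth>
class PingPong {
public:
    // Producer side: write into the bank currently being filled.
    void write(std::size_t i, const T& value) {
        assert(i < Depth);
        banks_[wr_][i] = value;
    }

    // Consumer side: read from the bank that was filled last phase.
    const T& read(std::size_t i) const {
        assert(i < Depth);
        return banks_[1 - wr_][i];
    }

    // Exchange the roles of the two banks once both sides are done.
    void swap() { wr_ = 1 - wr_; }

private:
    std::array<std::array<T, Depth>, 2> banks_{};
    std::size_t wr_ = 0;  // index of the bank being written
};
```

In software this is just an index flipping between 0 and 1. In hardware, the decision about when swap() may fire is exactly the kind of logic someone has to design, or a tool has to infer.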

Mentor’s Catapult C has been able to infer these first two kinds of control logic from C code for a few years now. But the third kind has the ominous-sounding name “reactive inter-block control.” It’s not really clear from the name just what this is, but they give arbiters, cache controllers, memory controllers, bus interfaces, and dispatchers as examples.

There are a couple of ways of characterizing these functions. One of the ways Mentor gives is the fact that, even though the functionality of each of these blocks can be written in C, the entire thing needs to execute at once, not sequentially. Stated another way, algorithmic code tends to be “blocking” – that is, things have to wait around for other results to complete – while these control blocks are “non-blocking” – statements don’t wait around for other statements before proceeding. Another way of looking at it is that, while algorithmic blocks are timed by the availability of data, these control blocks are timed by a clock – a hardware entity if there ever was one. Looked at from a micro-architecture level, data blocks might use a pipeline to get things done, whereas these control blocks need to be state machines, with all state transitions computed on each clock cycle. More hardware centricity.
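
The blocking versus non-blocking distinction can be sketched in plain C++ with a toy pair of registers. The struct and function names here are assumptions made up for illustration. The first function behaves the way sequential code always does; the second mimics a clocked block by computing every next value from the state as it stood before the clock edge, then updating everything together.

```cpp
// Hypothetical two-register block, used only to illustrate update semantics.
struct Regs {
    int a;
    int b;
};

// Sequential ("blocking") semantics: the second statement sees the value
// the first one just wrote, so both registers end up holding the old b.
Regs step_blocking(Regs r) {
    r.a = r.b;
    r.b = r.a;
    return r;
}

// Clocked ("non-blocking") semantics: each next value is computed from the
// current state only, and the whole state is replaced in one step.
Regs step_clocked(const Regs& cur) {
    Regs next{};
    next.a = cur.b;
    next.b = cur.a;
    return next;  // models all registers updating on the same clock edge
}
```

Starting from a = 1, b = 2, the blocking version leaves both registers at 2, while the clocked version swaps them. Same two assignments, different hardware.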

Many of these characteristics apply to all types of control blocks. What appears to differentiate the third type is that some kind of policy is involved. You might be able to infer that an arbiter is needed, but you can’t infer the arbiter policy. Is it priority-based? Round-robin? That has to be described somewhere, and if your design is written in C or C++, it’s nice to include the policies in the same language. But they can’t be synthesized like algorithms – they have to be synthesized like control logic.
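
To see why the policy can't be inferred, consider two arbiters that answer the same question (who gets the grant?) differently. The C++ sketch below is an assumption-laden illustration: the requester count, the sentinel value, and the function names are invented here, and it says nothing about how Catapult C describes arbiters.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kRequesters = 4;         // illustrative size
constexpr std::size_t kNoGrant = kRequesters;  // sentinel: nobody is requesting

using Requests = std::array<bool, kRequesters>;

// Fixed priority: the lowest-numbered active requester always wins.
std::size_t grant_priority(const Requests& req) {
    for (std::size_t i = 0; i < kRequesters; ++i) {
        if (req[i]) return i;
    }
    return kNoGrant;
}

// Round-robin: start searching just after the previous winner, so every
// active requester eventually gets a turn.
std::size_t grant_round_robin(const Requests& req, std::size_t last) {
    assert(last < kRequesters);
    for (std::size_t k = 1; k <= kRequesters; ++k) {
        const std::size_t i = (last + k) % kRequesters;
        if (req[i]) return i;
    }
    return kNoGrant;
}
```

With requesters 0 and 2 both active, the priority arbiter grants 0 every time, while the round-robin arbiter alternates between 2 and 0. Nothing in the surrounding algorithm tells a tool which of those behaviors the designer wanted.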

Mentor has handled this by defining a class that can be used for control definitions, synthesizing the appropriate non-blocking logic. But, popping back up several levels here, while C (or, in this case, C++, since we’re talking classes) is being synthesized, this is not the code that some software guy wrote. This is hardware being written in C++. By hardware engineers. Any functionality originally described in C or C++ by a software engineer will be picked to bits, with algorithmic portions teased out and control logic being explicitly defined in the code as needed before numerous attempts are made at getting a suitable result from the synthesis tool.

All of this is not to suggest that Mentor is doing the wrong thing. In fact, their results may be quite good; that’s for users to decide. But it simply points out that, when it comes to designing hardware, it’s still very much a hardware designer’s world. Software engineers may continue to mull over those five most dangerous words ever, “How hard can it be?” And tools and hardware guys will continue to demonstrate that, while it’s gotten easier, it’s still hard. Hard to do well, anyway. And this market, unlike others, will not tolerate mediocrity. Skilled engineers will continue to be employed.

*The translations shown were actually taken from an online translation site. I did not make them up.

Link: Mentor Catapult C
