
Horses for Courses

HSA Foundation Works to Treat All Processors Equally

Can’t we all just get along? The DSP programmers don’t talk to the CPU programmers; the GPU programmers won’t sit at the lunch table with the device-driver programmers. Hey, gang. We’re all nerds. Let’s work together.

That’s the altruistic but daunting task of the Heterogeneous System Architecture (HSA) Foundation, a nonprofit group formed to define hardware and software standards that allow all types of processors, regardless of race, creed, religion, orientation, or country of origin, to play well with others.

As Bryon Moyer so ably described in May, the HSA specification is complex, convoluted, and complicated. It’s also a work in progress, as these things don’t throw themselves together quickly. The HSA Foundation has released version 1.0 of the specification, so it’s real. You can download it and design hardware around it. But your new system won’t have much company for a while.

To recap, the purpose of HSA is to spread a layer of commonality across all types of processors – CPU, GPU, DSP, you name it – so that they’re all treated more or less as equals. By way of contrast, today’s PCs treat the main processor (likely a multicore x86 chip) very differently from the graphics processor (likely a massively multicore chip in its own right). That’s a persistent remnant of the old-school way of designing computers with a single main CPU and one or more peripheral controllers, accelerators, or satellite processors. The operating system and the main applications run on the main CPU; the additional processors get tossed specialty tasks consistent with their architecture.

That’s all well and good, but these days we tend to mix our processors a bit more closely. Even a basic cellphone has at least two processors (CPU and GPU), and special-purpose embedded systems have had mixed processor types for years. Wouldn’t it be nice if we could program these things as if they were true multiprocessor systems, rather than the conventional arm’s-length-and-a-bus relationship between different processors?

HSA isn’t trying to make all processors generic and interchangeable, by any means. The foundation just wants to hammer out some rules, guidelines, specifications, and best practices for mixing processors in a system. Want to do it in your own proprietary way? Fine; no HSA police will enforce their vision of system architecture on you. But if you’d like to leverage the work of a group of smart people who’ve given this some real thought, HSA 1.0 might be worth reading and following.

Like many consortia, HSA’s membership is made up of people with real jobs, mostly in the processor or IP business. Founding members include AMD, ARM, and Imagination Technologies. What do those three companies have in common? Go ahead, I’ll wait… That’s right: they all make both CPUs and GPUs, and thus have experience with, and a vested interest in, making dissimilar processors work together. Ever since AMD acquired ATI, it has worked mightily to combine the two into a kind of super-processor (what AMD calls an APU, or accelerated processing unit) that offers a one-two punch in a single package. ARM, of course, licenses its Cortex processors and Mali graphics engines, while cross-town rival Imagination makes both the MIPS CPU and the PowerVR GPU. Most smartphones and tablets today combine ARM’s CPU with Imagination’s GPU (or vice versa, in a few cases), so mixing competitive offerings is nothing new.

This week the group gave a public status report and a kind of group high-five. The specification is published, the first few HSA-compliant systems are out in the wild, and some member companies are working on more to come.

Most of the partner presentations were of the “coming soon” variety. Lead presenter AMD talked about a nearly real chip, its sixth-generation A-series microprocessor code-named Carrizo. That 12-core beast has four x86 CPU cores and eight GPU cores, plus all the usual internal plumbing (now with HSA compliance!) to keep it all flowing smoothly.

ARM provided some platitudes regarding HSA in general, without missing the opportunity to point out that it already supports heterogeneous processing with its Cortex CPUs and Mali GPUs.

Next up, ARM licensee MediaTek presented a series of kindergarten-quality block diagrams showing how it would combine “big CPUs” with “little CPUs,” a not-so-subtle reference to its IP provider’s preferred big.LITTLE architecture.

Imagination Technologies made much the same pitch as ARM, albeit with different brand names, reminding everyone that its MIPS processors already work quite happily alongside its PowerVR GPUs. It finished with a “coming soon” slide that promised a “staged roll-out from 2016 onwards,” so HSA-compliant IP from Imagination is now in the pipeline.

What does HSA compete with? You, mostly. Makers of heterogeneous systems have had to cobble together their own multi-headed architectures based on shared memory, message passing, or whatever seemed expedient at the time. It’s obviously worked; the shipping products are existence proof of that. So proprietary and home-grown solutions are HSA’s biggest opposition.

The other bugaboo is the belief that standardization layers like HSA’s are terrible performance sinks. How can you abstract away differences in processor architecture and memory maps without incurring huge software overhead or reducing differentiation? Won’t HSA turn my system into a lumbering, top-heavy house of cards that treats every processing element as the lowest common denominator?

Nope. Not unless you design it that way. HSA’s guidelines and standards don’t remove any details or differentiation from your system. They merely provide alternatives if you want them. Still want to program down to the bare metal on your GPU? Great; HSA does nothing to stop you. Still want to bit-twiddle your DSP or hard-code your CPU? Fantastic. But if you’d rather treat your disparate processing elements as more-or-less equal partners sharing a memory map and a power supply, HSA has some good ideas about how you might abstract away some of the gory details.

It’s a lot like DirectX, Microsoft’s ever-evolving graphics-driver architecture. At first, high-performance game programmers decried DirectX as a fat, slow layer of buggy bloatware that got between them and their artisanal GPU code. Later, DirectX evolved into a useful abstraction layer that nevertheless allowed hard-core coders to touch the metal if they really wanted to. Today, DirectX is almost universally used to bring sanity to a very complex world of incompatible GPU architectures and daily driver updates.

As Marty Johnson, HSA representative from AMD, says, “If you want write-once, run-anywhere code, HSA can get you there. But if you prefer coding to bare metal, HSA supports that, too. It doesn’t turn everything to vanilla. HSA gives you more options, not fewer.”

I like vanilla as much as the next guy. But choices are good, too. And HSA provides a choice for simplicity and sanity. That’s going to be more important as processors, and multiprocessor systems, get more complicated. 
