
How the Mighty Have Grown

Synopsys Goes Superscalar with ARC HS40 Processor Core Designs

“We hope that, when the insects take over the world, they will remember with gratitude how we took them along on all our picnics.” – Bill Vaughan

All life on Earth is insects. Statistically speaking, there are so many different species of insects on this planet that all other forms of life – mammals, birds, viruses, plants, algae, you name it – are collectively all just a rounding error.

Similarly, all microprocessors are embedded. So few CPUs go into “computers” like PCs and servers that they might as well not exist. (Says the guy typing on a laptop PC, whose files are stored on a server somewhere. But stick with me here.)

In the same week that processor behemoth Intel quietly pulled the sheet over Itanium, the little ARC processor from Synopsys graduated to middle school. ARC is almost all growed up, and it’s taking on big-boy responsibilities. Most significantly, it’s now superscalar. Once the province of high-end server processors only, superscalar-ness is now pretty common among embedded CPUs, but this is the first time a member of the ARC family has been handed the gold-embossed certificate.

Not just superscalar, but multicore, too! This really is a red-letter day for ARC. Synopsys is sending the new HS4x family out into the world in five different configurations, spanning single-, dual-, and quad-core implementations, with or without DSP features packed into their lunchboxes. As usual with ARC designs, there are lots of configurable options above and beyond those big ones. Want instruction and data caches? They can do that. Need an MMU? That’s your choice. Feel like floating-point? Check the box on your order form and pull forward to the first window.

And, if you’re looking for something that isn’t on the menu, there’s always the roll-your-own approach. The new HS4x designs retain ARC’s traditional user-configuration ability that lets you add in your own instructions, registers, accelerators, or just about anything else you want, assuming that you’re reasonably handy with a hardware compiler. In the past, ARC customers have built themselves custom crypto engines, unique compression accelerators, or weird “obfuscation units” that just confused onlookers and thwarted reverse-engineering by competitors. (Full disclosure: I used to be employed by ARC before it was acquired by Synopsys.)

Synopsys says its new HS4x cores are 25% faster on integer code than their HS3x predecessors, but as much as twice as fast on DSP code (assuming the DSP option is enabled, of course). Why the dichotomy? Did the DSP design improve that much, or was it just lousy before? And why no 2x improvement in integer code?

The HS4’s integer pipeline is essentially just a longer version of the HS3’s, stretched to 10 stages. That allows the newer cores to hit 1.6–2.2 GHz in 28nm silicon, or 1.9–2.5 GHz in a 16nm FinFET process, according to the company. The longer pipeline enables the higher clock speeds but doesn’t really add any new capabilities to the instruction set over the HS3 generation.

The exception is the dual-issue (superscalar) ability, which bumps up integer performance a little bit, but makes a huge difference to DSP algorithms. With the old single-issue pipeline, the DSP had to rely on the integer pipeline to load and store coefficients, fetch instructions, and execute its control code – an ideal recipe for a bottleneck. The newer core can dedicate half of its pipeline to feeding the DSP unit, loading and storing coefficients all day long, while simultaneously executing integer code on the other half. You’ve now got a DSP unit that’s far more functional and usable, even though the DSP engine itself hasn’t changed much.
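To see why dual issue can help DSP loops more than ordinary integer code, here is a back-of-the-envelope sketch. It is a toy model, not a description of the actual HS4x pipeline: the pairing rule (two adjacent instructions issue together only if they use different units and the second does not depend on the first), the unit names, and both instruction mixes are assumptions invented for illustration.

```python
# Toy in-order issue model. Every name and instruction mix here is
# hypothetical; nothing below is taken from Synopsys documentation.
# An instruction is (unit, depends_on_previous).

def cycles_single_issue(program):
    """One instruction per cycle, no pairing."""
    return len(program)

def cycles_dual_issue(program):
    """Issue up to two adjacent instructions per cycle.

    Assumed pairing rule: the two instructions must use different
    units, and the second must not depend on the first.
    """
    cycles = 0
    i = 0
    while i < len(program):
        cycles += 1
        if i + 1 < len(program):
            unit_a, _ = program[i]
            unit_b, b_depends_on_a = program[i + 1]
            if unit_a != unit_b and not b_depends_on_a:
                i += 2  # both instructions issued this cycle
                continue
        i += 1  # only one instruction issued this cycle
    return cycles

# Made-up DSP kernel: each load feeds a *later* multiply-accumulate
# (software-pipelined), so every load can pair with the MAC after it.
dsp_kernel = [("mem", False), ("dsp", False)] * 100

# Made-up integer code: back-to-back ALU ops and dependency chains
# leave few adjacent instructions that are allowed to pair.
integer_code = [
    ("alu", False),
    ("alu", True),
    ("mem", True),
    ("alu", False),
] * 100

dsp_speedup = cycles_single_issue(dsp_kernel) / cycles_dual_issue(dsp_kernel)
int_speedup = cycles_single_issue(integer_code) / cycles_dual_issue(integer_code)

print(f"DSP kernel speedup:   {dsp_speedup:.2f}x")
print(f"Integer code speedup: {int_speedup:.2f}x")
```

In this model the load/MAC stream pairs on every cycle, while the dependency-heavy integer stream pairs only occasionally. The exact ratios are a property of the invented inputs, not measurements of any real core, but the shape of the result matches the argument above: the second issue slot matters most when there is a steady supply of independent loads and stores to fill it.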

Because the HS4x is instruction-set compatible with the HS3x family, you could just move your binaries over to the new CPU, and they’d run. But you probably don’t want to. Unlike, say, an Intel Core i7, ARC processors don’t have hardware assists for aggressively cracking open the execution stream looking for parallelism. That’s the job of the compiler, so, to get the best out of your older ARC code, you’ll want to recompile for the newer pipeline.

Shoppers browsing the aisles looking for a CPU core to license will generally stop at the big ARM display first, before moving on to more budget-friendly options like ARC. Performance, power, and price comparisons are inevitable. Synopsys says its new HS4x delivers twice the performance of ARM’s Cortex-A7, 45% better performance than Cortex-A9, and “higher” performance than Cortex-A17, all while delivering lower power consumption than the brand-name British alternatives. It’s not at all clear what configuration options the ARC processors had enabled (nor the ARM processors, for that matter), or what benchmarks the company was using, but at least the slideware numbers give some indication of where the new HS4x cores fit in the overall scheme of things.

In the processor world, there are embedded processors and there are deeply embedded processors – the kind you never see. ARC falls squarely into the latter category. Whereas ARM and MIPS power highly visible consumer items, run “real” operating systems, and have a healthy library of third-party software, ARC (and about a hundred other CPU architectures) toils away in obscurity. That’s not to say these architectures aren’t popular – they’re just concealed. One of ARC’s biggest design wins is in solid-state disks (SSDs), especially the high-end SSDs used in servers and the like. It’s a high-volume design win that’s been good for business but that doesn’t generate many sexy headlines. Synopsys expects that the new HS4x family will extend that success, while also gaining sockets in wireless interfaces, automotive systems, and speech-activated interfaces.

Newcomers to the electronics business sometimes comment on how the little black chips look like bugs, with their rows of shiny metal legs. They may be more accurate than they realize.
