
Arm Unwraps Three New CPU/GPU Designs

The Spirit of ’76 is Alive and Well in Cambridge, England

“Diversity: the art of thinking independently together.” – Malcolm Forbes

Japanese-owned Arm is celebrating the start of summer with three new ’76-themed IP cores: Cortex-A76, Mali-G76, and Mali-V76. It’s almost like they’re declaring independence from the competition.

You know the drill. New Arm cores are faster, more power-efficient, and occasionally even smaller than their predecessors. That’s mostly true in this case as well. The new Cortex-A76 processor is better in every way compared to the -A75; Mali-G76 is faster and smarter than the -G72; and Mali-V76 is ridiculously more proficient than the -V61 it replaces.

What hasn’t changed is Arm’s swagger. (The company now prefers to style its name in mixed case, rather than the previous all-caps acronym.) Every major Arm presentation is prefaced with a McDonald’s-like “billions and billions served” announcement. At last count, almost 130 billion Arm-based chips have seen the light of day, with over 21 billion of those made in just the previous calendar year. So far in 2018, Arm reckons its licensees have cranked out another 10 billion, or about half of last year’s total in a five-month span. At this rate, the headcount will top 200 billion sometime in 2021. The boardroom graphs are all pointing up and to the right.
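The 2021 estimate can be checked with a quick back-of-envelope sketch. It assumes, hypothetically, that the 130 billion running total already includes this year's 10 billion, that the count was taken at the end of May 2018, and that shipments continue at a flat 2 billion per month. The function name and parameters are illustrative only.

```python
import math

def projected_crossing(total_now, target, units_per_month, start_year, start_month):
    """Return (year, month) in which a running total reaches the target,
    assuming a constant monthly shipment rate (a simplifying assumption)."""
    months_needed = math.ceil((target - total_now) / units_per_month)
    month_index = (start_month - 1) + months_needed  # months counted from January of start_year
    return start_year + month_index // 12, month_index % 12 + 1

# Figures in billions of chips: 10 billion in five months is 2 billion per month.
year, month = projected_crossing(130, 200, 10 / 5, 2018, 5)
print(year, month)
```

Under those assumptions the 200 billion mark lands in the first half of 2021, which agrees with the article's estimate. A rising shipment rate would pull the date earlier.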

That’s a lot of IP, and you don’t get there by putting just one CPU or one GPU in every cellphone. Nosirree, you’ve got to make IP cores for every purse and purpose, as Alfred P. Sloan used to say. Phones and other mobile devices now have multiple CPUs clustered together, and now they’re clustering multiple GPUs, as well. You’re just not au courant unless you’ve got five or six Arm-designed engines chugging away inside your pocket.

Cortex-A76 is the company’s new flagship microprocessor, and Arm says it’s got double the performance of “currently shipping smartphones” that are based on the Cortex-A73 (not the faster -A75). Compared to its closer sibling, the -A75, the -A76 is expected to crank out about 35% more benchmarks. Much of that improvement is due to microarchitectural tweaks that the company declined to describe. The rest is down to process technology. Arm bases its claims on an -A76 running at 3.0 GHz in 7nm technology versus an -A75 in 10nm at 2.8 GHz, so about 7% of that uplift is due to clock speed, not the design itself. Still, any design that improves performance by twenty-some percent is a significant upgrade. Arm calls it “laptop performance in a cellphone power envelope.”
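To separate the clock-speed contribution from the per-clock gain, here is a minimal sketch. It assumes performance scales linearly with frequency, which real workloads rarely do exactly; the function name is hypothetical.

```python
def clock_normalized_uplift(total_speedup, new_ghz, old_ghz):
    """Split a claimed speedup into a clock-ratio factor and a per-clock factor.

    Assumes performance scales linearly with clock frequency (an idealization).
    """
    clock_ratio = new_ghz / old_ghz
    per_clock_speedup = total_speedup / clock_ratio
    return clock_ratio, per_clock_speedup

# Arm's comparison: -A76 at 3.0 GHz versus -A75 at 2.8 GHz, about 35% faster overall.
clock_ratio, per_clock = clock_normalized_uplift(1.35, 3.0, 2.8)
print(f"clock ratio: {clock_ratio:.3f}, per-clock speedup: {per_clock:.3f}")
```

With the article's figures, the clock ratio comes out a little above 1.07 and the per-clock speedup at about 1.26, which is where "twenty-some percent" comes from.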

The new -A76 still adheres to the current-generation Armv8-A architecture specification, so there are no software-visible changes to the CPU. It’s also compatible with the company’s newish DynamIQ internal bus interface, so you’re allowed to cluster multiple CPU cores together. In fact, it’s expected.

Helping you to up the core count in your next SoC is the Mali-G76 graphics processor. It’s an upgrade from the -G72, obviously, with about the same bump in performance as the new Cortex. The new -G76 ought to be 30% faster than a -G72, and/or 30% more power-efficient, assuming both cores are running at the same speed in the same process technology. Here again, the changes are microarchitectural, with no outward alterations to the programmer’s model.

Inside, the -G76 has anywhere from four to 20 shader units (your choice), with three execution engines in each. Those engines are themselves upgraded from those found in the -G72, with texture mapping especially improved. Although the respectable 30% bump in performance is nice, it’s the core’s machine-learning (ML) prowess that really takes a leap. Arm says the -G76 is 2.7 times faster than the -G72 on ML workloads, due largely to new hardware support for 8-bit integer dot products.
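For readers unfamiliar with the operation, this sketch shows what an 8-bit integer dot product computes. It is an illustration, not a model of the Mali-G76 instruction itself. The four-element width, the wider accumulator, and the function name are assumptions chosen because that shape is common in quantized neural-network inference.

```python
INT8_MIN, INT8_MAX = -128, 127

def int8_dot4(a, b, acc=0):
    """Multiply two 4-element vectors of signed 8-bit integers pairwise
    and add the sum of the products to a wider accumulator."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both inputs must have exactly four elements")
    for value in (*a, *b):
        if not INT8_MIN <= value <= INT8_MAX:
            raise ValueError(f"{value} does not fit in a signed 8-bit integer")
    return acc + sum(x * y for x, y in zip(a, b))

# Four multiplies and the accumulate collapse into a single operation.
print(int8_dot4([1, 2, 3, 4], [5, 6, 7, 8]))  # 1*5 + 2*6 + 3*7 + 4*8 = 70
```

Doing this in one hardware step, rather than as separate multiplies and adds, is the kind of change that can speed up inference on networks quantized to 8 bits.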

The third point in Arm’s celebratory three-cornered hat is Mali-V76. It’s your way to help Arm achieve its 200-billion-unit goal. Designed for video, as opposed to graphics, the -V76 is a screamer streamer. The target applications here are AR/VR goggles, high-end TVs, and video walls displaying multiple independent streams. The -V76 is Arm’s first core able to decode 8K UHD content, but it can also be configured to show a 2×2 array of 2160p (4K) streams at 60 fps, or a 4×4 array of 1080p video at 60 fps. The latter presentation appeals to makers of video kiosks and high-density information displays – or to television addicts with short attention spans.
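The 2×2 and 4×4 configurations follow from the pixel arithmetic, as this short sketch shows. It uses the standard UHD frame sizes. Whether the decoder's throughput depends only on pixel count is an assumption here, not something Arm has stated.

```python
RESOLUTIONS = {
    "8K UHD": (7680, 4320),
    "2160p": (3840, 2160),
    "1080p": (1920, 1080),
}

def tiles_per_canvas(canvas, tile):
    """Return (columns, rows) of tile-sized streams that fit in the canvas."""
    cols = canvas[0] // tile[0]
    rows = canvas[1] // tile[1]
    return cols, rows

canvas = RESOLUTIONS["8K UHD"]
for name in ("2160p", "1080p"):
    cols, rows = tiles_per_canvas(canvas, RESOLUTIONS[name])
    tile_w, tile_h = RESOLUTIONS[name]
    # Each grid covers exactly the same number of pixels as one 8K frame.
    assert cols * rows * tile_w * tile_h == canvas[0] * canvas[1]
    print(f"{name}: {cols}x{rows} grid")
```

One 8K frame holds the same number of pixels as four 2160p frames or sixteen 1080p frames, so all three configurations ask the decoder for the same pixel rate at a given frame rate.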

Because the -V76 can stream both ways and encode as well as decode (though the capabilities aren’t quite symmetrical), it’s also applicable to AR goggles that must source, as well as sink, video data.

All three new 76-themed cores are available now, in the sense that Arm will happily take your money and license the IP to you. Indeed, a few unnamed licensees are already so equipped, and have even produced silicon, which means they’ve got a one-year head start. End-user products containing one or more of these new IP cores are expected “sometime in 2019,” according to Arm.

Arm is the undisputed master of the universe when it comes to licensed CPU cores; somewhat less so in the GPU arena, where it competes with PowerVR, Vivante, and other options. Part of Arm’s attraction is its huge portfolio of options, including low-end IP to round out your SoC design. And part of it is Arm’s huge size and financial stability, when other IP vendors seem to be struggling to stay afloat. And, now that Arm has CPUs, GPUs, and video processors spread all over the performance spectrum, it can preassemble complex subsystems for you and license that, too. Instant product: just add software. The Anglo-Japanese company is in a good position with a bright future. What better reason to light off a few fireworks?

