
Arm Unwraps Three New CPU/GPU Designs

The Spirit of ’76 is Alive and Well in Cambridge, England

“Diversity: the art of thinking independently together.” – Malcolm Forbes

Japanese-owned Arm is celebrating the start of summer with three new ’76-themed IP cores: Cortex-A76, Mali-G76, and Mali-V76. It’s almost like they’re declaring independence from the competition.

You know the drill. New Arm cores are faster, more power-efficient, and occasionally even smaller than their predecessors. That’s mostly true in this case as well. The new Cortex-A76 processor is better in every way compared to the -A75; Mali-G76 is faster and smarter than the -G72; and Mali-V76 is ridiculously more proficient than the -V61 it replaces.

What hasn’t changed is Arm’s swagger. (The company now prefers to style its name in mixed case, rather than the previous all-caps acronym.) Every major Arm presentation is prefaced with a McDonald’s-like “billions and billions served” announcement. At last count, almost 130 billion Arm-based chips have seen the light of day, with over 21 billion of those made in just the previous calendar year. So far in 2018, Arm reckons its licensees have cranked out another 10 billion, or about half of last year’s total in a five-month span. At this rate, the headcount will top 200 billion sometime in 2021. The boardroom graphs are all pointing up and to the right.
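For the skeptics, the 200-billion projection is easy to sanity-check. The sketch below is a back-of-the-envelope extrapolation, not Arm's own forecast. It assumes the 130-billion tally is current as of the end of May 2018 and that shipments hold steady at this year's pace; the function name and the baseline date are illustrative assumptions, not anything Arm published.

```python
import math


def projected_crossing(total_bn, target_bn, rate_bn_per_month, start_year, start_month):
    """Return the (year, month) in which cumulative shipments reach target_bn."""
    months_needed = math.ceil((target_bn - total_bn) / rate_bn_per_month)
    # Count months from zero so the year rollover is a simple divmod.
    index = start_year * 12 + (start_month - 1) + months_needed
    year, month_zero_based = divmod(index, 12)
    return year, month_zero_based + 1


# Figures from the article: ~130 billion shipped, ~10 billion in five months.
# Assumption: the 130 billion count is as of the end of May 2018.
rate = 10 / 5  # billions of chips per month
year, month = projected_crossing(130, 200, rate, 2018, 5)
print(year, month)
```

Under those assumptions the crossing lands in spring 2021. Move the baseline back to the start of 2018 and it slides into late 2020, so treat the date as a rough guide.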

That’s a lot of IP, and you don’t get there by putting just one CPU or one GPU in every cellphone. Nosirree, you’ve got to make IP cores for every purse and purpose, as Alfred P. Sloan used to say. Phones and other mobile devices now have multiple CPUs clustered together, and now they’re clustering multiple GPUs, as well. You’re just not au courant unless you’ve got five or six Arm-designed engines chugging away inside your pocket.

Cortex-A76 is the company’s new flagship microprocessor, and Arm says it’s got double the performance of “currently shipping smartphones” that are based on the Cortex-A73 (not the faster -A75). Compared to its closer sibling, the -A75, the -A76 is expected to crank out about 35% more benchmarks. Much of that improvement is due to architectural tweaks that the company declined to describe. The rest is down to process technology. Arm bases its claims on an -A76 running at 3.0 GHz in 7nm technology versus an -A75 in 10nm at 2.8 GHz, so roughly 7% of the gain comes from clock speed alone, not architecture. Still, any design that improves performance by twenty-some percent is a significant upgrade. Arm calls it “laptop performance in a cellphone power envelope.”
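To see how the headline number splits, here is a minimal sketch that treats performance as clock frequency multiplied by everything else (architecture and process combined). That multiplicative model is a simplifying assumption, and the function name is made up for illustration. The inputs are the figures quoted above: a 35% overall gain, 2.8 GHz for the -A75, and 3.0 GHz for the -A76.

```python
def split_uplift(total_gain, old_clock_ghz, new_clock_ghz):
    """Split a total speedup into a clock-speed part and a residual part.

    Assumes performance = clock * (work per clock), so the two factors multiply.
    Returns (clock_gain, residual_gain) as fractions, e.g. 0.07 for 7%.
    """
    clock_ratio = new_clock_ghz / old_clock_ghz
    residual_ratio = (1.0 + total_gain) / clock_ratio
    return clock_ratio - 1.0, residual_ratio - 1.0


clock_gain, other_gain = split_uplift(0.35, 2.8, 3.0)
print(f"clock: {clock_gain:.1%}, everything else: {other_gain:.1%}")
```

The clock ratio of 3.0/2.8 is a little over 1.07, which leaves about 26% for the non-clock improvements. That squares with the “twenty-some percent” figure.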

The new -A76 still adheres to the current-generation Armv8-A architecture specification, so there are no software-visible changes to the CPU. It’s also compatible with the company’s newish DynamIQ internal bus interface, so you’re allowed to cluster multiple CPU cores together. In fact, it’s expected.

Helping you to up the core count in your next SoC is the Mali-G76 graphics processor. It’s an upgrade from the -G72, obviously, with about the same bump in performance as the new Cortex. The new -G76 ought to be 30% faster than a -G72, and/or 30% more power-efficient, assuming both cores are running at the same speed in the same process technology. Here again, the changes are microarchitectural, with no outward alterations to the programmer’s model.

Inside, the -G76 has anywhere from four to 20 shader units (your choice), with three execution engines in each. Those engines are themselves upgraded from those found in the -G72, with texture mapping especially improved. Although the respectable 30% bump in performance is nice, it’s the core’s machine-learning (ML) prowess that really takes a leap. Arm says the -G76 is 2.7 times faster than the -G72 on ML workloads, due largely to new hardware support for 8-bit integer dot products.
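If “8-bit integer dot product” sounds abstract, the operation itself is simple: multiply several pairs of small signed integers and add the products into a wider accumulator, all in one step. The sketch below models that in plain Python. It is not Mali-G76 code or Arm's instruction definition; the four-lane width and the function names are assumptions chosen to mirror how such operations are commonly packaged in hardware.

```python
INT8_MIN, INT8_MAX = -128, 127


def dot4_int8(a, b, acc=0):
    """Four signed 8-bit multiplies, summed into a wider accumulator.

    Hardware of this kind typically accumulates into 32 bits; Python integers
    are unbounded, so no overflow handling is modeled here.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("expected exactly four lanes per operand")
    for value in (*a, *b):
        if not INT8_MIN <= value <= INT8_MAX:
            raise ValueError(f"{value} does not fit in a signed 8-bit lane")
    return acc + sum(x * y for x, y in zip(a, b))


def dot_int8(a, b):
    """Dot product of longer int8 vectors, consumed four lanes at a time."""
    if len(a) != len(b) or len(a) % 4:
        raise ValueError("vectors must match and be a multiple of four long")
    acc = 0
    for i in range(0, len(a), 4):
        acc = dot4_int8(a[i:i + 4], b[i:i + 4], acc)
    return acc


weights = [1, 2, 3, 4, -1, -2, -3, -4]
inputs = [5, 6, 7, 8, 5, 6, 7, 8]
print(dot_int8(weights, inputs))
```

Quantized neural networks spend most of their time in exactly this loop, which is why doing four multiply-accumulates per operation, instead of one, pays off so well for ML inference.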

The third point in Arm’s celebratory three-cornered hat is Mali-V76. It’s your way to help Arm achieve its 200-billion-unit goal. Designed for video, as opposed to graphics, the -V76 is a screamer streamer. The target applications here are AR/VR goggles, high-end TVs, and video walls displaying multiple independent streams. The -V76 is Arm’s first core able to decode 8K UHD content, but it can also be configured to show a 2×2 array of 2160p (4K) streams at 60 fps, or a 4×4 array of 1080p video at 60 fps. The latter presentation appeals to makers of video kiosks and high-density information displays – or to television addicts with short attention spans.
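Those three configurations are not arbitrary: they all add up to the same number of pixels per frame. A quick check, taking the 2160 streams to mean standard 3840×2160 UHD frames and using the usual definitions of 8K UHD (7680×4320) and 1080p (1920×1080). The article does not spell out those dimensions, so they are assumptions here, as are the names in the code.

```python
# Standard frame dimensions (width, height); assumed, not quoted from Arm.
RESOLUTIONS = {
    "8K UHD": (7680, 4320),
    "2160p": (3840, 2160),
    "1080p": (1920, 1080),
}


def pixels_per_frame(name, columns=1, rows=1):
    """Total pixels in one frame of a columns x rows grid of streams."""
    width, height = RESOLUTIONS[name]
    return width * height * columns * rows


single_8k = pixels_per_frame("8K UHD")
grid_4k = pixels_per_frame("2160p", columns=2, rows=2)
grid_1080p = pixels_per_frame("1080p", columns=4, rows=4)

for label, count in [("1 x 8K", single_8k), ("2x2 4K", grid_4k), ("4x4 1080p", grid_1080p)]:
    print(f"{label:>10}: {count:,} pixels per frame, {count * 60:,} per second at 60 fps")
```

One 8K frame, four 4K frames, and sixteen 1080p frames each come to about 33 million pixels, so the video-wall modes amount to the same decode budget carved into smaller tiles.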

Because the -V76 can stream both ways and encode as well as decode (though the capabilities aren’t quite symmetrical), it’s also applicable to AR goggles that must source, as well as sink, video data.

All three new 76-themed cores are available now, in the sense that Arm will happily take your money and license the IP to you. Indeed, a few unnamed licensees are already so equipped, and have even produced silicon, which means they’ve got a one-year head start. End-user products containing one or more of these new IP cores are expected “sometime in 2019,” according to Arm.

Arm is the undisputed master of the universe when it comes to licensed CPU cores; somewhat less so in the GPU arena, where it competes with PowerVR, Vivante, and other options. Part of Arm’s attraction is its huge portfolio of cores, including low-end IP to round out your SoC design. And part of it is Arm’s huge size and financial stability, when other IP vendors seem to be struggling to stay afloat. And, now that Arm has CPUs, GPUs, and video processors spread all over the performance spectrum, it can preassemble complex subsystems for you and license those, too. Instant product: just add software. The Anglo-Japanese company is in a good position with a bright future. What better reason to light off a few fireworks?

4 thoughts on “Arm Unwraps Three New CPU/GPU Designs”

  1. I came for the segmented cache bus concurrency upgrade and the smoked tea, and I’m all out of smoked…data directives posing as arbitrary operators, maybe? Is there not an inset image of the likely IP changes over the variant universe and a fab roadmap detail just yet; or is the plan that Qualcomm ships samples for a ‘done one better’ banner, then the Arm strategy people can pick the other way to compile chips and cue llvm?
