
PowerVR AX2185 Accelerates Neural Nets

New IP Cores Aimed at Smartphones, Cameras, Consumer Goods

“The greatest danger of AI is that people conclude too early that they understand it.” — Eliezer Yudkowsky

Sometimes accidental discoveries are the best ones. Teflon was supposed to be a refrigerant. The first heart pacemaker was designed as a measuring device, but inventor Wilson Greatbatch put in the wrong resistor value. And Play-Doh was created to clean wallpaper.

Turns out, the graphics card in your PC is surprisingly good – almost accidentally talented – at neural-net processing, cryptocurrency mining, machine learning, and artificial intelligence. Who knew? With a few tweaks, your GPU can make a darned good robot brain, even outsmarting the “real” microprocessor in your system.

This bit of serendipity hasn’t gone unnoticed by the world’s GPU designers, of course. Never ones to let grass grow under their feet, companies like nVidia, AMD/ATI, and Imagination Technologies have rapidly pivoted their GPU architectures to capitalize on these new and interesting markets. Now you can buy GPUs to use as, well, GPUs, or you can buy them for completely different purposes.

This week’s announcement comes from Imagination, the keepers of the PowerVR flame. They’ve split their popular GPU family into two completely different architectures, one for traditional graphics tasks and another for accelerating neural networks. They both share the PowerVR brand name, but that’s about all they have in common.  

The first broad outlines of this split emerged last year, when Imagination announced its PowerVR 2NX architecture, so we knew the company had a neural-net accelerator in the works. Now we know what the products are called and what they look like.

Say hello to AX2145 and AX2185. They’re the first two instantiations of the new 2NX architecture, and they’re pretty similar. One’s designed for maximum performance (the ’85), while the other is a milder, more “balanced” design, according to the company.

Both IP cores are available immediately, and both are, in fact, already being used by a pair of lead customers. Expect to see the first AX21x5-based products on the street in about a year, probably in the form of high-end Chinese smartphones, security cameras, or drones.

Broadly speaking, what separates the AX21x5 twins from the AI-focused designs coming out of the major semiconductor vendors is power efficiency. Your nVidia GPU is never going to last long inside a cellphone, so designing for ultimate performance isn’t the goal. Instead, Imagination has to bear in mind that its customers are running on batteries, in confined spaces, and with no good way to cool the hardware. They’re focused on the Internet of Surveillance™, not Call of Duty II.

The AX2185 is the faster of the two designs, and evidently the fastest implementation on the PowerVR roadmap. Imagination says it can perform at 4.1 TOPS (trillions of operations per second), which neatly matches the top end of the family’s performance range when it was announced last year. If that kind of acceleration somehow isn’t good enough for you, you need multiple AX2185s.
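
To put that 4.1 TOPS figure in perspective, here’s a back-of-the-envelope sketch converting a TOPS rating into a rough inference rate. The per-frame operation count and the utilization figure are my own assumptions for illustration, not Imagination’s benchmark numbers.

```python
# Back-of-the-envelope only: the per-frame op count and the utilization
# figure below are assumptions, not measured AX2185/AX2145 numbers.

def inferences_per_second(peak_tops, ops_per_frame, utilization=0.5):
    """Convert a raw TOPS rating into a rough inference (frame) rate."""
    return peak_tops * 1e12 * utilization / ops_per_frame

# Assume a vision network needing ~8 billion operations per frame.
print(f"AX2185-class (4.1 TOPS): ~{inferences_per_second(4.1, 8e9):.0f} frames/s")
print(f"AX2145-class (1.0 TOPS): ~{inferences_per_second(1.0, 8e9):.0f} frames/s")
```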

The AX2145 is the little sister, with 1.0 TOPS of performance, a smaller die area, and lower power consumption. Imagination sees this as a good fit for midrange smartphones (midrange in a few years, perhaps), digital TVs, and set-top boxes.

Weirdly, Imagination claims that the ’45 actually outperforms the ’85 in certain circumstances. Specifically, when memory bandwidth is tight, you’ll want to use the ’45, not its bigger sibling. That’s because both designs – like all neural-net accelerators – need a boatload of bandwidth to operate efficiently. Like GPUs and DSPs, NNAs are memory hogs, and throttling that memory can have a big effect on the engine’s efficiency. Imagination spent a lot of time benchmarking this behavior, and later explaining why it happens.

It also explains why the whole 2NX architecture supports funny bit widths. As our own Bryon Moyer explained back in October, the new PowerVR family is fixed-point only, supporting the usual 8-bit and 4-bit integers as well as nontraditional sizes like 5-, 6-, 7-, and 12-bit formats. Tally up what those narrower widths save and you begin to see how hard Imagination worked to conserve memory footprint and bandwidth.

You can even tweak the bit depth on a layer-by-layer basis, adding extra precision where it’s needed and discarding it where it isn’t. You can also use different formats for weights and for data. It’s all flexible.
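
As a rough illustration of why all that flexibility matters, here’s a hypothetical per-layer precision plan. The layer names, weight counts, and bit choices below are invented for the example; they’re not output from Imagination’s tools.

```python
# Hypothetical mixed-precision plan: every number here is an assumption
# made for illustration, not data from Imagination's tooling.
layers = [
    # (layer, weight count, weight bits)
    ("conv1",      1_728, 12),   # early layers keep more precision
    ("conv2",    294_912,  8),
    ("conv3",  1_179_648,  5),
    ("fc",     4_096_000,  4),   # the big layer tolerates coarse weights
]

mixed_mb   = sum(count * bits for _, count, bits in layers) / 8 / 1e6
uniform_mb = sum(count * 8    for _, count, _    in layers) / 8 / 1e6
print(f"uniform 8-bit weights: {uniform_mb:.1f} MB")
print(f"mixed-width weights:   {mixed_mb:.1f} MB")
```

In this made-up example the mixed-width model carries roughly half the coefficient traffic of the uniform 8-bit one, and every byte of weights not fetched is bandwidth left over for data.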

In bandwidth-constrained applications, the high-end AX2185 chokes if it’s not fed fast enough, wasting most of that potential performance. This is where the AX2145 outpaces it by as much as 50%, according to the company’s benchmarks.

How bad does your bandwidth have to be for this performance inversion to take place? YMMV, but Imagination hints that a system with bandwidth in the “low single digits” of Gbytes/sec would favor the smaller ’45 over the larger ’85 variant. Conversely, if you can provide “dozens of gigabytes per second” of bandwidth, you’ll be happier with the AX2185.
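
A crude roofline-style calculation shows the underlying effect. This is my own sketch, not the company’s benchmark methodology, and the arithmetic-intensity figure is assumed.

```python
# Crude roofline sanity check; bandwidth and ops-per-byte values are assumptions.

def attainable_tops(peak_tops, mem_bw_gb_per_s, ops_per_byte):
    """The engine delivers whichever is lower: its compute peak or the
    rate the memory system can feed it."""
    bandwidth_limit_tops = mem_bw_gb_per_s * 1e9 * ops_per_byte / 1e12
    return min(peak_tops, bandwidth_limit_tops)

ops_per_byte = 200                    # assumed arithmetic intensity of the network
for bw in (3, 10, 30):                # available memory bandwidth, GB/s
    big   = attainable_tops(4.1, bw, ops_per_byte)   # AX2185-class peak
    small = attainable_tops(1.0, bw, ops_per_byte)   # AX2145-class peak
    print(f"{bw:>2} GB/s: '85 ~{big:.2f} TOPS, '45 ~{small:.2f} TOPS")
```

At a few GB/s, both engines hit the same memory wall and the ’85’s extra compute sits idle; the cases where the ’45 actually pulls ahead of the ’85 come down to architectural details this simple model doesn’t capture.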

NNAs like these allow designers to stick more intelligence into end nodes, like security cameras that do their own object recognition. That’s great if you’re a camera designer, because you can charge more for your “smart” camera compared to the dumb ones you sold last year. It’s also good news for the overall system, because now you’re not piping full-resolution, full-rate video down an Ethernet cable to a waiting computer, which then has to analyze all those pixels in real time on its Intel, nVidia, or AMD processor (a task for which they are ill-suited, I can tell you). The whole system gets smarter, network bandwidth is reduced drastically, power consumption probably goes down, and the chance of someone intercepting your raw video stream pretty much disappears. Everybody wins.
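
Some rough arithmetic, using my own assumed numbers rather than anything from Imagination, on what that saves:

```python
# Rough comparison of what a camera ships over the wire with and without
# on-camera inference; all values are assumptions for illustration.

raw_bps  = 1920 * 1080 * 3 * 8 * 30     # uncompressed 1080p30, 24-bit color, bits/s
meta_bps = 30 * 10 * 32 * 8             # 30 fps x ~10 detections x ~32 bytes each

print(f"raw video stream:   {raw_bps  / 1e6:8.1f} Mbit/s")
print(f"detection metadata: {meta_bps / 1e3:8.1f} kbit/s")
print(f"reduction:          ~{raw_bps / meta_bps:,.0f}x")
```

Even a heavily compressed video stream is still vastly fatter than a trickle of detection records.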

NNAs are like the DSPs of the 1990s: everybody needs one but nobody’s sure how to program them. Every few years, a new DSP application would materialize that promised to catapult DSP chips and software into the mainstream. Modems! Voice recognition! Graphics! Machine vision! Each time, widespread adoption proved elusive and DSPs remained a niche product, ideally suited to frustratingly narrow application areas.

At first, GPUs looked set to follow the same path. Way too many GPU startups tried and failed to break into the mainstream; like restaurants, GPU companies tend to fail within their first 18–24 months. Only a few, including PowerVR, nVidia, and ATI (now AMD), survived the initial wave of optimism.

Machine learning, artificial intelligence, and convolutional neural networks fell on the GPU industry like manna from heaven. Suddenly, a whole new application area dropped in their laps, ready-made, and nicely suited to existing chips. It’s like discovering that you can sell your floor wax as a dessert topping, too. But that doesn’t mean you can’t improve on the flavor, and so now the second generation of NNAs is emerging from their GPU progenitors. Existing GPUs were handy, plentiful, and affordable, but not quite perfect. Neural nets may have come as a surprise to the industry, but today’s NNA designers are wasting no time in capitalizing on the opportunity.
