PowerVR AX2185 Accelerates Neural Nets

New IP Cores Aimed at Smartphones, Cameras, Consumer Goods

“The greatest danger of AI is that people conclude too early that they understand it.” — Eliezer Yudkowsky

Sometimes accidental discoveries are the best ones. Teflon was supposed to be a refrigerant. The first heart pacemaker was designed as a measuring device, but inventor Wilson Greatbatch put in the wrong resistor value. And Play-Doh was created to clean wallpaper.

Turns out, the graphics card in your PC is surprisingly good – almost accidentally talented – at neural-net processing, cryptocurrency mining, machine learning, and artificial intelligence. Who knew? With a few tweaks, your GPU can make a darned good robot brain, even outsmarting the “real” microprocessor in your system.

This bit of serendipity hasn’t gone unnoticed by the world’s GPU designers, of course. Never ones to let moss grow under their feet, companies like nVidia, AMD/ATI, and Imagination Technologies have rapidly pivoted their GPU architectures to capitalize on these new and interesting markets. Now you can buy GPUs to use as, well, GPUs, or you can buy them for completely different purposes.

This week’s announcement comes from Imagination, keeper of the PowerVR flame. The company has split its popular GPU family into two completely different architectures, one for traditional graphics tasks and another for accelerating neural networks. They both share the PowerVR brand name, but that’s about all they have in common.

The first broad outlines of this came last year, when Imagination announced its PowerVR 2NX architecture. So, we knew that Imagination had an accelerator in the works. Now we know what the first cores are called and what they look like.

Say hello to AX2145 and AX2185. They’re the first two instantiations of the new 2NX architecture, and they’re pretty similar. One’s designed for maximum performance (the ’85), while the other is a milder, more “balanced” design, according to the company.

Both IP cores are available immediately, and both are, in fact, already being used by a pair of lead customers. Expect to see the first AX21x5-based products on the street in about a year, probably in the form of high-end Chinese smartphones, security cameras, or drones.

Broadly speaking, what separates the AX21x5 twins from the AI-focused designs of the major semiconductor vendors is power efficiency. Your nVidia GPU is never going to last long inside a cellphone, so designing for ultimate performance isn’t the goal. Instead, Imagination has to bear in mind that its customers are running on batteries, in confined spaces, and with no good way to cool the hardware. They’re focused on the Internet of Surveillance™, not Call of Duty II.

The AX2185 is the faster of the two designs, and evidently the fastest implementation on the PowerVR roadmap. Imagination says it can perform at 4.1 TOPS (trillions of operations per second), which neatly matches the top end of the family’s performance range when it was announced last year. If that kind of acceleration somehow isn’t good enough for you, you need multiple AX2185s.

The AX2145 is the little sister, with 1.0 TOPS performance, smaller die area, and less power consumption. Imagination sees this as a good fit for midrange smartphones (midrange in a few years, perhaps), digital TVs, and set-top boxes.

Weirdly, Imagination claims that the ’45 actually outperforms the ’85 in certain circumstances. Specifically, when memory bandwidth is tight, you’ll want to use the ’45, not its bigger sibling. That’s because both designs – like all neural-net accelerators (NNAs) – need a boatload of bandwidth to operate efficiently. Like GPUs and DSPs, NNAs are memory hogs, and throttling that memory can have a big effect on the engine’s efficiency. Imagination spent a lot of time benchmarking this behavior, and later explaining why it happens.

That appetite for bandwidth also explains why the whole 2NX architecture supports funny bit widths. As our own Bryon Moyer explained back in October, the new PowerVR family is fixed-point only, and supports 8-bit and 4-bit integers, as well as some nontraditional sizes, like a 5-bit format. Add to that 12-bit, 7-bit, 6-bit, and other integer widths and you begin to see how hard Imagination worked to conserve memory capacity and bandwidth.

You can even tweak the bit depth on a layer-by-layer basis, adding extra precision where it’s needed and discarding it where it isn’t. You can also use different formats for weights and for data. It’s all flexible.
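To get a feel for what per-layer bit depths buy you, here is a back-of-the-envelope sketch in Python. The three-layer network, its weight counts, and the bit depths assigned to each layer are all made up for illustration, and the tight bit-packing is an assumption rather than a description of how the 2NX hardware actually lays out memory.

```python
def packed_bytes(num_values: int, bits_per_value: int) -> int:
    """Bytes needed to store num_values integers packed tightly at the given bit width."""
    # Round up to a whole number of bytes using integer arithmetic.
    return (num_values * bits_per_value + 7) // 8

# Hypothetical network: (layer name, weight count, chosen bit depth).
# None of these figures come from Imagination.
layers = [
    ("conv1", 10_000, 8),     # keep extra precision in an early layer
    ("conv2", 200_000, 5),    # a nontraditional width
    ("fc", 1_000_000, 4),     # the biggest layer gets the fewest bits
]

mixed_total = sum(packed_bytes(count, bits) for _, count, bits in layers)
uniform_8bit_total = sum(packed_bytes(count, 8) for _, count, _ in layers)

print(f"mixed precision: {mixed_total} bytes")
print(f"uniform 8-bit:   {uniform_8bit_total} bytes")
```

With these invented numbers, the mixed-precision weights take roughly half the space of a uniform 8-bit version, and every byte not stored is also a byte that never has to cross the memory bus.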

In bandwidth-constrained applications, the high-end AX2185 chokes if it’s not fed fast enough, wasting most of that potential performance. This is where the AX2145 outpaces it by as much as 50%, according to the company’s benchmarks.

How bad does your bandwidth have to be for this performance inversion to take place? YMMV, but Imagination hints that a system with bandwidth in the “low single digits” of Gbytes/sec would favor the smaller ’45 over the larger ’85 variant. Conversely, if you can provide “dozens of gigabytes per second” of bandwidth, you’ll be happier with the AX2185.
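A rough roofline-style estimate shows why the bigger core’s advantage evaporates when bandwidth is scarce. In the sketch below, the peak figures (4.1 and 1.0 TOPS) are Imagination’s; the bandwidth values of 2 and 30 Gbytes/sec are simply picked from the “low single digits” and “dozens” ranges, and the 100 operations per byte of memory traffic is an assumed workload figure, not a measured one.

```python
def attainable_tops(peak_tops: float, bandwidth_gb_s: float, ops_per_byte: float) -> float:
    """Simple roofline: throughput is capped by compute or by memory, whichever is lower."""
    # Gbytes/sec times ops/byte gives giga-ops/sec; divide by 1000 for TOPS.
    bandwidth_bound_tops = bandwidth_gb_s * ops_per_byte / 1000.0
    return min(peak_tops, bandwidth_bound_tops)

PEAK_TOPS = {"AX2145": 1.0, "AX2185": 4.1}  # vendor-quoted peaks
OPS_PER_BYTE = 100.0                         # assumed workload intensity

for bandwidth in (2.0, 30.0):                # illustrative Gbytes/sec values
    for core, peak in PEAK_TOPS.items():
        tops = attainable_tops(peak, bandwidth, OPS_PER_BYTE)
        print(f"{bandwidth:>5.1f} GB/s  {core}: {tops:.2f} TOPS")
```

Under these assumptions both cores are stuck at the same 0.2 TOPS ceiling at 2 Gbytes/sec, while at 30 Gbytes/sec the AX2185 pulls ahead to 3.0 TOPS. Note that a plain roofline only predicts a tie in the starved case; it does not reproduce the 50% inversion Imagination reports, which presumably comes from effects this simple model leaves out.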

NNAs like these allow designers to stick more intelligence into end nodes, like security cameras that do their own object recognition. That’s great if you’re a camera designer, because you can charge more for your “smart” camera compared to the dumb ones you sold last year. It’s also good news for the overall system, because now you’re not piping full-resolution, full-rate video down an Ethernet cable to a waiting computer, which then has to analyze all those pixels in real-time on its Intel, nVidia, or AMD processor (a task for which they are ill-suited, I can tell you). The whole system gets smarter, network bandwidth is reduced drastically, power consumption probably goes down, and the chance of someone intercepting your raw video stream pretty much disappears. Everybody wins.
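The network savings are easy to ballpark. The sketch below compares an uncompressed video stream with a stream of detection results; the 1080p resolution, 30 frames per second, 10 objects per frame, and 32 bytes per object are all hypothetical, and real cameras usually compress their video, so treat the raw figure as an upper bound.

```python
def raw_video_bytes_per_sec(width: int, height: int, bytes_per_pixel: int, fps: int) -> int:
    """Bytes per second for uncompressed video."""
    return width * height * bytes_per_pixel * fps

def metadata_bytes_per_sec(objects_per_frame: int, bytes_per_object: int, fps: int) -> int:
    """Bytes per second if the camera sends only per-object detection records."""
    return objects_per_frame * bytes_per_object * fps

# Hypothetical camera: 1080p, 24-bit color, 30 frames per second.
video = raw_video_bytes_per_sec(1920, 1080, 3, 30)
# Hypothetical detections: 10 objects per frame, 32 bytes each (label plus bounding box).
events = metadata_bytes_per_sec(10, 32, 30)

print(f"raw video: {video} bytes/sec")
print(f"metadata:  {events} bytes/sec")
print(f"reduction: {video // events}x")
```

Even if compression shrinks the video side by a couple of orders of magnitude, sending conclusions instead of pixels still leaves far less traffic on the wire.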

NNAs are like the DSPs of the 1990s: everybody needs one but nobody’s sure how to program it. Every few years, a new DSP application would materialize that promised to catapult DSP chips and software into the mainstream. Modems! Voice recognition! Graphics! Machine vision! Each time, widespread adoption proved elusive and DSPs remained a niche product, ideally suited to frustratingly narrow application areas.

At first, GPUs looked set to follow the same path. Way too many GPU startups tried and failed to break into the mainstream. Only a few, including PowerVR, nVidia, and ATI (now AMD) survived the initial wave of optimism. Like restaurants, GPU companies tend to fail within the first 18–24 months.

Machine learning, artificial intelligence, and convolutional neural networks fell on the GPU industry like manna from heaven. Suddenly, a whole new application area dropped in their laps, ready-made, and nicely suited to existing chips. It’s like discovering that you can sell your floor wax as a dessert topping, too. But that doesn’t mean you can’t improve on the flavor, and so now the second generation of NNAs is emerging from their GPU progenitors. Existing GPUs were handy, plentiful, and affordable, but not quite perfect. Neural nets may have come as a surprise to the industry, but today’s NNA designers are wasting no time in capitalizing on the opportunity.
