Adding Brains to Cars

Imagination’s 4NX NNA Scales Massively

“I not only use all the brains that I have, but all that I can borrow.” – Woodrow Wilson

There’s an episode of Star Trek where the Enterprise goes back in time to 20th-century Earth. Looking down over freeways packed with cars, a crewmember marvels that mere humans could pilot so many vehicles so close together without constant collisions. 

Turns out, it’s hard. Although humans are pretty good at piloting two-ton vehicles in close formation just a few feet apart, teaching computers to do it is even harder. It’s not for a lack of sensors. It’s because we don’t have enough computing power to make sense of it all. Automated driving is (one of) the next killer app(s). 

In hindsight, it’s no surprise that graphics vendors like Nvidia and Imagination Technologies have taken the lead in silicon for self-driving cars. Automation algorithms rely on neural nets, and neural-net processing looks a lot like graphics processing (which, in turn, looks a lot like the digital signal processing of years past). All three require lots of repetitive operations done in parallel. Instead of the IF-THEN-ELSE mentality of “normal” microprocessors, neural net (NN) machines lean heavily on MUL-ADD-REPEAT. 
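For the curious, here is roughly what MUL-ADD-REPEAT looks like in code. This is a minimal Python sketch of a generic multiply-accumulate loop, not anything taken from Imagination's hardware or tools; the function names `neuron` and `layer` are made up for illustration.

```python
def neuron(inputs, weights, bias=0.0):
    """Compute one neuron's weighted sum with a multiply-accumulate loop."""
    acc = bias
    for x, w in zip(inputs, weights):
        acc += x * w  # MUL, then ADD, then REPEAT for the next pair
    return acc


def layer(inputs, weight_rows, biases):
    """Compute a whole layer: one independent MAC loop per neuron.

    No neuron depends on another neuron's result, which is why this
    kind of work spreads so easily across parallel hardware.
    """
    return [neuron(inputs, row, b) for row, b in zip(weight_rows, biases)]


if __name__ == "__main__":
    print(neuron([1, 2, 3], [4, 5, 6]))
    print(layer([1, 2], [[1, 0], [0, 1]], [0.0, 0.0]))
```

Note that there is not a single branch in the inner loop. It is the same arithmetic over and over, which is exactly the workload that graphics-style hardware was built for.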

It’s also no surprise, then, that Imagination has tweaked its popular PowerVR NX architecture to focus even more sharply on the automotive self-driving market. New this week is the Series4 family of neural network accelerators (NNAs). 

The new Series4 line is – surprise! – a follow-on to the company’s existing Series3NX line, first introduced about two years ago, and the 2NX product family before that. The company has dropped the “PowerVR” name from the product line and now prefers simply IMG Series4; the individual designs have 4NX-xx names. 

The 4NX internal hardware architecture and programmer’s model will be familiar to anyone who’s programmed the earlier generations, or, indeed, anyone who’s used PowerVR graphics before. There’s a strong family resemblance throughout the entire catalog, which is no bad thing. 

That said, the 4NX is all new, and one of the biggest changes is that it’s massively scalable. Rather than try to make one big gonzo processor that can handle everything, Imagination takes a divide-and-conquer approach and lets you build out your own grid of NNA engines to whatever size you want. The smallest implementation has exactly one 4NX core, while the largest can handle hundreds. 

Like most multicore processors, 4NX engines are ganged together in clusters. The company offers premade groups of 1, 2, 4, 6, and 8 processors per cluster. Each processor within the cluster has its own private RAM, plus a shared RAM for the cluster. The cluster talks to external memory and to other clusters over a pair of AXI interfaces. Up to four clusters can make a “super cluster,” and it’s possible to have multiple super clusters. Regardless of cluster size or density, all 4NX processors are identical. There’s no “big.LITTLE” option here. 
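The arithmetic behind “hundreds” is easy to check. The sketch below uses only the numbers quoted above (cluster sizes of 1, 2, 4, 6, or 8, and up to four clusters per super cluster); the function name `total_cores` and its parameters are hypothetical, and nothing here reflects an actual Imagination configuration tool.

```python
# Figures taken from the article; everything else is illustrative.
CLUSTER_SIZES = (1, 2, 4, 6, 8)   # premade processors-per-cluster options
MAX_CLUSTERS_PER_SUPER = 4        # up to four clusters per super cluster


def total_cores(cores_per_cluster, clusters_per_super, super_clusters=1):
    """Return the number of 4NX cores in a hypothetical configuration."""
    if cores_per_cluster not in CLUSTER_SIZES:
        raise ValueError(f"cluster size must be one of {CLUSTER_SIZES}")
    if not 1 <= clusters_per_super <= MAX_CLUSTERS_PER_SUPER:
        raise ValueError("a super cluster holds 1 to 4 clusters")
    if super_clusters < 1:
        raise ValueError("need at least one super cluster")
    return cores_per_cluster * clusters_per_super * super_clusters


if __name__ == "__main__":
    print(total_cores(1, 1))      # the smallest implementation
    print(total_cores(8, 4))      # one fully populated super cluster
    print(total_cores(8, 4, 4))   # four super clusters
```

A single maxed-out super cluster tops out at 32 cores, so getting into the hundreds means chaining several super clusters together.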

Neural net algorithms thrive on parallelism, and that’s what 4NX delivers. But parallelism, like freeway driving, is harder than it looks. Scaling out hardware engines is only part of the problem. The real trick is spreading the software workload across all that hardware. Conventional computer-oriented processors (x86, ARM, MIPS, PowerPC, etc.) have a hard time with this, which is why we don’t see PC processors with dozens of CPU cores. Fortunately, DSP, graphics, and neural net workloads can be vectorized much more effectively. 

Tensor tiling is the art and science of splitting the workload across a homogeneous fabric of processors like 4NX. It’s reasonably common with today’s AI platforms, but that doesn’t mean it’s a trivial task. Imagination provides the software tools for tiling on its new product family, a big step toward making 4NX usable. 
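To make the idea concrete, here is a toy version of tiling in plain Python. It splits a matrix multiply into row blocks, one per core, and stitches the partial results back together. This is a generic illustration, not Imagination's tiling tool or algorithm; the names `split_rows`, `matmul`, and `tiled_matmul` are invented for the example, and the “cores” are simply loop iterations.

```python
def split_rows(n_rows, n_cores):
    """Return one (start, stop) row range per core, as evenly as possible."""
    base, extra = divmod(n_rows, n_cores)
    ranges, start = [], 0
    for core in range(n_cores):
        stop = start + base + (1 if core < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def matmul(a, b):
    """Plain matrix multiply on lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def tiled_matmul(a, b, n_cores):
    """Multiply a by b one row tile at a time.

    Each tile needs only its own rows of a plus all of b, so on real
    hardware every tile could be handed to a different core.
    """
    out = []
    for start, stop in split_rows(len(a), n_cores):
        out.extend(matmul(a[start:stop], b))  # this tile's share of the output
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6]]
    b = [[1, 0], [0, 1]]
    print(split_rows(len(a), 2))
    print(tiled_matmul(a, b, 2) == matmul(a, b))
```

The hard part in practice is not the split itself but choosing tile shapes that fit each core's private RAM and keep traffic to shared and external memory low, which is the job the vendor's tools are meant to take on.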

The 4NX is just weeks away from “delivery,” in the sense that Imagination will ship RTL to customers around mid-December. A few unnamed automotive OEMs have already taken delivery, however, so expect some 4NX-based test chips around the end of next year. Assuming a few automakers like what they see, the technology might be on the road a few years after that, say, around the 2024 model year. The future is almost here! 
