
ARM’s Race Escalates with Cortex-A9

In military parlance, an Osprey is a propeller-driven airplane that takes off and lands vertically, like a helicopter. The Osprey tilts its wings 90 degrees, the props pull it straight up, and the wings flip back again for conventional flight. Clever engineering, but a bit ungainly to look at.

Over in the less dangerous but equally contentious microprocessor world, ARM has also hatched its own Osprey, this one officially named the ARM Cortex-A9. The new A9 will be capable of 2-GHz clock rates, an unheard-of speed for an ARM core. The previous-generation A8 was barely able to make 1 GHz (see Embedded Technology Journal, July 28, 2009, “Better, Stronger, Faster”), and even that required some silicon sleight of hand from Intrinsity. At 2 GHz, the new A9 becomes the most potent weapon from the ARM’s dealer.

How did ARM achieve that speed? You might say it was by fixing the A8, but that would be ungracious. A more charitable analysis shows that the A9 actually has a shorter pipeline than the A8, usually the antithesis of high-speed efficiency. But the A8’s pipeline was deliberately long to tolerate implementation-specific tweaks. It also allowed for some margin in ARM’s first-ever superscalar core. Now that the company (and its licensees) has some experience fabricating superscalar ARM cores, it could fine-tune the logic paths, trim some margin, upgrade the core, and still produce a faster version that’s also more capable.

Improvements include a better floating-point unit and out-of-order execution. The A8 was in-order only, even though it was dual-issue superscalar. The A9 also comes in one-, two-, and four-core variations; the customer can choose the number of cores at implementation time. Although ARM doesn’t officially see it that way, the A8 now looks like a bit of a tentative first step into the world of superscalar Cortex implementations. The company expects that some customers will still license the A8 instead of the A9, probably opening a pricing gap between them.

A Soft Pitch and a Hard Core

Like most ARM cores, the A9 will be delivered either as “soft IP” or as a tuned hard macro. The company says it’s signed 15 licensees so far, even though the final version of the IP won’t be available until late this year. The list includes the usual suspects: Texas Instruments, Ericsson, nVidia, NXP, and Toshiba, as well as a gaggle of anonymous customers. Expect the A9 to appear first in mobile devices (a traditional ARM stronghold), then in automotive “infotainment” systems and living-room electronics such as HDTVs. This last product category is traditionally MIPS territory, so the A9 is a clear shot across the rival RISC vendor’s bow.

Like most of the recent high-end ARM processors, the A9 comes with an assortment of coprocessors baked right in. Jazelle acts as a Java accelerator; TrustZone aids security; Neon speeds up graphics. These accelerators are always part of the microprocessor core and can’t be removed to save either space or power. There’s little reason to: they account for just a small fraction of the silicon real estate (much smaller than the caches, for example), and consume next to no power when they’re not being used. Your software is free to ignore them, of course.

ARM Flexes Its Bicep

So how fast is the new A9? Um, pretty fast. Benchmarking a naked processor core, sans memory and I/O, is a speculative endeavor at best. ARM claims 10,000 Dhrystone MIPS, which is a bit like saying your jet fighter can do four million furlongs per fortnight. It might be accurate for some set of circumstances but it’s still largely meaningless.

The A9 should hit high triple-digit megahertz clock rates with no problem in a modern 45nm process optimized for power efficiency. With all the knobs turned the other way, 2 GHz is doable, although power consumption will quadruple, from insignificant to merely tiny.

For what it’s worth, an x86 processor like Intel’s Atom N270 delivers about 2000 Dhrystone MIPS at 1.6 GHz, so ARM is claiming something like 5x the performance at a 25% faster clock frequency. Again, that’s a specific implementation (in this case, a low-cost netbook computer) versus a disembodied processor core… but the difference is stark nonetheless. And it’s a given that the A9 will consume less power, no matter how it’s fabricated.
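The arithmetic behind that comparison is easy to check. The short Python sketch below uses only the figures quoted above and assumes that ARM's 10,000 Dhrystone MIPS claim applies at the 2-GHz clock rate; the variable and function names are illustrative, not anything from ARM or Intel.

```python
# Figures as quoted in the article; pairing the 10,000 DMIPS claim
# with the 2-GHz clock is an assumption made for this sketch.
A9_DMIPS, A9_MHZ = 10_000, 2_000
ATOM_DMIPS, ATOM_MHZ = 2_000, 1_600


def dmips_per_mhz(dmips, clock_mhz):
    """Dhrystone MIPS delivered per megahertz of clock."""
    return dmips / clock_mhz


perf_ratio = A9_DMIPS / ATOM_DMIPS    # claimed raw performance advantage
clock_ratio = A9_MHZ / ATOM_MHZ       # clock-frequency advantage
per_clock_ratio = (dmips_per_mhz(A9_DMIPS, A9_MHZ)
                   / dmips_per_mhz(ATOM_DMIPS, ATOM_MHZ))

print(f"performance: {perf_ratio:.2f}x")
print(f"clock:       {clock_ratio:.2f}x")
print(f"per clock:   {per_clock_ratio:.2f}x")
```

Dividing out the clock advantage leaves roughly a 4x per-clock gap on these numbers, with the same caveat as before: one side is a shipping netbook chip and the other is a vendor claim for a bare core.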

Using the more reliable EEMBC CoreMark test, the A8 has about a 2x advantage over the Atom N270. In other words, it delivers about the same performance at half the clock speed. And one-fifth the power. Not a bad tradeoff, as long as you don’t need x86 compatibility.

This last bit highlights one of the enduring frustrations for microprocessor designers. No matter how fast, efficient, or clever your CPU is, you can’t escape simple human inertia. Backward compatibility is a powerful force in the universe, and no amount of engineering can overcome it. Netbook computers were – briefly – seen as a second chance for good processors to overcome Intel’s dominance of the computing market. Instead, they’re just another new niche for the x86, albeit a less profitable one. It turns out that people want PCs that behave like PCs, not $350 Linux notebooks.

When Is an ARM Ahead?

So the Cortex-A9 brings ARM well and truly into the ranks of grown-up processor vendors. Superscalar? Check. Multicore? Check. Out-of-order execution, two-level caches, and coprocessors? Check, check, and check. Even PC-like speed grades are available if you have all the right ingredients in your fabrication line. On the hardware front, ARM has just about everything that the MIPS, PowerPC, and x86 vendors have.

On the software side, ARM’s story is getting better and better. It’s already the de facto architecture for wireless and/or mobile applications, thanks to a few early design wins with European cell phones. Just as the first IBM Personal Computer Model 5150 determined Intel’s fate, those early cell phones cast the die for ARM. Most everything that followed has been beer and skittles. Since then, ARM’s legion of licensees has shipped more 32-bit processors than anyone. Just last year, 4 billion new ARM-based chips went out the door: that’s 11 million per day, or about 8 times the volume of MIPS-based processors.
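For anyone who wants to check the volume figures, here is a quick Python sketch. It assumes a 365-day year and takes the "8 times" ratio at face value to back out an implied MIPS volume; that implied number is derived here for illustration and is not a figure reported by either company.

```python
ARM_CHIPS_PER_YEAR = 4_000_000_000   # figure quoted in the article
DAYS_PER_YEAR = 365                  # assumption: non-leap year
ARM_TO_MIPS_RATIO = 8                # "about 8 times", per the article

chips_per_day = ARM_CHIPS_PER_YEAR / DAYS_PER_YEAR
implied_mips_per_year = ARM_CHIPS_PER_YEAR / ARM_TO_MIPS_RATIO

print(f"ARM chips per day: about {chips_per_day / 1e6:.0f} million")
print(f"Implied MIPS chips per year: about {implied_mips_per_year / 1e6:.0f} million")
```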

Is it fair? Doesn’t matter. That’s how real product engineering works. You build the best product you can and you wait for customers to make their typically uninformed and irrational decisions. For the time being, ARM and x86 dominate much of the landscape, with other architectures flourishing in their own little niches. It’s a good time for designers choosing a processor and a good time for a few lucky processor vendors. With the Cortex-A9, ARM looks poised to continue its vertical takeoff.
