ARM’s Race Escalates with Cortex-A9

In military parlance, an Osprey is a propeller-driven airplane that takes off and lands vertically, like a helicopter. The Osprey tilts its wings 90 degrees, the props pull it straight up, and the wings flip back again for conventional flight. Clever engineering, but a bit ungainly to look at.

Over in the less dangerous but equally contentious microprocessor world, ARM has also hatched its own Osprey, this one officially named the ARM Cortex-A9. The new A9 will be capable of 2-GHz clock rates, an unheard-of speed for an ARM core. The previous-generation A8 was barely able to make 1 GHz (see Embedded Technology Journal, July 28, 2009, “Better, Stronger, Faster”), and even that required some silicon sleight of hand from Intrinsity. At 2 GHz, the new A9 becomes the most potent weapon from the ARM’s dealer.

How did ARM achieve that speed? You might say it was by fixing the A8, but that would be ungracious. A more charitable analysis shows that the A9 actually has a shorter pipeline than the A8, which is usually the antithesis of high-speed efficiency. But the A8’s pipeline was deliberately long to tolerate implementation-specific tweaks. It also allowed for some margin in ARM’s first-ever superscalar core. Now that the company (and its licensees) has some experience fabricating superscalar ARM cores, it can fine-tune the logic paths, trim some margin, upgrade the core, and still produce a faster version that’s also more capable.

Improvements include a better floating-point unit and out-of-order execution. The A8 was in-order only, even though it was dual-issue superscalar. The A9 also comes in one-, two-, and four-core variations; the customer can choose the number of cores at implementation time. Although ARM doesn’t officially see it that way, the A8 now looks like a bit of a tentative first step into the world of superscalar Cortex implementations. The company expects that some customers will still license the A8 instead of the A9, probably opening a pricing gap between them.

A Soft Pitch and a Hard Core

Like most ARM cores, the A9 will be delivered either as “soft IP” or as a tuned hard macro. The company says it’s signed 15 licensees so far, even though the final version of the IP won’t be available until late this year. The list includes the usual suspects: Texas Instruments, Ericsson, nVidia, NXP, and Toshiba, as well as a gaggle of anonymous customers. Expect the A9 to appear first in mobile devices (a traditional ARM stronghold), then in automotive “infotainment” systems and living-room electronics such as HDTVs. This last product category is traditionally MIPS territory, so the A9 is a clear shot across the rival RISC vendor’s bow.

Like most of the recent high-end ARM processors, the A9 comes with an assortment of coprocessors baked right in. Jazelle acts as a Java accelerator; TrustZone aids security; Neon speeds up graphics. These accelerators are always part of the microprocessor core and can’t be removed to save either space or power. There’s little reason to: they account for just a small fraction of the silicon real estate (much smaller than the caches, for example), and consume next to no power when they’re not being used. Your software is free to ignore them, of course.

ARM Flexes Its Bicep

So how fast is the new A9? Um, pretty fast. Benchmarking a naked processor core, sans memory and I/O, is a speculative endeavor at best. ARM claims 10,000 Dhrystone MIPS, which is a bit like saying your jet fighter can do four million furlongs per fortnight. It might be accurate for some set of circumstances but it’s still largely meaningless.

The A9 should hit high triple digits (in megahertz) with no problem in a modern 45nm process optimized for power efficiency. With all the knobs turned the other way, 2 GHz is doable, although power consumption will quadruple, from insignificant to merely tiny.

For what it’s worth, an x86 processor like Intel’s Atom N270 delivers about 2000 Dhrystone MIPS at 1.6 GHz, so ARM is claiming something like 5x the performance at a 25% faster clock frequency. Again, that’s a specific implementation (in this case, a low-cost netbook computer) versus a disembodied processor core… but the difference is stark nonetheless. And it’s a given that the A9 will consume less power, no matter how it’s fabricated.
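For readers who want to check that arithmetic, here is a minimal Python sketch. It uses only the figures quoted above and assumes, as the comparison implies, that ARM's 10,000 Dhrystone MIPS claim applies at the 2-GHz clock rate. The function and variable names are purely illustrative, not anything from ARM or Intel.

```python
# Back-of-envelope comparison using only figures quoted in the article.
# Assumption: ARM's 10,000 DMIPS claim corresponds to the 2-GHz clock rate.

def dmips_per_mhz(dmips: float, clock_mhz: float) -> float:
    """Normalize a Dhrystone MIPS score by clock frequency."""
    return dmips / clock_mhz

A9_DMIPS, A9_MHZ = 10_000, 2_000      # ARM's claim for the Cortex-A9
ATOM_DMIPS, ATOM_MHZ = 2_000, 1_600   # Atom N270, approximate figures

raw_ratio = A9_DMIPS / ATOM_DMIPS     # headline performance ratio
clock_ratio = A9_MHZ / ATOM_MHZ       # how much faster the A9 is clocked
per_clock_ratio = (
    dmips_per_mhz(A9_DMIPS, A9_MHZ) / dmips_per_mhz(ATOM_DMIPS, ATOM_MHZ)
)

print(f"raw: {raw_ratio:.2f}x, clock: {clock_ratio:.2f}x, "
      f"per clock: {per_clock_ratio:.2f}x")
```

Normalized this way, roughly 4x of the claimed 5x gap would come from work done per cycle rather than from the faster clock, with all the usual caveats about Dhrystone still attached.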

Using the more reliable EEMBC CoreMark test, the A8 has about a 2x clock-for-clock advantage over the Atom N270. In other words, it delivers about the same performance at half the clock speed. And one-fifth the power. Not a bad tradeoff, as long as you don’t need x86 compatibility.

This last bit highlights one of the enduring frustrations for microprocessor designers. No matter how fast, efficient, or clever your CPU is, you can’t escape simple human inertia. Backward compatibility is a powerful force in the universe, and no amount of engineering can overcome it. Netbook computers were – briefly – seen as a second chance for good processors to overcome Intel’s dominance of the computing market. Instead, they’re just another new niche for the x86, albeit a less profitable one. It turns out that people want PCs that behave like PCs, not $350 Linux notebooks.

When Is an ARM Ahead?

So the Cortex-A9 brings ARM well and truly into the ranks of grown-up processor vendors. Superscalar? Check. Multicore? Check. Out-of-order execution, two-level caches, and coprocessors? Check, check, and check. Even PC-like speed grades are available if you have all the right ingredients in your fabrication line. On the hardware front, ARM has just about everything that the MIPS, PowerPC, and x86 vendors have.

On the software side, ARM’s story is getting better and better. It’s already the de facto architecture for wireless and/or mobile applications, thanks to a few early design wins with European cell phones. Just as the first IBM Personal Computer Model 5150 determined Intel’s fate, those early cell phones cast the die for ARM. Most everything that followed has been beer and skittles. Since then, ARM’s legion of licensees has shipped more 32-bit processors than anyone. Just last year, 4 billion new ARM-based chips went out the door: that’s 11 million per day, or about 8 times the volume of MIPS-based processors.

Is it fair? Doesn’t matter. That’s how real product engineering works. You build the best product you can and you wait for customers to make their typically uninformed and irrational decisions. For the time being, ARM and x86 dominate much of the landscape, with other architectures flourishing in their own little niches. It’s a good time for designers choosing a processor and a good time for a few lucky processor vendors. With the Cortex-A9, ARM looks poised to continue its vertical takeoff.
