ARM’s Race Escalates with Cortex-A9

In military parlance, an Osprey is a propeller-driven airplane that takes off and lands vertically, like a helicopter. The Osprey tilts its wings 90 degrees, the props pull it straight up, and the wings flip back again for conventional flight. Clever engineering, but a bit ungainly to look at.

Over in the less dangerous but equally contentious microprocessor world, ARM has also hatched its own Osprey, this one officially named the ARM Cortex-A9. The new A9 will be capable of 2-GHz clock rates, an unheard-of speed for an ARM core. The previous-generation A8 was barely able to make 1 GHz (see Embedded Technology Journal, July 28, 2009, “Better, Stronger, Faster”), and even that required some silicon sleight of hand from Intrinsity. At 2 GHz, the new A9 becomes the most potent weapon from the ARM’s dealer.

How did ARM achieve that speed? You might say it was by fixing the A8, but that would be ungracious. A more charitable analysis shows that the A9 actually has a shorter pipeline than the A8, usually the antithesis of high-speed efficiency. But the A8’s pipeline was deliberately long to tolerate implementation-specific tweaks. It also allowed for some margin in ARM’s first-ever superscalar core. Now that the company (and its licensees) has some experience fabricating superscalar ARM cores, it could fine-tune the logic paths, trim some margin, upgrade the core, and still produce a faster version that’s also more capable.

Improvements include a better floating-point unit and out-of-order execution. The A8 was in-order only, even though it was dual-issue superscalar. The A9 also comes in one-, two-, and four-core variations; the customer can choose the number of cores at implementation time. Although ARM doesn’t officially see it that way, the A8 now looks like a bit of a tentative first step into the world of superscalar Cortex implementations. The company expects that some customers will still license the A8 instead of the A9, probably opening a pricing gap between them.

A Soft Pitch and a Hard Core

Like most ARM cores, the A9 will be delivered either as “soft IP” or as a tuned hard macro. The company says it’s signed 15 licensees so far, even though the final version of the IP won’t be available until late this year. The list includes the usual suspects: Texas Instruments, Ericsson, nVidia, NXP, and Toshiba, as well as a gaggle of anonymous customers. Expect the A9 to appear first in mobile devices (a traditional ARM stronghold), then in automotive “infotainment” systems and living-room electronics such as HDTVs. This last product category is traditionally MIPS territory, so the A9 is a clear shot across the rival RISC vendor’s bow.

Like most of the recent high-end ARM processors, the A9 comes with an assortment of coprocessors baked right in. Jazelle acts as a Java accelerator; TrustZone aids security; Neon speeds up graphics. These accelerators are always part of the microprocessor core and can’t be removed to save either space or power. There’s little reason to: they account for just a small fraction of the silicon real estate (much smaller than the caches, for example), and consume next to no power when they’re not being used. Your software is free to ignore them, of course.

ARM Flexes Its Bicep

So how fast is the new A9? Um, pretty fast. Benchmarking a naked processor core, sans memory and I/O, is a speculative endeavor at best. ARM claims 10,000 Dhrystone MIPS, which is a bit like saying your jet fighter can do four million furlongs per fortnight. It might be accurate for some set of circumstances but it’s still largely meaningless.

The A9 should hit clock speeds in the high hundreds of megahertz with no problem in a modern 45nm process optimized for power efficiency. With all the knobs turned the other way, 2 GHz is doable, although power consumption will quadruple, from insignificant to merely tiny.

For what it’s worth, an x86 processor like Intel’s Atom N270 delivers about 2000 Dhrystone MIPS at 1.6 GHz, so ARM is claiming something like 5x the performance at a 25% faster clock frequency. Again, that’s a specific implementation (in this case, a low-cost netbook computer) versus a disembodied processor core… but the difference is stark nonetheless. And it’s a given that the A9 will consume less power, no matter how it’s fabricated.
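For the skeptical, the arithmetic behind that comparison is easy to check. The sketch below uses only the figures quoted above and assumes that ARM's 10,000 Dhrystone MIPS claim goes with the 2-GHz clock rate; the variable and function names are purely illustrative.

```python
# Figures quoted in the article. Pairing the 10,000 DMIPS claim with
# the 2-GHz clock rate is an assumption made for this sketch.
A9_DMIPS = 10_000
A9_CLOCK_MHZ = 2_000
ATOM_DMIPS = 2_000
ATOM_CLOCK_MHZ = 1_600


def dmips_per_mhz(dmips, clock_mhz):
    """Normalize a Dhrystone MIPS score by clock frequency."""
    return dmips / clock_mhz


# Raw performance ratio and clock-frequency ratio.
performance_ratio = A9_DMIPS / ATOM_DMIPS
clock_ratio = A9_CLOCK_MHZ / ATOM_CLOCK_MHZ

# Ratio of work done per clock cycle, which removes the clock advantage.
per_clock_ratio = (
    dmips_per_mhz(A9_DMIPS, A9_CLOCK_MHZ)
    / dmips_per_mhz(ATOM_DMIPS, ATOM_CLOCK_MHZ)
)

print(f"Performance ratio: {performance_ratio:.2f}x")
print(f"Clock ratio:       {clock_ratio:.2f}x")
print(f"Per-clock ratio:   {per_clock_ratio:.2f}x")
```

Strip out the clock-speed advantage and the claimed gap narrows from 5x to 4x per clock, which is still a wide margin if the Dhrystone numbers are taken at face value.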

Using the more reliable EEMBC CoreMark test, the A8 has about a 2x advantage over the Atom N270. In other words, it delivers about the same performance at half the clock speed. And one-fifth the power. Not a bad tradeoff, as long as you don’t need x86 compatibility.

This last bit highlights one of the enduring frustrations for microprocessor designers. No matter how fast, efficient, or clever your CPU is, you can’t escape simple human inertia. Backward compatibility is a powerful force in the universe, and no amount of engineering can overcome it. Netbook computers were – briefly – seen as a second chance for good processors to overcome Intel’s dominance of the computing market. Instead, they’re just another new niche for the x86, albeit a less profitable one. It turns out that people want PCs that behave like PCs, not $350 Linux notebooks.

When Is an ARM Ahead?

So the Cortex-A9 brings ARM well and truly into the ranks of grown-up processor vendors. Superscalar? Check. Multicore? Check. Out-of-order execution, two-level caches, and coprocessors? Check, check, and check. Even PC-like speed grades are available if you have all the right ingredients in your fabrication line. On the hardware front, ARM has just about everything that the MIPS, PowerPC, and x86 vendors have.

On the software side, ARM’s story is getting better and better. It’s already the de facto architecture for wireless and/or mobile applications, thanks to a few early design wins with European cell phones. Just as the first IBM Personal Computer Model 5150 determined Intel’s fate, those early cell phones cast the die for ARM. Most everything that followed has been beer and skittles. Since then, ARM’s legion of licensees has shipped more 32-bit processors than anyone. Just last year, 4 billion new ARM-based chips went out the door. That’s 11 million per day, or about 8 times the volume of MIPS-based processors.
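Those volume figures can be sanity-checked in a few lines. This is a rough sketch that assumes a 365-day year; the MIPS volume it prints is only what the "8 times" ratio implies, not a number reported here.

```python
# Figures quoted in the article; the 365-day year is an assumption.
ARM_CHIPS_PER_YEAR = 4_000_000_000
DAYS_PER_YEAR = 365
ARM_TO_MIPS_RATIO = 8

# Average daily shipments of ARM-based chips.
arm_chips_per_day = ARM_CHIPS_PER_YEAR / DAYS_PER_YEAR

# Annual MIPS-based volume implied by the 8x ratio (derived, not reported).
implied_mips_per_year = ARM_CHIPS_PER_YEAR / ARM_TO_MIPS_RATIO

print(f"ARM chips per day: {arm_chips_per_day / 1e6:.1f} million")
print(f"Implied MIPS chips per year: {implied_mips_per_year / 1e6:.0f} million")
```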

Is it fair? Doesn’t matter. That’s how real product engineering works. You build the best product you can and you wait for customers to make their typically uninformed and irrational decisions. For the time being, ARM and x86 dominate much of the landscape, with other architectures flourishing in their own little niches. It’s a good time for designers choosing a processor and a good time for a few lucky processor vendors. With the Cortex-A9, ARM looks poised to continue its vertical takeoff.
