Cortex-A35 Is More, Better, Faster

“Mom, they did it again!”

How do they do it? The engineers at ARM, I mean. They just keep cranking out new microprocessors, month after month, year after year. And they all look… so much the same.

It’s like Taco Bell: they have just three ingredients but they’re brilliant at mixing them around in different ways to look like new products. It’s impressive, really. And evidently quite profitable.

This week’s 64-bit burrito is ARM’s new Cortex-A35 microprocessor core. The –A35 is expected to replace the existing Cortex-A5 and –A7 in ARM’s low-end lineup of A-class processors. (It has nothing to do with the low-range M-class or midrange R-class designs.) As such, ARM expects the –A35 to power next year’s entry-level smartphones, which the company sees as a growing and attractive market.

Unlike the –A5 and –A7, which are both based on version 7 of ARM’s playbook (ARMv7-A), the new –A35 is a 64-bit design based on the newer ARMv8-A architecture. That means the –A35 gets 64-bittedness, a redesigned instruction pipeline, and higher potential clock speeds. A better processor, in other words.

The –A35 is also a lot more “efficient” than its predecessors. That word comes up a lot in ARM’s official press material and in any discussion of the –A35’s benefits. The company describes it as “ultra-high efficiency” and their “most efficient processor ever.” That’s swell, but how, exactly, does one measure efficiency? What are the units? My scientific calculator can’t convert watts, joules, ergs, newtons, or horsepower to “efficiencies.”

By way of clarification, ARM says they’re measuring “performance over power consumption: milliwatts.” Okay, then. It’s a simple ratio of one unknown number over another unknown number.

Performance doing what? A benchmark program? A collection of benchmarks? Whose benchmarks, run under what circumstances? And those milliwatts – are they measured while running those same benchmarks, or while doing something else entirely? And are we comparing the same test(s) on both the old Cortex-A7 and the new –A35, in which case, we’re dealing with two ratios of four unknown numbers? My calculator can’t do that, either.

We can glean a little wisdom from the quantifiable information available, however. The –A35 apparently uses 10% less power than its –A7 predecessor, due to its “more efficient” design. That assumes both are built in the same semiconductor process and run at the same clock frequency. Interestingly, another 10% reduction can be had simply by recompiling your benchmark code using better compilers and redesigning your core layout with better EDA tools, something ARM terms “flow improvements.” All of which suggests that current –A7 users might be leaving some optimizations on the table.

Integer performance improves by a scant 6% over the –A7, which isn’t too surprising since both support the same instruction set sluicing through similar 8-stage pipelines. Some tweaks to branch prediction and a redesigned instruction-fetch stage are the culprits. The –A35, like its predecessors, does only very limited dual issue; anything more aggressive would have sacrificed “efficiency” – and moved it up the company’s product value chain.

Browser performance gets a 16% bump, mostly due to a bigger TLB, and floating-point math improves by one-third, a significant jump, thanks to a redesigned Neon unit. Double-precision FP ops are the biggest beneficiaries. ARM’s best result of all was a 40% boost in Geekbench scores.
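
For what it's worth, here is the back-of-the-envelope version of that "performance over power" ratio, using only the percentages quoted above. It is a rough sketch, not ARM's method: it assumes the 10% power saving applies equally to every workload and that each speedup was measured at the same clock and process as the –A7 baseline, neither of which ARM has confirmed. The names `POWER_RATIO`, `SPEEDUPS`, and `relative_efficiency` are made up for illustration.

```python
# Relative perf-per-milliwatt of the -A35 versus the -A7 (baseline = 1.0).
# Assumption (not from ARM): the quoted 10% power saving holds for every
# workload, and all figures share the same clock speed and process.

POWER_RATIO = 1.0 - 0.10  # -A35 power as a fraction of -A7 power

# Speedups quoted in the article, as -A35 performance relative to the -A7.
SPEEDUPS = {
    "integer": 1.06,
    "browser": 1.16,
    "floating-point": 1.0 + 1.0 / 3.0,  # "improves by one-third"
    "Geekbench": 1.40,
}


def relative_efficiency(speedup: float, power_ratio: float = POWER_RATIO) -> float:
    """Return (new perf / new power) divided by (old perf / old power)."""
    return speedup / power_ratio


if __name__ == "__main__":
    for name, speedup in SPEEDUPS.items():
        print(f"{name:>15}: {relative_efficiency(speedup):.2f}x")
```

Under those assumptions the "efficiency" gain lands somewhere between roughly 1.18x (integer) and 1.56x (Geekbench), which is to say it depends entirely on which benchmark you pick.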

ARM also says that the –A35 consumes 32% less power than its big sister, the –A53 (dyslexia alert!), all other things being equal. That’s remarkable, since they’re both based on the same underlying v8-A architecture and their feature sets are about the same. Like the –A53, the –A35 can be configured with one, two, or four identical CPU cores working together in a single cluster.

The –A35’s role in the world is to replace last year’s models, the –A5 and –A7. There’s a bit of a performance bump over the v7 designs and, apparently, a bit of an efficiency improvement as well. And the new core can run 64-bit code, so there’s that. But overall, you’re looking at a tweaked –A7, something that sits at the bottom end of ARM’s mainstream product line. An upgrade, if you like. But one that comes with a new-CPU price tag.

ARM will continue to license the Cortex-A5 and –A7; there’s no reason to ever refuse to sell them. (You can still buy a license to the ancient ARM7TDMI if you really want to.) But their prices will tumble as the –A35 takes their place. So if you’re an existing user of an –A7, for example, you have to ask yourself if the incremental improvements in the –A35 are worth the price of admission. ARM never publishes its price list, but it’s a safe bet that a new –A35 license will run well into six figures, plus a new royalty agreement. Is a 10% reduction in power consumption (versus the –A7) worth that? Or the 16% improvement in browser performance? To the extent that Geekbench approximates your own workload, its 40% thumping of a same-speed –A7 might make it interesting.

Product-line management is a wonderful thing. Automakers, clothing designers, consumer-electronics firms, and even fast-food restaurants all adjust their offerings to provide an attractive option at every conceivable price level. There’s a real art to making a few basic ingredients look like a full menu. And Cambridge HQ has got that recipe nailed.