
Ampere Ups the ARM Ante

Company Tees Up Next Generation of ARM-Based Server Chips

They say there’s no such thing as “the cloud.” It’s just somebody else’s computer. That’s true, but it doesn’t mean that their computer is the same as your computer. Today, most cloud datacenter servers are x86 machines just like your desktop PC except bigger and farther away. But that doesn’t have to be the case. 

Silicon Valley company Ampere Computing thinks that cloud datacenters really should be different from remote PCs, starting with the processor and its instruction set. And today, the company started to lift the veil on its plans to make that happen. 

Ampere’s first-generation Altra processor is already in the market and has been “shipping for revenue since last year,” according to Chief Product Officer Jeff Wittich. It’s about to be joined by the upgraded Altra Max chip, which should enter production in Q3 of this year. Both chips are based on ARM’s Neoverse N1 design running at 3 GHz in TSMC’s 7nm N7 process. 

But Altra and Max are just the warm-up act before Ampere’s second generation of processors debuts, possibly by next year. The as-yet-unnamed devices will be based on an entirely new ARM core that Ampere is designing in-house instead of borrowing from Neoverse. Like Apple and a small handful of other companies, Ampere has been quietly designing its own custom ARM implementations. 

Details about the new generation are few. It’ll be fabricated in TSMC’s 5nm N5 process, have more than 128 cores, and offer faster memory and I/O than Altra, all while remaining fully ARM-compatible. The company isn’t saying whether it will base the chip on the recently announced ARMv9 architecture specification. “It’s more nuanced than that,” hints Wittich. 

Ampere is able to design its own ARM-compatible CPU cores thanks to a rare (and expensive) ARM architectural license that it acquired indirectly from AppliedMicro and that company’s X-Gene project. “This is what we’ve been working on for the past three and a half years,” says Wittich, pegging the start of CPU development to the founding of the company. In other words, this was their plan all along. 

Having an in-house processor gives Ampere “a more rapid annual cadence” of product introductions than it could have by waiting for ARM’s official rollouts. Ampere says it will add security features to its new core, along with new elements for manageability, telemetry, and resiliency – all things server operators want to see. 

In the meantime, existing Altra customers can look forward to Altra Max later this year. Altra Max ups the core count to 128 (from Altra’s 80). That’s 60% more processor goodness in the same pin-compatible package. Both run at a solid 3.0 GHz, with no “turbo mode” or variable clock scaling like you’d see on a server-class x86 chip such as Intel’s Xeon or AMD’s Epyc processors. That’s deliberate, and part of what makes Altra different. 
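For readers who want to check the math, the core-count jump works out as follows. This is a minimal sketch that uses only the two core counts quoted above; the function name `core_uplift` is made up for illustration.

```python
# Core counts quoted in the article.
ALTRA_CORES = 80
ALTRA_MAX_CORES = 128


def core_uplift(old_cores, new_cores):
    """Return (additional cores, percentage increase) going from old to new."""
    extra = new_cores - old_cores
    return extra, 100.0 * extra / old_cores


extra, pct = core_uplift(ALTRA_CORES, ALTRA_MAX_CORES)
print(f"{extra} additional cores, {pct:.0f}% more than Altra")
# prints "48 additional cores, 60% more than Altra"
```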

Ampere believes that cloud server workloads are fundamentally different from client PC workloads, starting with the clocking. Servers are shared, and one processor core’s clock frequency shouldn’t affect that of its neighbors. Conventional x86 chips throttle clock speed to remain within a defined thermal envelope, which means a high-demand task running on one CPU core of, say, a 32-core chip might force a slowdown of the other 31 cores. Intel and AMD euphemistically refer to this as turbo mode because it sounds better than don’t-melt-the-chip mode. 

Altra and Altra Max, in contrast, run at a consistent clock rate all the time. In a sense, they’re always in turbo mode and the company says there’s no combination of workloads that will overheat the chips or force a slowdown. Predictability is preserved. 
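To see why a shared thermal budget makes clock speed unpredictable, consider a toy model. Everything in it is an assumption made for illustration: the 32-core chip, the linear watts-per-GHz power model, the budget, and the 4.5 GHz boost clock are hypothetical and not taken from any Intel, AMD, or Ampere datasheet. Real processors follow far more complicated power curves.

```python
def neighbor_clock_ghz(budget_w, n_cores, boosted_ghz, watts_per_ghz=1.0):
    """Clock available to each non-boosted core under a shared power budget.

    Toy model (an assumption, not vendor data): each core draws
    watts_per_ghz * clock, and the whole chip must stay within budget_w.
    """
    remaining_w = budget_w - boosted_ghz * watts_per_ghz
    return max(remaining_w, 0.0) / ((n_cores - 1) * watts_per_ghz)


N_CORES = 32                          # hypothetical core count
BASE_GHZ = 3.0                        # clock when every core runs alike
BUDGET_W = N_CORES * BASE_GHZ * 1.0   # budget sized so all cores fit at base clock

# Nobody boosts: every neighbor keeps the base clock.
print(neighbor_clock_ghz(BUDGET_W, N_CORES, boosted_ghz=3.0))

# One core boosts to 4.5 GHz: the other 31 each have to give something back.
print(neighbor_clock_ghz(BUDGET_W, N_CORES, boosted_ghz=4.5))
```

In this model, a fixed-clock design is simply the case where no core is ever allowed above the base clock, so no neighbor ever has to slow down.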

Ampere’s chips also don’t implement hyperthreading. They’re all single-threaded CPU cores, so the number of cores equals the number of execution threads. That, too, is a nod toward independence and determinism. Server tasks are often broken down into microservices, where multithreading isn’t helpful. It’s more important, says Ampere, that tasks don’t compete for hardware resources or interfere with each other. 

That strategy plays out in the chips’ cache organization, too. Altra and Max both have large L1 and L2 caches, with a comparatively small L3. The last-level cache would be shared among CPU cores (and thus, among tasks), which doesn’t suit the multi-tenancy model of servers. 

The bottom line is that performance scales almost linearly with core count – assuming, of course, that you’re running single-threaded microservices that don’t interact with one another. Ampere hasn’t suddenly found a magical solution to multiprocessor load balancing problems; the company simply focuses its efforts on a subset of tasks that suit its target market. And its chip architecture. 
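Ampere didn’t cite it, but Amdahl’s law is the textbook way to see why the “don’t interact” caveat matters so much. The sketch below assumes that model; the 5% serial fraction is a hypothetical figure chosen for illustration, not a measurement of any real workload.

```python
def amdahl_speedup(n_cores, serial_fraction):
    """Speedup over a single core per Amdahl's law: 1 / (s + (1 - s) / n)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)


for cores in (80, 128):
    independent = amdahl_speedup(cores, 0.0)   # fully independent tasks
    contended = amdahl_speedup(cores, 0.05)    # hypothetical 5% serialized work
    print(f"{cores} cores: {independent:.0f}x independent, "
          f"{contended:.1f}x with 5% serial work")
```

With a serial fraction of zero, the formula collapses to a speedup equal to the core count, which is the linear scaling described above. Any shared or serialized work pulls the curve well below that line.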

Wittich points out that users can reduce the processor’s clock frequency if they want to save power, but they never have to. Altra Max operates within the same physical, electrical, and thermal envelopes as Altra, despite having 48 additional CPU cores. At full speed, Altra Max delivers more performance than an x86 processor, or, with the voltage and frequency turned down, it can deliver the same performance for less energy. 
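Turning down voltage and frequency together pays off because of the standard first-order approximation for CMOS dynamic power, P ≈ C·V²·f. The sketch below assumes that approximation and ignores leakage and everything outside the CPU cores. The 10% voltage and 20% frequency reductions are hypothetical values, not Ampere’s.

```python
def relative_dynamic_power(v_scale, f_scale):
    """Dynamic power relative to baseline, using P ~ C * V^2 * f."""
    return v_scale ** 2 * f_scale


def relative_energy_per_task(v_scale):
    """Energy for a fixed amount of work relative to baseline.

    Energy = power * time, and time scales as 1 / f, so frequency cancels
    and only the V^2 term remains (in this simplified model).
    """
    return v_scale ** 2


V_SCALE = 0.9   # hypothetical: run at 90% of nominal voltage
F_SCALE = 0.8   # hypothetical: run at 80% of nominal frequency

print(f"power:  {relative_dynamic_power(V_SCALE, F_SCALE):.3f} of baseline")
print(f"energy: {relative_energy_per_task(V_SCALE):.3f} of baseline per task")
```

Under these assumptions, the energy saving comes from the voltage drop. Lowering frequency alone stretches the job out without reducing the energy it takes, but a lower clock is what lets the chip run at a lower voltage in the first place.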

That performance-per-watt ratio has driven a lot of ARM-based server projects… right into the ground. It’s a compelling technical challenge and an attractive market. Who wouldn’t want 1% or 5% of Intel’s lucrative server-processor business? And yet, the failures outnumber the successes by a large irrational number. Ampere may be shipping Altra chips for revenue, but it’s not shipping a whole lot of them for revenue. Ampere’s big-name partners – Microsoft, Oracle, Cloudflare – seem to be kicking the tires, not backing up forklifts loaded with Altra chips. Only one customer, Equinix, has Altra-based servers online and ready for the average Joe to use. But hey, you gotta start somewhere.

The market for PC processors started out with one or two dominant vendors, went through a brief period with a lot of startup competitors, then settled back to one or two dominant vendors. Maybe Ampere is right. Maybe the cloud server market really will be different.
