Low-Power Servers: Opportunity or Oxymoron?

New ARM-Based Server Chips Change the Data-Warehouse Landscape

You knew it had to be about ARM. Everything today is about ARM. So it’s no big surprise that ARM is elbowing its way into the formerly sacrosanct halls of server farms. You know, those big echoing hallways filled with racks upon racks of server blades, all humming along as they power the Matrix—er, the Internet.

Server racks have traditionally been the domain of big, burly, he-man microprocessors from the likes of Intel, Sun, or IBM. Only the beefiest, most power-mad processors need apply. “This is man’s work, sonny, and no little sprouts are welcome here. See what we’re doing here? We’re building the future! Why, we eat Google searches for lunch! Step aside as I process another million Amazon transactions.”

Yeah, well, good luck with that. Better check your rear-view mirrors, guys, because there’s an upstart about to run right up your tailpipe. There’s a 90-pound weakling in your midst and he’s fixin’ to kick sand in everybody’s face.

It’s finely refined sand, mind you. Silicon, in fact. And it’s about to get in the eyes of the big-iron server-chip vendors.

It’s no secret that ARM-based chips have pretty much taken over the world of cell phones and, by extension, tablets and other not-really-a-computer gadgets. ARM chips have a reputation for being stingy with power (partially true) and inexpensive (also partially true). But they’ve never been viewed as… you know… especially powerful. They’re fine for toys and games, but you wouldn’t want your server using one.

Servers require ultimate performance. Servers require massive data caches. Servers require big power cables, blinking LEDs, and loud fans. Running a server on an ARM-based chip would be like towing a Boeing 747 with hummingbirds. Lots and lots of hummingbirds.

Server, meet hummingbird. Two sane and reputable chip companies (plus a number of lesser-known startups) are readying production of ARM-based processor chips aimed specifically at servers. And not weenie little home servers, either. We’re talking big iron: server blades for Amazon, Google, Yahoo!, Netflix, and the other titans of the Internet. Bandwidth from the bandy-legged. Who woulda thunk it?

First among these upstarts is Calxeda, a Texas-based company that has designed its “EnergyCore” processor family around the familiar ARM Cortex-A9. Or, more accurately, multiple Cortex-A9s. Calxeda’s initial chip will have four A9s on it, all running a little north of 1 GHz apiece. That’s nice, but is it really server-level performance? After all, the cute little iPad 2 runs on a dual-core A9, and nobody’s wiring those up to optical backbones.

Hang on; it gets better. EnergyCore chips also have five (count ’em) 10-Gbit XAUI interfaces on them (that’s network-speak for a fast pipe). And SGMII. And PCI Express. And SATA. The list goes on, but you get the idea: it’s a big network-I/O chip with some processors inside. The bandwidth of those network interfaces easily exceeds the processing power of the four A9s, but that’s okay, because not all network traffic will pass through the processors. And that brings us to the entire point of Calxeda’s strategy.

EnergyCore chips are intended to be installed in clusters, linked together like a mesh. Clustered together, you get an awesome amount of network bandwidth and a semi-awesome amount of computing power, all through relatively low-cost chips that burn only a few watts apiece. You can add or subtract chips to get the price/performance level you want, all without gutting and replacing your expensive server hardware. It’s very scalable, which server buyers really like.

Over on the West Coast, Applied Micro (APM) is working on a similar strategy. APM hasn’t revealed the details of its chips yet, except to say that they’ll be based on the newly announced 64-bit ARM v8 architecture (see “When I’m Sixty-Four” at https://www.eejournal.com/archives/articles/20111102-64bit/). They’ll also run at faster clock rates, around 3 GHz, with up to 32 cores per chip. That should position APM’s devices higher up the performance ladder than Calxeda’s.

How do ARM vendors get off thinking they can design server chips? Aren’t those little cell phone processors woefully underpowered? Well, yes and no. Turns out, most big server traffic is just that: traffic. It’s more important to move the data from Point A to Point B than it is to massage it along the way. Server processors don’t do all that much processing in the usual sense; that’s why we have dozens of companies making specialized communications chips. What processing they do perform typically comes in short bursts, on transient data, across a lot of unrelated packets at once. In that kind of environment, a cluster of independent processors can work just as well as (or even better than) one big processor. Big processors like Intel’s Xeon and Itanium were designed (initially, at least) for big computing problems. Network servers aren’t like that. They’re the canonical example of parallel multitasking, so parallel multicore chips do a pretty good job of it. And if you factor in the energy consumed per unit of work performed, a cluster of small cores is actually far more efficient.
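
To put rough numbers on that efficiency argument, here’s a back-of-the-envelope sketch in Python. Every figure in it is a hypothetical placeholder, not vendor data; the point is only to show the arithmetic server buyers actually do. Because the requests are independent, a cluster’s requests-per-joule scales linearly with node count, so per-node efficiency is what wins or loses the deal.

```python
# Back-of-the-envelope comparison of throughput per watt.
# Every number here is a hypothetical placeholder, not a measured
# figure for any real Xeon, EnergyCore, or APM part.

def requests_per_joule(requests_per_sec: float, watts: float) -> float:
    """Requests served per joule of energy: (req/s) / W = req/J."""
    return requests_per_sec / watts

# One big server processor: high absolute throughput, high power draw.
big_chip = requests_per_joule(requests_per_sec=200_000, watts=130)

# A cluster of 24 low-power quad-core nodes: each one is modest, but the
# workload (independent requests) spreads across them with little overhead.
nodes, per_node_reqs, per_node_watts = 24, 15_000, 5
cluster = requests_per_joule(nodes * per_node_reqs, nodes * per_node_watts)

print(f"big chip: {big_chip:7.0f} requests per joule")
print(f"cluster : {cluster:7.0f} requests per joule")
# Because the requests don't interact, the cluster's aggregate efficiency
# equals its per-node efficiency -- adding nodes buys throughput without
# hurting the energy-per-request number.
```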

That’s got Intel, Sun (now Oracle), and the PowerPC folks a bit worried. Servers were traditionally the preserve of “big iron” processors that pushed the limits of performance. Server processors were the company flagship and also a nice profit center. Intel’s server chips, for example, are astonishingly lucrative, considering they’re based on the same basic CPU architecture as the company’s embedded Atom processors.

So if ARM’s advantage is mainly power efficiency, why can’t Intel simply dial down the power of its Xeon chips and undermine the business strategy of Calxeda and APM? They’re trying. They’re trying very hard, in fact.

Intel’s problem—if you can call it that—is that it’s too popular. Its x86 chips have developed a huge installed base and an enormous third-party software ecosystem. That popularity has cemented the x86 architecture in place; Intel can’t change it. And the architecture needs to change if it is to compete effectively against ARM-based (or any RISC-based) intruders.

It’s not as though Intel’s chip designers aren’t smart enough; they are. I know some of them personally and they’re the smartest CPU architects and circuit designers in the room, if not the entire galaxy. It’s just that Intel’s whole value proposition is based on its microprocessors being x86 compatible, and you can’t be “partially compatible.” The designers are duty-bound to implement the entire x86 instruction set, warts and all, including oddball features accumulated over 40 years of development. Granted, some little-used instructions can be implemented in microcode or through emulation if the designers want to, but they absolutely, positively have to be in there in some form. You can’t jettison all that compatibility and RISC-ify the x86 and still call it an x86 chip. For good or ill, these guys are saddled with the oldest, strangest, and least orthogonal CPU architecture in the world. That they make it work at all is a testament to how smart they really are.

Intel has brand recognition and customer momentum in its favor. It also has the world’s best semiconductor manufacturing technology under its own roof. The ARM vendors have the advantage of a newer architecture but are handicapped slightly by less-advanced silicon production. They’re also starting from a position of weakness, having to build up relationships with new customers and displace the incumbent Intel (or IBM, et al.).

Server users (meaning you and me) don’t really care what processor is inside a remote server blade. Server buyers (e.g., Amazon) do care; they’re paying for the hardware and they’re paying for the power and air conditioning to keep the hardware happy. They’re the ones who will decide which server design style wins. Tried and true, or new and improved? I think the old-school server chip vendors need to keep an eye on the rear-view mirror. They’re about to be overtaken.
