There has been rampant speculation this week on rumors that Intel is in negotiations to buy Altera – a deal that could be worth over ten billion dollars, and which would be the largest acquisition in Intel’s history. While neither company has said anything publicly yet, there is a substantial amount of information available from which to evaluate the potential impact of such a move and to speculate about the reasons behind it.
We actually predicted this eight months ago in our aptly-named article “When Intel Buys Altera” (subtle title, no?), and the arguments we made back then still apply today. But, with almost another year of progress under our collective belts, we should be able to raise the resolution on our crystal ball considerably. While there has been a considerable amount of press and analyst attention on these rumors, we think the analysts are largely off base. We’ll go into the problems with the analyst theories separately, but, for now, here is our take:
Also – please note that there has not been any deal announced as of this writing. We are speculating here – caveat emptor.
The Cold War of Computing
Intel, as everyone knows, has been dominant in processors for personal computing and data centers for decades. In recent years, however, the personal computing market has gone flat, and margins and profits from personal computing have sagged. Increasingly, Intel has relied on high-end data center processors to deliver the margins and profits that the world’s largest semiconductor company requires to stay alive and relevant.
But Intel is by no means the leading supplier of processors. If we compare the number of processors deployed worldwide, Intel’s contribution is dwarfed by ARM. With the explosion of mobile computing, ARM quickly took over the title of most prolific processor architecture – with billions installed each year worldwide. Intel has tried (and failed) several times to get into the mobile game. Despite noble efforts like the Atom processor family, the company has been unable to get the kind of traction required to make even a tiny dent in ARM’s mobile dominance. That left the processor world in a kind of unspoken treaty – detente, if you will – with Intel staying on the high-end side of the neutral zone, and ARM focusing on super low-power mobile computing.
The New Battleground – The Data Center
For years, data center dominance was about cubic MIPS. Heavy-iron processors that delivered the maximum amount of compute performance ruled the roost – price and power be damned. That fact is what has kept Intel’s margins and profits afloat. And the global shift of the computation load to the cloud created an insatiable appetite for gigahertz, as vast new server farms filled their hungry racks with Intel’s cache-laden cash cows.
But then, a disturbing market and technology trend rocked the stable tranquility of our processor treaty. Data center priorities changed. The biggest cost in running a data center is power, so energy consumption (not processor speed) became the hot commodity in server hardware. It is estimated that data centers now consume somewhere between one and ten percent of the world’s electricity, and that share is projected to grow significantly. When building a new data center today, the highest priority is access to cheap power – which is why many huge installations are located near hydroelectric dams and other sites where massive amounts of inexpensive electricity and cooling are available.
ARM was quick to notice this trend, and they reacted by boldly crossing over into the neutral zone. The company created a series of high-performance, data-center-grade processor architectures that take full advantage of ARM’s arsenal of low-power computing technology. The resulting products pose a serious threat to Intel in the new generation of low-power data center processing.
ARM, of course, doesn’t have a fab to support, and it doesn’t have to hold margins on chips. That problem is left to a frenzied mob of ARM licensees – all of whom are more than willing to swap margins for market share in desperate hopes of clawing their way into some fuzzy future lucrative position. The result could be a whole series of ARM-powered options pushing and shoving their way into big RFQs for server silicon.
Intel has, so far, successfully defended their turf against the ARM-based microservers. They’ve done so with a “cover all bases” approach: extending multi-core Xeon devices up to 18 cores; enabling Atom-based microservers with very ARM-like power/performance numbers; and creating the Xeon Phi co-processor, with up to 61 cores. All of the above are shipping today on 22nm FinFET. The number of cores will certainly be greater on 14nm FinFET, and power/performance will get a substantial boost with the upcoming Skylake microarchitecture. In short, Intel deployed impressive conventional defenses to protect themselves against conventional attacks.
If we tune our crystal ball to the precise frequency of the war room at Intel, we can probably predict what the Intel braintrust was thinking: “So far, so good – microservers really haven’t impacted the data center in a material way, and, if they do, we can answer with our entire quiver of alternatives.”
The braintrust was also probably comforted by the fact that the Intel data center fortress is surrounded by the formidable x86 moat – billions of lines of legacy data center application code, already optimized and debugged with x86 servers in mind. “We are safe,” they probably thought. “No new architecture will be able to cross the moat – causing the re-write of all that working software.” Intel also took advantage of its lead in FinFET technology, especially on the Xeon product lines where the improved power/performance had the greatest impact.
The Doomsday Weapon – FPGA-based Heterogeneous Computing
Nothing is more troublesome to those relying on moat-based defenses than the arrival of a catapult. And, in mid-2014, a catapult most certainly did arrive – rolled onto the battlefield by none other than the Bing folks at Microsoft. “Catapult” was (appropriately) the name of a Microsoft project that accelerated the Bing search engine algorithms using hybrid processor/FPGA computing. According to Microsoft’s published paper on the project, “[The Catapult Project] showed that a medium-scale deployment of FPGAs can increase ranking throughput in a production search infrastructure by 95% at comparable latency to a software-only solution. The added FPGA compute boards only increased power consumption by 10%.”
This is a big deal. No, this is a huge deal. If pairing conventional processors with FPGAs could reduce data center power consumption by a significant factor, it could be motivation enough for big armies to try swimming the moat. If it worked for Bing, it didn’t take much vision to picture players like Google, Facebook, YouTube, Amazon and Netflix poking up their periscopes to see if FPGA-based computing could lop a zero or two off of their power bills. It would be worth a significant investment in re-optimizing software to reduce power consumption by that much. The Catapult proof point showed that it could be done.
Staunch adherents to Andy Grove’s “only the paranoid survive” philosophy, the Santa Clara braintrust clearly recognized the arrival of this truly disruptive morphological technology. Indeed, Intel not only took note of the new technology, they announced a hybrid Xeon-FPGA product. As we noted at the time of that announcement, a number of important items were curiously absent: the identity of the FPGA vendor (one tiny detail) and the name of the product. It goes without saying that price and availability were also nowhere to be found.
Meanwhile, Xilinx and Altera each introduced real, tangible, working hybrid computing chip families that not only went into production, but had actual product names (Zynq and SoC FPGA, respectively). Both of these families featured complete multi-core ARM processor subsystems with very high-bandwidth on-chip connections to an FPGA fabric. Nothing to worry about for Intel yet, right? After all, these devices are intended for embedded applications, not data centers. But, clearly, the monolithic integration of these devices yielded significant advantages over simply parking processors next to FPGAs. The on-chip connection between processor and FPGA fabric brings major benefits in performance and power consumption, and nothing Intel could do simply partnering with FPGA companies would offset that potential advantage. From Intel’s perspective, it could be troubling.
Even more troubling was the fact that Altera has been very vocally touting a design flow based on OpenCL (a programming language designed to accelerate high-performance computing algorithms onto highly-parallel graphics processors) – allowing OpenCL code to be “compiled” for FPGAs. Then, late last year, Xilinx announced their SDAccel platform – clearly positioning Zynq devices for data center applications. So, there it is. Altera as a “frenemy” with their OpenCL FPGA flow, and Xilinx with a capable, monolithic hybrid computing device, already in volume production, with a real working production tool flow. It looks like Zynq and SoC FPGA are not just for embedded stuff after all. These new FPGA platforms pose a clear and present danger – missiles with considerable destructive power. And they seem to be aimed directly at Intel’s homeland.
Further increasing the potential angst, Xilinx came along about a month ago and announced that the upcoming 16nm FinFET-powered Zynq UltraScale+ devices will step up to a quad-core, 64-bit ARM Cortex-A53, along with a plethora of special purpose processing units – all tightly connected to high-performance FPGA fabric. One can imagine all of this making a serious Richter-scale wiggle on the paranoia meter in Santa Clara. “What if this hybrid computing model takes off? And, more to the point, what if it takes off with ARM processors?” This is classic Clayton Christensen disruption: humble embedded devices like Zynq and SoC FPGA working their way up from below and making themselves comfortable in the data center.
Obviously, anything with the potential to take off in the data center on ARM processors would trigger an “all hands on deck” response at Intel. The more they studied the possibilities, the more seriously they probably took hybrid computing. We used the term “morphological” a few paragraphs back, and you may still be wondering what it means. In this context, it means taking a collection of existing technologies and merging them to create a brand new one. CPUs with good power/performance characteristics, check. FPGA fabric with far better power/performance characteristics, check.
And finally, an end-to-end software development flow suitable for development and deployment of data center applications – powered by advanced technologies such as high level synthesis – enabling mortal software programmers to efficiently exploit the hybrid computing model, uh, ummm… kinda’ sorta’ check… -ish? Uh, OK, maybe not really. Let’s talk about that more in a minute.
Welcome to the brave new world of FPGA-based heterogeneous computing.
Those readers paying close attention may reason that the yet-to-be-named Xeon-FPGA hybrid processors could represent a perfectly reasonable first step for Intel, if they can ever be bothered to name them. If FPGA-based hybrid computing plays more than a token role on the Intel data center roadmap, however, delivering these devices in partnership with Altera simply does not cut it. Intel needs to own all of the elements we checked off in the paragraph above.
To protect their data center dominance, Intel needs to own the conventional processor. It may be an Atom processor (which, in its 64-bit, quad-core incarnations, is no longer your father’s Atom) or it may be a Xeon, but, at the risk of stating the obvious, Intel really, really needs it NOT to be an ARM processor. Furthermore, if the conventional-processor part of the equation is x86-based, Intel gets a free pass over their own moat. All that legacy server code will run as-is, and customers can re-engineer their software to take advantage of the power/performance improvement from the FPGA fabric at their leisure.
But to overcome the inherent problems with a two-chip solution, Intel needs to own the FPGA fabric. Not just manufacture it, own it. Sure, Intel could initially use “off the shelf” Stratix devices packaged with Xeon processors, but they will quickly want hybrid-compute-specific FPGA fabric, and they will quickly want to overcome the latency, performance, and power penalties incurred by connecting the processors to the FPGA fabric over an external link such as QPI or PCI Express. The endgame is a fully integrated monolithic solution, designed specifically for the data center, and that positively calls for owning the FPGA fabric.
Intel needs to own the entire hybrid compute tool chain and drive the ecosystem, both of which are core competencies.
Software Tools – The Achilles’ Heel
The problem with the Catapult proof point is that not every data center application is as straightforward as search. If you can isolate one or two important snippets of code in your application and spend the time and energy to optimize those to take advantage of the FPGA fabric in a hybrid platform, you can reap the huge power/performance benefits. Bing and Google could afford to throw lots of talent at the problem.
But, to help the general case of data center applications, you need a tool flow that’s a lot closer to a “compiler” – something you could point at a pile of legacy code and say, “Make this work!” The software tools that can accomplish that task are still far away.
And, for Intel, even the software tools that can get FPGA fabric to do anything at all are far away. In fact, software tools are the biggest defense Altera and Xilinx wield in repelling attacks on their FPGA duopoly. Time and time again, FPGA startups have attracted funding, created impressive chips, and crashed and burned because they couldn’t get a reliable, working tool flow that enabled customers to use them. Xilinx’s and Altera’s software tools have undergone decades of evolution by hundreds of talented engineers, informed by tens of thousands of customer designs. They have proven to customers that they can reliably get from code to working FPGA – a feat that no other company has ever accomplished.
As we said in our earlier article, Intel could probably put every engineer they have on the task of developing FPGA tools for the next decade, and they would not be able to reach the point that Xilinx and Altera occupy today.
But Xilinx and Altera are still far short of the tool environment required to enable data center software engineers to take advantage of hybrid FPGA processors. Yes, Altera has an OpenCL methodology that can take high-performance computing code originally written for graphics processors (GPUs) and make it work on FPGAs. Xilinx has their SDAccel with HLS technology that can take algorithms written in C, C++, and OpenCL and (with the participation of competent hardware engineers) get something that would work on a hybrid platform such as their Zynq devices. In fact, just last week, Xilinx doubled down a bit – announcing their “SDSoC platform”. SDSoC is designed to enable teams with both software and hardware experts to work together to bring hybrid computing applications to life.
In order to move the development tools to the next level for data center applications, Intel needs to do more than simply partner with Xilinx or Altera. Left to their own devices, those two companies would continue their decades-old feud – fighting for supremacy in the communications infrastructure market, and trying to expand by moving their Zynq and SoC FPGA devices into emerging markets like embedded vision, industrial automation, automotive, and other applications under the “IoT” umbrella. They might eventually get around to the data center – if it’s convenient.
The Bottom Line
To defend their data center profit machine against the looming threat of ARM-based, FPGA-enabled insurgents, Intel needs to own Xilinx or Altera. They need to drive the creation of a monolithic processing platform designed for data centers that optimally integrates Intel-architecture processors with FPGA fabric. They need to provide a working, stable software development environment that allows mainstream data center applications to be easily ported – in order to take advantage of the incredible performance and power advantages that such chips could bring, while maintaining forward compatibility for legacy x86 applications. They cannot simply “partner” and achieve these goals.
Why are the current rumors pointing to Altera rather than Xilinx? Altera has already made an investment in the data center as a target market. Altera is already partnering with Intel to manufacture their next-generation 14nm FinFET devices. Altera was first to the table with the OpenCL flow.
Apparently, Altera arrived at the Intel party with their dancing shoes on.