Intel has just announced a major counter-strike in the battle for the data center of the future. The company unveiled a new suite of software tools, which (according to a blog post by Barry Davis, General Manager, Accelerated Workloads Group, Intel® Data Center Group) “make FPGA programming accessible to mainstream developers, a major leap forward for customizable silicon solutions that complement the endlessly [SIC] diversity of customer-defined workloads, including 5G network processing, artificial intelligence, data and video analytics, machine learning and more.”
We love to say “Told ya so!”
Back in 2014, around a year before Intel announced their intent to acquire Altera, we ran an article that asked “When Intel Buys Altera – Will FPGAs Take Over the Data Center?” We weren’t just predicting in the dark. At the Gigaom Structure 2014 event, Intel’s Diane Bryant announced that Intel would be “integrating [Intel’s] industry-leading Xeon processor with a coherent FPGA in a single package, socket compatible to [the] standard Xeon E5 processor offerings.”
That got our attention.
Intel wasn’t giving any additional info on the topic – not even the identity of the FPGA they were planning to use. But we wouldn’t be very good journalists if we couldn’t narrow from a possible field of two. The other option, Xilinx, was charging ahead full steam with a strong field of Intel competitors as partners to address the huge opportunity of FPGA-based acceleration. Altera had jumped to an early lead in recognizing the data center opportunity, and it just so happened that Intel was already Altera’s manufacturing partner, AND Intel had already shipped another product – the Atom E600C (better known, perhaps, by its original code name “Stellarton”), which combined an Altera FPGA with an Intel Atom processor to create what turned out to be an epic failure in the embedded market.
If at first you don’t succeed…
But the data center is a whole nother ballgame compared with the embedded market. Intel has struggled (alternative spelling “FAILED”) in their attempts to capture a meaningful presence in embedded, but they OWN the data center – with something like a 6,000% market share. (Our numbers may be a little off there, but you get the idea.) This is a business that has a value to Intel in the double-digit billions per year. It matters. And here sat Altera with what appeared to be a nice lead in the whole concept of capitalizing on the coming data center discontinuity. Plus, “Hey, we’re already building their chips for them!” When you’ve got synergy like that (and an extra $16B or so in cash lying around), why NOT make the move and acquire Altera?
The problem is, buying big companies takes time. Making new data center processors takes more time. Perhaps the most important and difficult part – coming up with a suitable software stack to support programming a whole new kind of data center processor – takes a LOT more time. But market discontinuities don’t just wait around for the previously dominant player to get their act together. Intel’s competitors had already jumped in with both feet. In 2014, Xilinx announced “SDAccel,” which the company bills as a “development environment for OpenCL™, C, and C++ [that] enables up to 25X better performance/watt for data center application acceleration leveraging FPGAs.” Intel’s clock was ticking.
In the time that followed, Intel’s competitors strengthened their attack. With the advent of new critical applications (particularly AI), a data center discontinuity of epic proportions was looming. Data centers would need acceleration to meet the performance and, particularly, the power requirements of the future. If a dominant player like Intel was to be dislodged, this was the time to do it.
Now, three years later, Intel is getting their defenses deployed. With Altera firmly in the family, the company has launched the long-promised new line of data center processors. Altera has launched its Stratix 10 and Arria 10 FPGAs, and now they are following with the software stack. The new announcement has three major components: the Acceleration Stack for Intel Xeon CPU with FPGAs, the Open Programmable Acceleration Engine (OPAE) Technology, and the Intel FPGA Software Development Kit (SDK) for OpenCL.
The Acceleration Stack for Intel® Xeon® CPU with FPGAs consists of software, firmware and tools, designed to make it easier to develop and deploy Intel FPGAs for workload optimization in the data center. It includes hardware interfaces and associated software APIs for interfacing Xeon processors to FPGAs.
Intel’s Open Programmable Acceleration Engine (OPAE) is “a software programming layer that provides a consistent API across FPGA product generations and platforms. It is designed for minimal software overhead and latency, while providing an abstraction for hardware specific FPGA resource details.” Intel has open sourced the technology in an attempt to gain broad adoption in the FPGA-based acceleration arena.
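To make the “consistent API across generations” idea concrete, here is a toy sketch in Python. It is emphatically not the OPAE API: the `Accelerator` interface, the `GenADevice` and `GenBDevice` classes, and their register offsets are all hypothetical names we made up for illustration, and the “hardware” is just a dictionary that adds two numbers. The point is only that the application-level function never sees the device-specific register layout.

```python
from abc import ABC, abstractmethod


class Accelerator(ABC):
    """Hypothetical device-independent interface (NOT the real OPAE API)."""

    @abstractmethod
    def write_arg(self, index: int, value: int) -> None: ...

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def read_result(self) -> int: ...


class _RegisterFileDevice(Accelerator):
    """Simulated device: a dict of registers stands in for real hardware."""

    ARG_BASE = 0x0    # device-specific offset of the argument registers
    RESULT = 0x0      # device-specific offset of the result register

    def __init__(self) -> None:
        self.regs = {}

    def write_arg(self, index: int, value: int) -> None:
        # Arguments are laid out as consecutive 8-byte registers.
        self.regs[self.ARG_BASE + 8 * index] = value

    def start(self) -> None:
        # Stand-in for the accelerated function: add the two arguments.
        a = self.regs.get(self.ARG_BASE, 0)
        b = self.regs.get(self.ARG_BASE + 8, 0)
        self.regs[self.RESULT] = a + b

    def read_result(self) -> int:
        return self.regs[self.RESULT]


class GenADevice(_RegisterFileDevice):
    """Hypothetical older part with one register layout."""
    ARG_BASE = 0x100
    RESULT = 0x200


class GenBDevice(_RegisterFileDevice):
    """Hypothetical newer part with a different register layout."""
    ARG_BASE = 0x4000
    RESULT = 0x4800


def offload_add(dev: Accelerator, x: int, y: int) -> int:
    """Application code: identical for every device generation."""
    dev.write_arg(0, x)
    dev.write_arg(1, y)
    dev.start()
    return dev.read_result()


if __name__ == "__main__":
    for dev in (GenADevice(), GenBDevice()):
        print(type(dev).__name__, offload_add(dev, 2, 3))
```

Swapping `GenADevice` for `GenBDevice` changes where the bits land but not a single line of `offload_add`, which is the kind of insulation an abstraction layer like this is supposed to provide.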
The Intel® FPGA SDK for Open Computing Language (OpenCL™) is a kit that abstracts away the RTL-based FPGA development process using a higher-level software development flow. It allows you to emulate your OpenCL C accelerator code on an x86-based host, and it gives a detailed optimization report with specific algorithm pipeline dependency information. This lets you develop your OpenCL code in a tight iteration loop on an x86 machine, deferring the much longer synthesis and place-and-route FPGA flow until the end. Intel says you can also “leverage prewritten optimized OpenCL or register transfer level (RTL) functions, calling them from the host or directly from within your OpenCL kernels.”
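For readers who haven’t worked with OpenCL, the idea behind host emulation can be sketched in a few lines of Python. This is our own simplified model, not Intel’s emulator: the `emulate_ndrange` and `run_vector_add` helpers are hypothetical names, and a real kernel would be written in OpenCL C and would obtain its index from `get_global_id(0)`. What the sketch does show is why functional debugging doesn’t need the FPGA: a kernel is just a function executed once per work-item, and a CPU can run those work-items one after another.

```python
from typing import Callable, List, Sequence


def vector_add(gid: int, a: Sequence[float], b: Sequence[float],
               out: List[float]) -> None:
    """Body of one work-item.

    In OpenCL C the equivalent would be roughly
    out[gid] = a[gid] + b[gid], with gid = get_global_id(0).
    """
    out[gid] = a[gid] + b[gid]


def emulate_ndrange(kernel: Callable[..., None], global_size: int,
                    *args) -> None:
    """Hypothetical host 'emulator': run every work-item sequentially."""
    for gid in range(global_size):
        kernel(gid, *args)


def run_vector_add(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Host-side wrapper: allocate the output and launch the kernel."""
    out = [0.0] * len(a)
    emulate_ndrange(vector_add, len(a), a, b, out)
    return out


if __name__ == "__main__":
    print(run_vector_add([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
    # prints [11.0, 22.0, 33.0]
```

A loop like this says nothing about how fast the kernel will run in hardware, which is where the optimization report comes in, but it catches logic bugs in seconds rather than after hours of place-and-route.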
It is impossible to overstate the importance of this development flow in the adoption of FPGA acceleration in data centers. Competitively, Altera had a big head start with OpenCL prior to the Intel acquisition, and Xilinx has played catch-up in that arena. On the other hand, Xilinx was out first with a comprehensive development stack similar to the one Intel just announced. One clear advantage Intel enjoys is the dominance of Xeon/x86 in the data center, and their ability to pair tool flows with that architecture.
In the past, we talked a lot about the importance of the ability to run legacy data center software on an accelerated platform. But because of the rapid uptake of new applications – particularly AI and neural networks – legacy code may end up being much less important than performance, power-efficiency, and portability of AI applications. Rather than the software developers, it may be the data scientists who hold the most clout in future data center decisions, and who therefore need to be convinced of the viability of an accelerated data center platform.
Both camps have scored significant early victories in this computing revolution. Xilinx won well-publicized sockets at Baidu and Amazon, Intel/Altera have had a number of high-profile victories at Microsoft, and there are undoubtedly other deals on both sides that have not been publicized. It is interesting to note that an architectural difference is emerging between the two camps: Intel/Altera are leaning toward a paired architecture, where one FPGA is paired with one Xeon, while Xilinx is showing up in pooled architectures, where a pool of many FPGAs is deployed as a networked resource. It’s unclear how much impact this difference will have on those developing applications for these two architectures, or what the real-world performance and efficiency differences will be. It will be interesting to watch.