FPGAs Duel in the Data Center

Key Xilinx and Intel/Altera Strategies Diverge

When there are only two competitors in a race, the tactics change dramatically. Winning is no longer necessarily a matter of simply going as fast as possible. In bicycle match sprints, the winning strategy is actually to stay behind, drafting the leading bike until seconds before the finish line, then catapulting past for the win with a burst of speed built in the wind shadow of the unfortunate leader. In yacht match racing, “covering” is the proven way to victory – mimicking the moves of the rival, and only rarely taking the risk of diverging in order to gain the advantage.

The high-end FPGA market has always been a match race between Xilinx and Altera. For decades, the two companies have jockeyed for position, each trying to outsmart and outrun their adversary with both technological prowess and marketing cunning. When Altera gained ground by using TSMC as their fab partner, Xilinx covered and neutralized that advantage by moving to TSMC themselves. Then, Altera took a risk and jumped to Intel’s fabs for their high-end devices. When one company moved to wider LUT structures, the other followed. When one overhauled their tool suite, the other countered. At every level, each novel innovation on either side was fast-followed by an answer or countermeasure by the opponent.

Now, the FPGA bragging rights battle has moved to the data center, which could be by far the largest new market for FPGA technology in decades. FPGAs can improve every aspect of data-center operation. They can outperform every other networking option with their ability to create fast, software-defined structures that send packets to their final destinations with maximum speed and minimum power consumption. They can compress and optimize the mountains of data being crammed into storage facilities. And, perhaps most interesting of all, they can accelerate computation while dramatically reducing power consumption.

Last year, Intel spent over $16B to acquire Altera, and that’s a lot of coin for a company of Altera’s size (around $2B annual sales) and growth rate. In fact, it’s so much coin that we believe (and Intel has said) that there were key strategic reasons for the acquisition that justified the premium. Obviously (as we’ve said before), one of the biggest strategic reasons would be to protect and future-proof Intel’s dominance in the much larger data-center business.

Sure enough, Intel has announced that it is pursuing a number of tactics to take advantage of Altera’s FPGA prowess combined with Intel’s Xeon-everywhere-ness, creating new solutions for the data center that combine FPGAs with conventional processors to accelerate and reduce power consumption on key high-load applications. But long-time Altera rival Xilinx is not content to just sit around and wait for the Intel cloud to bring doom and gloom to their data center ambitions.

Watching the nascent strategies for both companies, we are starting to see a divergence of paths – fundamental differences in approach and architecture that will likely define the rules of the battle for years to come. Neither Xilinx nor Intel/Altera is either content or properly positioned to merely cover the other’s strategic moves. Each has unique advantages that they must exploit in order to come out on top, and it isn’t clear which company’s course will take them to data-center dominance in the coming decades.

The first major divergence is esoteric, architectural, and subtle, but it has potentially monumental implications. A few years ago, Altera decided to take bold steps toward compute acceleration in the data center. One of the key perceived deficiencies of FPGAs in computation has always been floating point. FPGAs brought enormous fixed-point power to the table via scores of optimized DSP blocks, but floating-point computation was much less efficient. Altera aimed to address that deficiency by adding optimized, hardened floating-point support to their FPGAs through an overhaul of their DSP blocks, adding IEEE 754 single-precision hardened floating-point DSP to the mix.

According to Altera/Intel, this floating-point support is a key advantage, bringing potential TeraFLOPS performance to FPGA-accelerated tasks that is unmatched by Xilinx’s offering. According to Xilinx, Altera’s addition of floating-point support came at the hidden cost of significantly worse performance on narrow fixed-point operations, of the kind required by neural network inference algorithms. Neural-network-based AI is a key application for data centers in the coming decades, so winning at AI is potentially one of the best routes to the bank.

In deep learning, the first step is “training,” where the system learns its job by analyzing mountains of raw data. Once training is complete, the mode changes to “inference,” where the system applies its knowledge to situations in the real world. According to Xilinx, the big potential market in AI is inference, which must be much more broadly deployed than training. If training requires intense floating-point computation and inferencing requires massive small-bit-width fixed-point computation, Xilinx could gain an advantage with faster fixed point, and Altera’s floating point could be a liability. If Altera/Intel is right and floating point dominates the acceleration-using applications in the data center, Altera’s floating point support is a formidable weapon.
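To make the fixed-point versus floating-point distinction concrete, here is a minimal Python sketch of the kind of narrow fixed-point arithmetic that inference can rely on. The symmetric 8-bit quantization scheme, the function names (`quantize`, `int8_dot`), and the sample values are all illustrative assumptions; nothing here is taken from Xilinx's or Altera's tools or DSP blocks.

```python
def quantize(values, bits=8):
    """Map floats to signed integers that share one scale (symmetric quantization)."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0    # avoid dividing by zero for all-zero input
    return [round(v / scale) for v in values], scale


def int8_dot(weights, activations):
    """Dot product computed with integer multiply-accumulates, rescaled once at the end."""
    qw, sw = quantize(weights)
    qa, sa = quantize(activations)
    acc = sum(w * a for w, a in zip(qw, qa))  # integer-only inner loop
    return acc * sw * sa                      # single conversion back to a real value


# Made-up sample data, purely for illustration.
weights = [0.50, -0.25, 0.75, 0.10]
activations = [1.00, 0.50, -0.50, 2.00]

float_result = sum(w * a for w, a in zip(weights, activations))
fixed_result = int8_dot(weights, activations)
print(float_result, fixed_result)
```

The two results agree closely even though the inner loop touches only small integers, which is the property that would let an inference workload lean on narrow fixed-point hardware rather than floating-point units.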

Moving our discussion to other strategy differences – before Intel even hinted at acquiring Altera, they announced that they planned to create new processors that combined their ubiquitous Xeon with an FPGA in the same package. Clearly the plan was to offload compute loads at a very fine-grained level, with a single processor having an FPGA “buddy” available via high-bandwidth, low-latency connection to accelerate compute-intensive operations. Because Intel now owns both the FPGA and the processor pieces of this puzzle, they have ultimate flexibility to optimize this processor-plus-FPGA-device architecture, taking advantage of proprietary EMIB packaging technology to build fast, tight bonds between processor, FPGA, and high-bandwidth memory. Clearly, in this fine-grained race, Intel/Altera has a major strategic advantage.

But, fine-grained architecture isn’t the only approach to FPGA-based acceleration. An alternative school of thought is to pool FPGA resources into clusters where multiple servers can share them in a hyperscale fashion. This is the approach used by Amazon, for example, according to a recent announcement that Xilinx devices will be deployed in a new “Amazon FPGA cloud.” By freeing the FPGA resources to float where they are needed most, rather than chaining each to a corresponding processor, Xilinx claims that utilization of FPGA compute capacity will be much higher. 
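The utilization argument can be illustrated with a toy model. The sketch below is a simplification built on invented numbers: the `demand` table, the function names, and the assumption that any pooled FPGA can serve any server instantly (no network latency, no reconfiguration time) are ours, not Xilinx's or Amazon's.

```python
def utilization(demand, fpgas):
    """Fraction of available FPGA time slots that do useful work.

    demand[s][t] is 1 if server s wants an accelerator during time slot t.
    """
    slots = len(demand[0])
    busy = sum(sum(row) for row in demand)
    return busy / (fpgas * slots)


def pool_size(demand):
    """Smallest shared pool that never makes a server wait: the peak concurrent demand."""
    return max(sum(column) for column in zip(*demand))


# Hypothetical workload: four servers, four time slots, bursty accelerator use.
demand = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]

dedicated = utilization(demand, fpgas=len(demand))     # one FPGA chained to each server
pooled = utilization(demand, fpgas=pool_size(demand))  # FPGAs float to whoever needs them
print(dedicated, pooled)
```

In this contrived workload, the pooled arrangement serves the same demand with half as many devices, so each device is busy twice as often. Real workloads, and the latency cost of reaching a pooled FPGA over the network, would shift the numbers.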

The question of whether the fine-grained or the cluster approach to FPGA acceleration will prevail is far from answered. With the two competitors seeming to head down drastically different paths in this area, it could be a major factor in determining the winner. Of course, Xilinx itself could be bought out any day, and a more intimate relationship with a maker of processors could cause Xilinx to jump to the Intel/Altera fine-grained track.

Clearly, though, the largest single obstacle to wide deployment of FPGA-based acceleration in data centers is software development. FPGAs bring incredible potential to data center computing. We’ve estimated that three orders of magnitude in performance-per-watt is achievable just by optimizing the use of FPGAs as accelerators. The problem has always been programming them. Porting a software application to a heterogeneous reconfigurable computing platform with both FPGAs and conventional processors currently requires a massive investment of engineering expertise and manpower. 

Here too, the two companies are taking significantly different approaches. Altera’s first shot was their initiative to allow FPGAs to be programmed with OpenCL, whereas Xilinx has worked for years on advancing high-level synthesis (HLS) technology, allowing sequential algorithms written in C or C++ to be optimized for FPGA-based implementation. Both strategies today are arguably still in their infancy, with Altera laboring to prove to the GPU crowd that they can target their code to FPGAs with superior results, and with Xilinx working to achieve wide-scale deployment of HLS tools for FPGA design.

We’d have to give Xilinx the advantage on the high-level tools race at this point, but Intel/Altera’s already-long history with their OpenCL approach has them ahead on the experience front. Both companies have a long, long way to go before we have a tool environment that facilitates wide-scale general-purpose deployment of FPGA-based acceleration, however. We are still clearly in an early-adopter phase where only the best-funded data-center customers can avail themselves of FPGAs. 

Speaking of the best-funded data-center customers, the so-called “Super 7” (which includes Facebook, Google, Microsoft, Amazon, Baidu, Alibaba, and Tencent) constitute an enormous amount of data center opportunity – both in terms of the sheer amount of business they represent, and because of their influence on the rest of the data center industry. Obviously, Xilinx’s recent Amazon FPGA Cloud win is a major victory in terms of the company’s ability to beat out Altera/Intel at acquiring a Super 7 customer. But the win may have even more important implications in terms of the software development battle.

By deploying Xilinx FPGAs in Amazon Cloud, Xilinx’s architecture, tools, and technology become the default for the numerous companies taking advantage of Amazon’s services. If you’re already an Amazon Cloud customer and you want to accelerate your application with FPGAs, you are kinda automatically signed up for Xilinx and the coarse-grained pooled-FPGA approach. This gives the Amazon win a potential for Xilinx that far exceeds simply winning a single Super 7 deal against Intel/Altera. It could promote a ramp-up of application development that favors Xilinx’s architectural approach.

Intel has historically relied on what we call the “x86 moat” to defend their dominance in data center processing. But, with FPGAs poised to become major components of data centers moving forward, the ability to efficiently handle legacy x86-optimized software could take a back seat to the new requirement to take advantage of the energy and performance benefits of FPGAs. This could represent a huge discontinuity in data center design, which could also lead to major market shifts in supplying gear to those data centers.

While Intel made a brilliant strategic move in bringing Altera into the fold, there are still significant battles to be fought in determining who will win the lion’s share of the enormous data-center opportunity over the next two decades. It will be interesting to watch.

