feature article
Subscribe Now

Xilinx Unveils U55C Data Center Card

More Power for Less Power

With Supercomputing 2021 underway this week, all eyes are focused on high-performance computing (HPC) and the incredible advances we are seeing in the world’s fastest computers. (OK, not really ALL eyes, some are still focused on Tik Tok and… well, you know).

The landscape in HPC is changing rapidly, with supercomputers playing a much larger role in solving the world’s most critical problems. With direct impact on crises such as climate change and the global pandemic, and the rise of new and challenging workloads such as AI, supercomputing is spreading its wings and having a tangible impact on the everyday life of everyone on Earth. 

To mis-paraphrase Peter Parker, handling this great responsibility requires great power. And, that means great computing power as well as copious quantities of electrical energy. Since the death of Dennard Scaling for processors, our ability to increase compute performance has come primarily from parallelism – piling more processors into each package, rack, and room. And, since parallelism scales more-or-less infinitely, the real-world limiting factor for what we can achieve in computing performance is energy. 

Moving forward, energy efficiency is likely to be the most critical consideration for HPC. 

Our old buddy the Von Neumann processor carries the load for the vast majority of computers in the world today. But, for all his great attributes, Von Neumann is not particularly adept at power efficiency. Compounding the problem, the massive amount of data being created today causes crushing problems with memory size and bandwidth on applications that need to consume, store, and process that data. And, all that memory activity translates into more energy consumption.

For decades, FPGAs have had the potential to accelerate broad classes of applications with substantially better energy efficiency than conventional processors. But, taking advantage of the parallel processing capabilities of FPGAs required significant engineering investment, and required rare talent in FPGA design. And, in order to deploy FPGAs in HPC systems, FPGA design skill was required in both the hardware and software realms.

A few years ago, Xilinx declared that they were going “data center first” in their pursuit of emerging markets for FPGAs and programmable logic technology. That meant they needed to overcome both the hardware and software barriers in order to get their silicon into systems where it could shine. And, they were working at a disadvantage as Intel, their primary rival in the FPGA business (formerly Altera), had a death grip on the data center market. That meant Xilinx was swimming upstream trying to get their chips into data centers and supercomputers.

Two of the key elements of Xilinx’s strategy were their Alveo pre-designed accelerator cards, and their Vitis unified software platform, addressing the hardware and software deployment challenges. Alveo forced Xilinx from their usual comfort zone – selling components – to selling systems/solutions. And, Vitis took them away from their historic user – the RTL-savvy digital designer – to the much broader audience of software developers. Both of these required rethinking the way the company did business. 

Now, three years or so into the data center endeavor, Xilinx is announcing a new, much more capable Alveo card – the U55C, along with a standards-based API-driven clustering solution that allows them to be deployed at a massive scale – upwards of a thousand FPGAs in a system. The card is a single-slot full height, half length (FHHL) form factor with 150W max power. It doubles the amount of ultra-high-bandwidth HBM2 per FPGA to 16GB (compared with the previous, dual-slot Alveo U280). It offers superior compute density. HBM2 is key in many HPC applications which are memory bandwidth limited.  U55C also packs increased compute density in a smaller form factor, with higher power efficiency. Xilinx says U55C is “built for high-density streaming data, high IO math, and big compute problems that require scale out like big data analytics and AI applications.”

Xilinx designed Alveo to work with existing data center infrastructure and network, giving them the vehicle they need to cross the Intel moat into the well-fortified data center and HPC markets. The new RoCE v2-based clustering solution enables customers to build large FPGA-based HPC clusters on top of their existing infrastructure – without having to hire a team of FPGA experts. The API-driven clustering solution takes advantage of RoCE v2 standards and data center bridging coupled with 200 Gbps bandwidth. This enables an Alveo network to compete with InfiniBand networks in performance and latency, without the vendor lock-in. 

Of course, once your fancy hardware is in the system, you still have to program it. 

That’s where the Vitis unified software platform comes in. HPC developers can develop using normal, high-level languages and AI frameworks. scaling out data pipelining with shared workloads and memory across hundreds of Alveo cards, independent of what server platform and network they use.

Xilinx has come a long way with the Vitis development platform over the last few years, making FPGA-accelerated computing more readily available to to software developers and data scientists without FPGA or hardware expertise. Vitis is somewhat more focused than Intel’s oneAPI framework, but shares the common goal of simplifying the task of deploying applications on complex, heterogeneous computing hardware. 

Vitis supports major AI frameworks such as Pytorch and Tensorflow, as well as high-level programming languages like C, C++ and Python. It abstracts away the usual FPGA challenges such as RTL design, synthesis, layout, and tining closure. Xilinx also provides a growing library of pre-optimized IP, giving developers a jump start on deploying many typical HPC applications in existing data centers.

The Alveo U55C card is currently available  from Xilinx and distributors. The company also provides evaluation via public cloud-based FPGA-as-a-Service providers, as well as select colocation data centers for private previews. Clustering is available now for private previews, with general availability expected in the second quarter of next year.

Obviously, the Alveo strategy plays nicely with the impending acquisition of Xilinx by AMD, and will be a key weapon as AMD continues to do battle with Intel over the fertile ground of data center and HPC computing. It will be interesting to watch.

Leave a Reply

featured blogs
May 26, 2022
Introducing Synopsys Learning Center, an online, on-demand library of self-paced training modules, webinars, and labs designed for both new & experienced users. The post New Synopsys Learning Center Makes Training Easier and More Accessible appeared first on From Silico...
May 25, 2022
The Team RF "μWaveRiders" blog series is a showcase for Cadence AWR RF products. Monthly topics will vary between Cadence AWR Design Environment release highlights, feature videos, Cadence... ...
May 25, 2022
There are so many cool STEM (science, technology, engineering, and math) toys available these days, and I want them all!...
May 24, 2022
By Neel Natekar Radio frequency (RF) circuitry is an essential component of many of the critical applications we now rely… ...

featured video

Increasing Semiconductor Predictability in an Unpredictable World

Sponsored by Synopsys

SLM presents significant value-driven opportunities for assessing the reliability and resilience of silicon devices, from data gathered during design, manufacture, test, and in-field. Silicon data driven analytics provide new actionable insights to address the challenges posed to large scale silicon designs.

Learn More

featured paper

5 common Hall-effect sensor myths

Sponsored by Texas Instruments

Hall-effect sensors can be used in a variety of automotive and industrial systems. Higher system performance requirements created the need for improved accuracy and more integration – extending the use of Hall-effect sensors. Read this article to learn about common Hall-effect sensor misconceptions and see how these sensors can be used in real-world applications.

Click to read more

featured chalk talk

Sensor Technologies Here to Stay: Post-pandemic

Sponsored by Infineon

Today sensor technology has become integral to our everyday lives. And in the future, sensor technology will mean even more than it does today. In this episode of Chalk Talk, Amelia Dalton chats with David Jones from Infineon about the future of sensor technologies and how they are going to impact our lives in the post-pandemic world. They investigate how miniaturization, built-in antennas in-package and the evolution of radar technology have helped usher in a whole new era of sensing technologies and how all of this and more will help us live healthier and happier lives.

Click here for more information about Infineon's sensor technology portfolio