feature article

Xilinx Unveils U55C Data Center Card

More Power for Less Power

With Supercomputing 2021 underway this week, all eyes are focused on high-performance computing (HPC) and the incredible advances we are seeing in the world’s fastest computers. (OK, not really ALL eyes; some are still focused on TikTok and… well, you know.)

The landscape in HPC is changing rapidly, with supercomputers playing a much larger role in solving the world’s most critical problems. With direct impact on crises such as climate change and the global pandemic, and the rise of new and challenging workloads such as AI, supercomputing is spreading its wings and having a tangible impact on the everyday life of everyone on Earth. 

To mis-paraphrase Peter Parker, handling this great responsibility requires great power. And, that means great computing power as well as copious quantities of electrical energy. Since the death of Dennard Scaling for processors, our ability to increase compute performance has come primarily from parallelism – piling more processors into each package, rack, and room. And, since parallelism scales more-or-less infinitely, the real-world limiting factor for what we can achieve in computing performance is energy. 

Moving forward, energy efficiency is likely to be the most critical consideration for HPC. 

Our old buddy the Von Neumann processor carries the load for the vast majority of computers in the world today. But, for all his great attributes, Von Neumann is not particularly adept at power efficiency. Compounding the problem, the massive amount of data being created today causes crushing problems with memory size and bandwidth on applications that need to consume, store, and process that data. And, all that memory activity translates into more energy consumption.

For decades, FPGAs have had the potential to accelerate broad classes of applications with substantially better energy efficiency than conventional processors. But, taking advantage of the parallel processing capabilities of FPGAs required significant engineering investment, and required rare talent in FPGA design. And, in order to deploy FPGAs in HPC systems, FPGA design skill was required in both the hardware and software realms.

A few years ago, Xilinx declared that they were going “data center first” in their pursuit of emerging markets for FPGAs and programmable logic technology. That meant they needed to overcome both the hardware and software barriers in order to get their silicon into systems where it could shine. And, they were working at a disadvantage, as Intel, whose FPGA business (formerly Altera) is their primary rival, had a death grip on the data center market. That meant Xilinx was swimming upstream trying to get their chips into data centers and supercomputers.

Two of the key elements of Xilinx’s strategy were their Alveo pre-designed accelerator cards, and their Vitis unified software platform, addressing the hardware and software deployment challenges. Alveo forced Xilinx from their usual comfort zone – selling components – to selling systems/solutions. And, Vitis took them away from their historic user – the RTL-savvy digital designer – to the much broader audience of software developers. Both of these required rethinking the way the company did business. 

Now, three years or so into the data center endeavor, Xilinx is announcing a new, much more capable Alveo card, the U55C, along with a standards-based, API-driven clustering solution that allows the cards to be deployed at massive scale: upwards of a thousand FPGAs in a system. The card is a single-slot, full-height, half-length (FHHL) form factor with 150W max power. It doubles the amount of ultra-high-bandwidth HBM2 per FPGA to 16GB (compared with the previous, dual-slot Alveo U280). HBM2 is key in many HPC applications, which are memory-bandwidth limited. The U55C also packs increased compute density into a smaller form factor, with higher power efficiency. Xilinx says U55C is “built for high-density streaming data, high IO math, and big compute problems that require scale out like big data analytics and AI applications.”
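To put those per-card numbers in perspective, here is a minimal back-of-the-envelope sketch of what they add up to at cluster scale. It uses only the 150W maximum and 16GB of HBM2 quoted above. The cluster sizes and the function name `cluster_totals` are assumptions for illustration, and the totals ignore host servers, networking, and cooling, so treat them as a rough bound on the cards alone rather than a real system budget.

```python
# Rough scale-out arithmetic for a hypothetical cluster of U55C cards.
# Per-card figures come from the article; everything else is an assumption.
CARD_MAX_POWER_W = 150   # maximum card power quoted in the article
CARD_HBM2_GB = 16        # HBM2 capacity per card quoted in the article


def cluster_totals(num_cards: int) -> dict:
    """Return worst-case card power (kW) and aggregate HBM2 (GB) for num_cards."""
    return {
        "cards": num_cards,
        "max_card_power_kw": num_cards * CARD_MAX_POWER_W / 1000,
        "total_hbm2_gb": num_cards * CARD_HBM2_GB,
    }


if __name__ == "__main__":
    # Cluster sizes chosen for illustration only.
    for n in (1, 100, 1000):
        print(cluster_totals(n))
```

At the thousand-card scale mentioned above, the cards alone could draw up to 150 kW at their rated maximum while offering 16,000 GB of HBM2 in aggregate, which is why energy efficiency per card matters so much once you scale out.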

Xilinx designed Alveo to work with existing data center infrastructure and network, giving them the vehicle they need to cross the Intel moat into the well-fortified data center and HPC markets. The new RoCE v2-based clustering solution enables customers to build large FPGA-based HPC clusters on top of their existing infrastructure – without having to hire a team of FPGA experts. The API-driven clustering solution takes advantage of RoCE v2 standards and data center bridging coupled with 200 Gbps bandwidth. This enables an Alveo network to compete with InfiniBand networks in performance and latency, without the vendor lock-in. 

Of course, once your fancy hardware is in the system, you still have to program it. 

That’s where the Vitis unified software platform comes in. HPC developers can work in normal, high-level languages and AI frameworks, scaling out data pipelining with shared workloads and memory across hundreds of Alveo cards, independent of the server platform and network they use.

Xilinx has come a long way with the Vitis development platform over the last few years, making FPGA-accelerated computing more readily available to software developers and data scientists without FPGA or hardware expertise. Vitis is somewhat more focused than Intel’s oneAPI framework, but shares the common goal of simplifying the task of deploying applications on complex, heterogeneous computing hardware.

Vitis supports major AI frameworks such as PyTorch and TensorFlow, as well as high-level programming languages like C, C++, and Python. It abstracts away the usual FPGA challenges such as RTL design, synthesis, layout, and timing closure. Xilinx also provides a growing library of pre-optimized IP, giving developers a jump start on deploying many typical HPC applications in existing data centers.

The Alveo U55C card is currently available from Xilinx and distributors. The company also provides evaluation via public cloud-based FPGA-as-a-Service providers, as well as select colocation data centers for private previews. Clustering is available now for private previews, with general availability expected in the second quarter of 2022.

Obviously, the Alveo strategy plays nicely with the impending acquisition of Xilinx by AMD, and will be a key weapon as AMD continues to do battle with Intel over the fertile ground of the data center and HPC. It will be interesting to watch.
