Intel Finally Launches Sapphire Rapids: It’s All About The Accelerators, Baby

After many delays, Intel has finally launched the long-awaited Sapphire Rapids family of server processors, now named the 4th Generation Intel Xeon Scalable processors and the Intel Xeon CPU Max Series. Both names are mouthfuls, which has become typical of Intel product naming. Also typical is Intel’s ability to change the playing field to its advantage. Here, the changed playing field emphasizes the greatly boosted capabilities of the numerous hardwired accelerators and new instruction set architecture (ISA) extensions that Intel has added to these new server CPUs. For specifically targeted, common tasks executed almost universally in data center applications, including artificial intelligence (AI), networking, 5G radio access networks (RANs), data encryption and security, and high-performance computing (HPC), the accelerators deliver truly significant performance gains relative to previous generations of Xeon CPUs and to CPUs from AMD. In all, Intel is launching 52 new Xeon product SKUs.

These built-in accelerators include:

  • Advanced Matrix Extensions (AMX): Improves the performance of deep learning training and inference for workloads including natural language processing, recommendation systems, and image recognition.
  • QuickAssist Technology (QAT): Offloads data encryption, decryption, and compression.
  • Data Streaming Accelerator (DSA): Improves the performance of storage, networking, and data-intensive workloads by speeding up streaming data movement and transformation operations across the CPU, memory caches, and main memory, as well as all attached memory, storage, and network devices.
  • Dynamic Load Balancer (DLB): Improves overall system performance by facilitating the efficient distribution of network processing across multiple CPU cores and threads and dynamically balancing the associated workloads across multiple CPU cores as the system load varies. Intel DLB also restores the order of networking data packets processed simultaneously on CPU cores.
  • In-Memory Analytics Accelerator (IAA): Increases query throughput and decreases the memory footprint for in-memory databases and big data analytics workloads.
  • Advanced Vector Extensions 512 (AVX-512): This accelerator is the latest in the company’s long line of evolved vector instruction sets. It incorporates one or two fused multiply-add (FMA) units and other optimizations to accelerate the performance of intensive computational tasks such as complex scientific simulations, financial analytics, and 3D modeling.
  • Advanced Vector Extensions 512 for virtualized radio access network (AVX-512 for vRAN): The Intel AVX-512 extensions, specifically tuned for the needs of vRAN, deliver greater computing capacity within the same power envelope for cellular radio workloads. This accelerator helps communications service providers increase the performance-per-watt figure of merit for their vRAN designs, which helps to meet critical performance, scaling and energy efficiency requirements.
  • Crypto Acceleration: Moves data encryption into hardware, which increases the performance of pervasive, encryption-sensitive workloads such as the secure sockets layer (SSL) used in web servers, 5G infrastructure, and VPNs/firewalls.
  • Speed Select Technology (SST): Improves server utilization and reduces qualification costs by allowing public, private, and hybrid cloud customers to configure a single server to match fluctuating workloads using multiple configurations, which improves total cost of ownership (TCO).
  • Data Direct I/O Technology (DDIO): Reduces data-movement inefficiencies by facilitating direct communication between Ethernet controllers and adapters and the host CPU’s memory cache, thus reducing the number of visits to main memory, which cuts power consumption while increasing I/O bandwidth scalability and reducing latency.

Extensions to the 4th Generation Intel Xeon CPUs include:

  • Software Guard Extensions (SGX): This previously existing set of security-related extensions to the x86 instruction set architecture (ISA) allows user-level and operating system (OS) code to improve the security of workloads running in virtualized systems by defining protected private regions of memory, called enclaves. Intel claims that SGX is the most researched, updated, and deployed confidential computing technology in data centers on the market today, and these extensions are used by a wide range of cloud service providers (CSPs).
  • Trust Domain Extensions (TDX): These new ISA extensions, available through select cloud providers in 2023, further increase confidentiality at the virtual machine (VM) level beyond SGX. Within a TDX-protected VM, the guest OS and VM applications are further isolated from access by the cloud host, hypervisor, and other VMs on the platform.
  • Control-Flow Enforcement Technology (CET): These hardware-based extensions help to shut down an entire class of system memory attacks by protecting against return-oriented and jump/call-oriented programming attacks, which are two of the most common software-based attack techniques.

It’s critical to note that these new CPUs make important and strategic use of Intel’s heterogeneous, chiplet-based packaging technology to assemble as many as four processor tiles into one package. In addition, the Intel Xeon CPU Max Series uses these same packaging technologies to add two high-bandwidth memory (HBM) chiplet stacks to each CPU tile. HBM is a high-capacity stack of DRAM chiplets that act as a large, high-speed memory cache. Intel claims that the CPU Max Series is the first x86 CPU to incorporate HBM.

Intel rolled out a long list of customers and testimonials for these new CPUs. This list included testimonials from CSPs, server vendors, partners, and end users including some surprise company names. At launch, the companies providing testimonials included Amazon Web Services (AWS), Cisco, Cloudera, Dell Technologies, Ericsson, Fujitsu, Google Cloud, Hewlett Packard Enterprise, IBM Cloud, Inspur Information, Lenovo, Los Alamos National Laboratory (LANL), Microsoft Azure, Nvidia, Numenta, Oracle, Red Hat, SAP, Supermicro, Telefonica, and VMware.

Of particular note from all of these testimonials:

  • Ericsson plans to deploy these new CPUs in its Cloud RAN.
  • LANL reports seeing as much as an 8.57x improvement in some HPC workloads using pre-release CPU silicon.
  • NVIDIA is pairing Intel’s 4th Gen Xeon CPUs with NVIDIA H100 Tensor Core GPUs and NVIDIA ConnectX-7 networking for its latest generation of NVIDIA DGX systems.
  • Supermicro is incorporating the 4th Generation Intel Xeon processors and the Intel Xeon CPU Max Series into more than 50 new server models.
  • VMware will support the new CPU features in vSphere.

Intel has often changed the playing field to gain the upper hand. In the late 1970s, when Intel’s 8086 microprocessor delivered far less performance and much less capability than competing microprocessors from Motorola and Zilog, Intel mounted a superior support and software program that transformed a self-admitted dog of a processor into a world beater. Although there’s nothing dog-like about these new Xeon CPUs, Intel has once more altered the playing field in an attempt to confound AMD’s attempts to gain more market share in the server CPU space. However, AMD has proven that it is game to engage Intel on any playing field. We will need to wait and see how AMD returns this latest volley.
