How much system security is too much? By the way, that’s a rhetorical question. No matter how much security you put into a system’s design, if the protected data is valuable enough or if bad actors just think the data’s valuable enough, they will try to steal it, corrupt it, or counterfeit it. Processor vendors spend an increasing amount of time and devote a growing number of design resources to develop new security schemes for their processors.
For example, AMD offers a variety of security features in its various Ryzen processors, including AMD Memory Guard for memory encryption, AMD Shadow Stack to help detect and thwart control-flow attacks, and AMD Secure Boot, which establishes an unbroken chain of trust from the AMD silicon root of trust to the BIOS. AMD has also incorporated the Microsoft Pluton Security Processor, which specifically hardens Windows 11 PCs against myriad attacks.
For its part, Intel equips its Xeon processors with SGX (Software Guard Extensions), a technology that employs hardware-based memory encryption to isolate application code and data in memory. The Intel Converged Security and Management Engine (CSME), a separate, on-chip security controller that’s isolated from the processor’s CPU, serves as the root of trust for components in Intel-based systems. Intel even developed an external component called the Intel Platform Firmware Resilience (PFR) module, which implements additional hardware-based firmware security using an Intel MAX 10 FPGA. Intel offers PFR as an option for the company’s 3rd-Gen Intel Xeon Processor platforms. The list of additional security features that AMD and Intel have added in response to new threats is long and getting longer. Other processor vendors take their own, similarly ad hoc approaches to creating secure systems.
Meanwhile, Axiado has surveyed these ad hoc approaches to system security and decided there must be a more logical way to add security to systems. The work following that observation has produced the AX3000 and AX2000 trusted control/compute units (TCUs), which the company claims are “the world’s first fully integrated AI-driven hardware security platform solutions designed to help detect cybersecurity and ransomware attacks on next-generation servers and infrastructure elements in cloud data centers, 5G networks, and other disaggregated compute networks.”
Axiado is not alone in sensing that the time has come for a more rigorous approach to platform-level security. The Open Compute Project (OCP) also recognized the dire need for platform-level security and developed the Data Center Secure Control Module (DC-SCM) specification, which standardizes the way designers add security features to data center servers by specifying a common management and security unit that can be deployed across a number of server platforms. The OCP’s DC-SCM subgroup first published a preliminary specification in November 2020. Based on feedback from the community, the subgroup issued a version 1.0 specification in March 2021. The DC-SCM hardware takes the form of a plug-in card (with two form factors: horizontal and vertical) and requires a special slot that implements a unique Data Center-ready Secure Control Interface (DC-SCI). Axiado has designed two DC-SCM cards, in the OCP specification’s horizontal and vertical form factors, around its AX2000 and AX3000 TCUs, but these devices can also be used to secure many other systems.
Essentially, the AX2000 and AX3000 TCUs are SoCs that incorporate a menagerie of processors, standard interfaces, and dedicated security software, all fed by a data lake that’s continuously stocked with detailed information about the ever-widening variety of security exploits. The combination of these elements produces a security processing unit that can monitor a range of system behaviors and parameters and can trigger exception processing when it senses anomalous system behavior.
What might those anomalous behaviors be? At the lowest level, the Axiado TCUs can act as network firewalls. However, these SoCs are not limited to Ethernet connections, as shown in Figure 1 below, so they can also monitor firmware boot transfers over QSPI, PCIe, eSPI, I3C, or other interfaces. They can also monitor communications between the CPU or GPU and peripherals including storage devices over PCIe or Ethernet, and they can monitor system temperatures to sense when the server’s processing elements are being overloaded or when some sort of supply-voltage exploit might be underway.
Figure 1: A variety of interface ports allow the Axiado TCUs to connect with all types of system components to watch for anomalous system behavior by monitoring inter-component communications, operating voltages, and temperatures. Image credit: Axiado
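To make the idea of watching temperatures or voltages for anomalous behavior a little more concrete, here is a minimal Python sketch of one generic technique: a rolling z-score check on a stream of sensor readings. This is not Axiado's algorithm, which the company has not detailed here. The class name, window size, threshold, and noise floor are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags readings that stray far from the recent average (hypothetical sketch)."""

    def __init__(self, window=20, threshold=3.0, min_samples=5, min_sigma=0.5):
        self.samples = deque(maxlen=window)  # recent "normal" readings only
        self.threshold = threshold           # allowed distance, in standard deviations
        self.min_samples = min_samples       # readings needed before judging anything
        self.min_sigma = min_sigma           # noise floor so a flat signal is not over-sensitive

    def observe(self, value):
        """Return True if the reading looks anomalous against recent history."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = max(pstdev(self.samples), self.min_sigma)
            anomalous = abs(value - mu) / sigma > self.threshold
        if not anomalous:
            # Only normal readings update the baseline, so an attack in
            # progress cannot quietly drag the average toward itself.
            self.samples.append(value)
        return anomalous


if __name__ == "__main__":
    detector = RollingAnomalyDetector(window=10)
    for temp_c in [55.0, 55.5, 54.8, 55.2, 55.1, 54.9, 55.3, 80.0, 55.0]:
        if detector.observe(temp_c):
            print(f"anomalous temperature reading: {temp_c} C")
```

A real security monitor would correlate many signals (bus traffic, voltages, temperatures) rather than thresholding one sensor, but the basic pattern of learning a baseline and raising an exception on departures from it is the same.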
The Axiado TCU consists of three major sections, shown in Figure 2. The first section contains multiple application processors, all based on an Arm ISA. These application processors execute the security-monitoring algorithms. The second section, which Axiado calls the “Secure Vault,” implements a hardware root of trust, which is essential for any hardware security system worth its salt, and a Trusted Platform Module (TPM), which is the SoC’s cryptographic processor for hardware crypto acceleration. The Secure Vault design is based on a RISC-V processor core. The third section, present only in the AX3000 TCU, implements secure AI algorithms to detect a variety of security exploits such as ransomware attacks, network attacks, and side-channel attacks. Here, side-channel attacks refer to security exploits that attempt to break into non-CPU server components such as the storage subsystem. In addition, the TCU’s AI section attempts to detect other sorts of anomalous behavior. Data to feed the algorithms running in these AI engines reside in an off-chip data lake, which can be hosted by Axiado in the cloud or locally in on-premises resources.
Figure 2: The Axiado TCU consists of three major sections: the application processors, the Secure Vault, and programmable AI engines. Image credit: Axiado
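For readers who want a feel for what a hardware root of trust anchors, the following Python sketch shows a simplified chain of trust: a trusted digest vouches for the first boot stage, and each stage vouches for the next. It is a toy under stated assumptions, not the Secure Vault's design or any vendor's secure-boot flow. Real implementations typically rely on digital signatures and keys held in silicon rather than bare hashes, and the `Stage` type, `verify_chain` function, and stage names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Stage:
    name: str
    image: bytes                 # firmware or software image for this stage
    next_digest: Optional[str]   # SHA-256 this stage expects of the next stage


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_chain(root_digest: str, stages: Sequence[Stage]) -> Optional[str]:
    """Return None if every stage matches; otherwise the first failing stage's name."""
    expected: Optional[str] = root_digest  # stands in for a value fixed in hardware
    for stage in stages:
        if expected is None or sha256_hex(stage.image) != expected:
            return stage.name
        expected = stage.next_digest       # trust is handed to the next link
    return None


if __name__ == "__main__":
    boot_rom, loader, bios = b"boot rom", b"loader v1", b"bios v1"
    chain = [
        Stage("boot_rom", boot_rom, sha256_hex(loader)),
        Stage("loader", loader, sha256_hex(bios)),
        Stage("bios", bios, None),
    ]
    print(verify_chain(sha256_hex(boot_rom), chain))
```

Because each link is checked against a value supplied by the link before it, tampering with any image breaks the chain at that stage, which is why the first link has to live somewhere an attacker cannot rewrite.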
Connectivity plays a significant role in the Axiado TCUs’ ability to monitor a server system for indications of an attack. Monitoring interfaces include multiple PCIe, Ethernet, I3C and I2C, USB, QSPI, and UART ports. Another difference between the AX3000 and AX2000 TCUs is that the AX3000 TCU supports 10Gb and 1Gb Ethernet while the AX2000 TCU supports only 1Gb Ethernet.
It occurs to me that Axiado’s TCUs and the OCP’s DC-SCM concept somewhat resemble Intel’s original idea for Infrastructure Processing Units: improve system security by isolating major compute, storage, memory, and I/O components. Many of the security tasks seem similar. However, the Axiado TCUs and the DC-SCM are designed to do this job by monitoring various operational aspects of one server, while Intel’s early IPU vision was to have the IPU orchestrate the operation of a server rack or the entire data center. (See “Ooh, Ooh that Smell! Can Intel’s IPUs Clean up the Cloud Data Center Mess?”)
It remains to be seen which concepts will prevail in the forever war of data center security. One thing is certain: the war will continue. Data center and server architects will continue to build higher and higher walls around their castles to deflect attacks, while bad actors will continue to seek, find, and exploit the inevitable chinks in those security walls.