Spinning Rust Gets an Upgrade

“… we used cassette tape [in school] because we didn’t have floppy disk drives.” – Parker Harris, co-founder of Salesforce.com

When you say, “I have Linux on my hard drive,” it means you have the operating system image stored on disk. Soon, though, it could mean it’s running on your hard drive. The drive itself will be running Linux. Yikes.

That’s the vision promulgated by microprocessor vendor ARM with its new Cortex-R82 design. It’s designed to smart-ify hard drives and SSDs (solid-state drives), to the point where they’re standalone computers in their own right. It pushes the concept of computational storage to its logical conclusion, where hard disks do their own data analysis locally, rather than just serve up bitstreams for some other computer to manage. The idea has been around for a while, but actual deployment has been limited mostly to academic and experimental installations.

The concept also leads to some weird technical, if not philosophical, thought experiments. Where’s the dividing line between disk drive and server? What happens when your storage is smarter than the computer it’s connected to? Where do applications live and where do they run? Is a super-smart hard drive more or less secure than the comparatively dumb ones we have today? And, how long before a “server” is nothing more than a dongle at the end of an Ethernet cable?

But first, the technical specs. There aren’t many, because ARM no longer shares the details of its new CPU microarchitectures unless you’re a potential licensee ready to sign a big check. We do know that it’s the new range topper in the Cortex-R series, which is ARM’s midrange product family designed for deeply embedded real-time applications. As such, it’s not quite as fast as the company’s fastest A-series parts. Neil Werdmuller, ARM’s Director of Storage Solutions and the R82’s symbolic father, says it’s about as fast as a Cortex-A55, if that’s any indication. All R-series parts are designed for determinism and predictable real-time response, and the R82 is no exception. There’s nothing to prevent you from designing an iPad or an Android phone around an R-series CPU, but you probably wouldn’t want to.

What’s unique about the Cortex-R82 is that it is ARM’s first and only 64-bit Cortex-R design. Up until now, almost all* ARM processors have been either 32-bit processors (mostly the early stuff) or mixed 32/64-bit processors. The R82 is 64-bit only. It doesn’t run 32-bit code. This is the first time ARM has done that in the R-series. (*Cortex-A34 and -A65 are also 64-bit only.)

The reasoning, says Werdmuller, is to save silicon space. Backward compatibility with 32-bit ARM code requires transistors, and transistors take silicon, and silicon costs money. (More accurately, the area it occupies costs money; the silicon itself is virtually free.) Since the R82 is intended for disk controllers, not consumer goods with a big third-party software base, there’s no need to support legacy code. Disk manufacturers who want to reuse their existing code can simply recompile, he says. Just make sure to set the -64 switch.

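Ironically, the R82 itself may reverse that situation. Once drives can run Linux and host full applications, a third-party software base for disk controllers no longer seems far-fetched.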

Another feature that makes the R82 unique among Cortex-R processors is its full-fledged MMU (memory-management unit). This is in addition to the MPU (memory-protection unit) found on other R-series parts and is almost identical to the MMU found on the upmarket Cortex-A parts. That’s key to the R82’s aspirations, because the MMU is what allows it to run Linux and other big operating systems. In fact, Werdmuller says their R82 prototypes in the lab booted existing Linux ports with no modifications, demonstrating how similar the R82 is to its A-series brethren. ARM likes it that way. The company doesn’t want to back-port and upstream Linux for the R-series going forward. It’s much simpler if A-series and R-series cores both run the same distribution.

Linux on a disk drive? Why the $@#& would anybody want to do that? Aren’t disks supposed to be low-cost commodities with negligible differentiation? Sure, today they’re sold by the pound, but, going forward, hard disks and SSDs will be smarter and more autonomous. The whole “computational storage” paradigm says that smart disks can handle data-intensive tasks locally, offloading the host computer. Instead of shoveling tons of data back and forth over SATA or PCIe or Ethernet, why not massage the data where it lives? And that kind of rich, capable local analysis requires a rich and capable operating system running complex applications. For which read: Linux.
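
To put a rough shape on “massage the data where it lives,” here is a toy sketch in Python. Nothing in it comes from ARM: the 512-byte record size, the function names, and the filter are all assumptions made up for illustration. It simply counts how many bytes cross the wire when the host does the filtering versus when the drive does.

```python
from typing import Callable, List, Tuple

# Hypothetical toy model: a "drive" is just a list of fixed-size records.
RECORD_SIZE = 512  # bytes per record; an illustrative figure, not from ARM


def make_records(count: int) -> List[bytes]:
    """Build `count` records whose first byte cycles through 0..255."""
    return [bytes([i % 256]) + bytes(RECORD_SIZE - 1) for i in range(count)]


def host_side_filter(
    records: List[bytes], wanted: Callable[[bytes], bool]
) -> Tuple[List[bytes], int]:
    """Conventional drive: ship every record to the host, then filter there."""
    bytes_moved = sum(len(r) for r in records)
    matches = [r for r in records if wanted(r)]
    return matches, bytes_moved


def drive_side_filter(
    records: List[bytes], wanted: Callable[[bytes], bool]
) -> Tuple[List[bytes], int]:
    """Computational storage: filter on the drive, ship only the matches."""
    matches = [r for r in records if wanted(r)]
    bytes_moved = sum(len(r) for r in matches)
    return matches, bytes_moved


def is_interesting(record: bytes) -> bool:
    """Hypothetical query: keep records tagged with the value 7."""
    return record[0] == 7


records = make_records(1024)
host_matches, host_bytes = host_side_filter(records, is_interesting)
drive_matches, drive_bytes = drive_side_filter(records, is_interesting)
print(host_bytes, drive_bytes)  # 524288 2048
```

Both paths return the same four records. The only difference is where the work happens and how much data has to travel to get there.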

In a typical four-core implementation, ARM imagines one or two R82 cores running a real-time operating system to manage the disk drive itself, while the other two or three cores run Linux. The real-time part would work essentially as it does today. Linux (or some other operating system of your choice) would run on the remaining cores, isolated from the real-time activities next door. The real-time code can, and will, interrupt the non-real-time Linux code whenever it needs to, so determinism isn’t compromised. “Real-time Linux” may be an oxymoron, but it makes sense when they’re really two different operating systems running on different hardware.

Crucially, the choice of memory management – MPU versus MMU – is assigned on a per-core basis. The simpler MPU preserves real-time determinism because it doesn’t do table walks, page faults, etc. You’d enable the MPU on cores running an RTOS, and the MMU for cores running Linux.
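
A minimal sketch of that per-core split, again in Python and again hypothetical: `MemUnit`, `CoreConfig`, and `plan_cores` are invented names that model the idea, not any ARM register layout or boot API. The real assignment would happen in firmware at bring-up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class MemUnit(Enum):
    MPU = "mpu"  # protection only: no table walks, no page faults
    MMU = "mmu"  # full virtual memory, which Linux expects


@dataclass(frozen=True)
class CoreConfig:
    core_id: int
    os_name: str
    mem_unit: MemUnit


def plan_cores(total_cores: int, rtos_cores: int) -> List[CoreConfig]:
    """Give the first `rtos_cores` cores to the RTOS (MPU), the rest to Linux (MMU)."""
    if not 0 < rtos_cores < total_cores:
        raise ValueError("need at least one RTOS core and one Linux core")
    plan = []
    for core_id in range(total_cores):
        if core_id < rtos_cores:
            plan.append(CoreConfig(core_id, "rtos", MemUnit.MPU))
        else:
            plan.append(CoreConfig(core_id, "linux", MemUnit.MMU))
    return plan


# The four-core case described above, with one core reserved for the RTOS.
for core in plan_cores(total_cores=4, rtos_cores=1):
    print(core.core_id, core.os_name, core.mem_unit.value)
```

Calling `plan_cores(4, 2)` instead gives the two-and-two arrangement; the point is only that the choice is made core by core.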

A full operating system like Linux would allow drives to run complex applications that can mount their own filesystem and analyze files. Normally, hard drives deal with logical blocks and (in the case of spinning platters) cylinders, heads, and sectors. They don’t understand the files they’re storing, and they don’t have the capacity, as it were, to analyze their contents. Computational storage aims to change that. An entire database engine could run directly on the hard drive controller instead of the computer it’s attached to.
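
For a sense of how little a conventional drive knows, this is the classic conversion between cylinder/head/sector addresses and logical block numbers, sketched in Python. The formula is the standard one; the geometry of 16 heads and 63 sectors per track is just an illustrative assumption. Note that nothing here has any notion of a file.

```python
from typing import Tuple

HEADS_PER_CYLINDER = 16  # illustrative legacy geometry, assumed for the example
SECTORS_PER_TRACK = 63   # sectors are numbered from 1, not 0


def chs_to_lba(cylinder: int, head: int, sector: int) -> int:
    """Map a cylinder/head/sector triple to a logical block address."""
    if not 0 <= head < HEADS_PER_CYLINDER:
        raise ValueError("head out of range")
    if not 1 <= sector <= SECTORS_PER_TRACK:
        raise ValueError("sector out of range")
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def lba_to_chs(lba: int) -> Tuple[int, int, int]:
    """Inverse mapping: logical block address back to cylinder/head/sector."""
    cylinder, rest = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector_index = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector_index + 1


print(chs_to_lba(2, 3, 4))  # 2208
print(lba_to_chs(2208))     # (2, 3, 4)
```

Everything above this layer (filenames, directories, database tables) lives in the host’s filesystem today. Computational storage means teaching the drive that upper layer, too.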

It also opens up the possibility of leveraging downtime. A single disk or a NAS might be dormant on evenings and weekends. The drives spin (or the SSDs sit quietly) but nobody’s asking for their data. Smart drives could make use of that downtime by compressing, scrubbing, reorganizing, or analyzing their contents, all without transferring any data off-device or imposing upon a host computer. Sort of like SETI@home for storage.
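
Scrubbing is the easiest of those chores to picture. The sketch below is a hypothetical Python model, not anyone’s firmware: it keeps a CRC-32 alongside each block (using the standard `zlib.crc32`) and, when idle, rereads every block to find the ones that no longer match. The `Store` layout and function names are assumptions for illustration.

```python
import zlib
from typing import Dict, List, Tuple

# Block number -> (payload, CRC-32 recorded when the block was written).
Store = Dict[int, Tuple[bytes, int]]


def write_block(store: Store, lba: int, payload: bytes) -> None:
    """Store a payload together with its checksum."""
    store[lba] = (payload, zlib.crc32(payload))


def scrub(store: Store) -> List[int]:
    """Return the block numbers whose payload no longer matches its checksum."""
    return sorted(
        lba for lba, (payload, crc) in store.items() if zlib.crc32(payload) != crc
    )


store: Store = {}
write_block(store, 0, b"boot")
write_block(store, 1, b"data")
write_block(store, 2, b"logs")

# Simulate bit rot: flip one bit in block 1 without updating its checksum.
payload, crc = store[1]
store[1] = (bytes([payload[0] ^ 0x01]) + payload[1:], crc)

print(scrub(store))  # [1]
```

A real drive would go on to repair the block from redundancy or flag it to the host, but the detection pass never needs to leave the device.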

Or the device could operate in “disk mode” during the day by dedicating most of its resources to real-time drive management, but then switch to “compute mode” at night. The RTOS/Linux tradeoff can change over time, just as with any multicore processor that dynamically loads and unloads tasks.

The concept starts to blur the line between computer, server, and storage. If an SSD has a controller chip that can run either Linux or an RTOS, or both simultaneously, and can host complex applications, and comes with an Ethernet connection… is it a remote storage drive, or a computer, or just a very small NAS? The necessary collection of chips – flash, RAM, Ethernet, and controller – could fit in a plastic dongle with an RJ45 jack and not much else. It’d be a bump on a cable. Change the interface from TCP/IP to NVMe and it looks like a local drive. Or its motherboard.

Of course, all these usage scenarios depend on disk makers and system integrators jumping onboard. There’s more to it than just the processor, and Cortex-R82 isn’t enabling anything that hardware makers couldn’t already do with another processor.

One disk maker that’s never likely to use Cortex-R82 is Western Digital. That company publicly committed to RISC-V for all future products and hinted that it’s also looking at the whole computational storage idea. Linux runs on RISC-V, too, so ARM has no particular advantage there. In fact, ARM is playing catch-up. The “early access release” of Cortex-R82 isn’t until March of next year, and first customer silicon isn’t expected until late 2021, or about a year from now, according to Werdmuller. Figure another year after that for hard disks and SSDs to reach the sales channel. That makes it roughly three years from WD’s adoption ceremony to ARM’s public response, and another two years until we see R82-based products. Not exactly leading the charge.

Which way would you go: Cortex-R82, RISC-V, or something else? RISC-V has the obvious advantage of being free to use. Disks have always been exquisitely price sensitive, so saving a few pennies (maybe a lot of pennies) in royalties and amortized license fees for each disk and SSD can really add up. RISC-V also permits customization, while ARM does not, so, if you’ve got some secret CPU enhancement up your sleeve, it won’t be with an ARM.

ARM has the advantage of software compatibility and tool support. It is the world’s best-supported 32/64-bit CPU family, so nobody will question your sanity for choosing it. On the other hand, software compatibility may be moot when it comes to disk firmware. ARM itself chose to jettison compatibility with its own 32-bit binaries on the theory that nobody sells shrink-wrapped software for SSDs. If you’re writing all your own code anyway, what difference does the ISA make?

Security, reliability, and accessibility would all seem to be a wash either way. Both CPU families are available as either soft cores or hard layouts, your choice. Both are time-tested, proven architectures with no lurking bugs or horrible shortcomings. Security holes, if there are any, tend to be specific to implementation details, not congenital flaws in the architecture or ISA.

Cortex-R82 is a big step forward for disk controllers, but also a big reversal. There was a time when disk controllers were optimized for ultra-low cost because volumes were high and profit margins were skinny. Saving a penny on silicon was a big deal. ARM even created its Thumb (and later, Thumb-2) code-compression schemes largely so that disk drive makers could save a few bits on code storage. Sure, it complicated the processor pipeline and the compiler tools, but the end result was less memory and lower cost. Now, Cortex-R82 goes in the opposite direction. It’s a 64-bit multicore beast with no concession to code space and limited backward compatibility. It has multiple CPU cores, a choice of memory managers, big caches, and an optional Neon accelerator for neural nets, dot products, and SIMD floating-point math. And it’s for disk drives.

Maybe the next generation of SSDs will have an HDMI port and Wi-Fi, too. Who needs a computer?