“But it’s all right. I’m Jumpin’ Jack Flash. It’s a gas, gas, gas.” – Keith Richards/Mick Jagger
A new hard disk controller. Ho-hum, yawn. But this one says it improves performance tenfold. Huh? How is that possible? And that it increases price/performance by 90%. This also seemed implausible. Time to dig into these nonsense claims.
Turns out, they’re correct. A new storage controller from a startup called Pliops does, indeed, deliver a big speedup compared to normal plug-in cards connected to hard drives or SSDs (solid-state drives). That’s mostly because Pliops treats flash memory like it’s a whole new storage medium, not just a faster version of a hard drive.
There’s no question that flash-based SSDs are popular in everything from home gaming PCs to humongous cloud servers. They’re fast, they’re small, they run cool, and you can drop them on the floor. Their biggest downside is that they’re expensive compared to Ye Olde Platter of Spinning Rust. But if cost is no object and performance is what you’re after, SSDs are the way to go.
Hard drive makers have even conveniently packaged their SSDs to look and act just like little hard drives, which makes the conversion that much easier. Just unplug your mechanical drive, replace it with the SSD, reboot, and you’re in business. Instant gratification. But that simplicity comes with a cost, says Pliops. Flash memory doesn’t really behave like a sector on a hard drive platter, so hardware interfaces and software drivers have to jump through hoops to hide those differences and make flash look like a disk. Compatibility is convenient, but you’ve left performance on the table.
What your flash storage really needs is a whole new paradigm for reading, writing, and allocating storage. Trouble is, that would mean rewriting every operating system, database, and storage driver on the planet, and nobody wants to do that. Oh, sure, we’ve got tweaked filesystem drivers that accommodate flash storage. But what Pliops is promoting is not just an updated flash-compatible driver, but a complete rethink. And new hardware.
The company calls its PCIe card a “storage processor” for flash memory. It looks and feels like any PCIe controller card but instead of just interface components and buffers, it’s a whole FPGA-based programmable engine. It has its own APIs and a considerable amount of custom logic. The idea is to offload storage tasks from the host processor (whether PC or cloud datacenter) and handle things its own way.
Part of the problem is that flash memories are always partitioned into fixed-size blocks, but these block sizes have no correlation with the way data is structured or accessed in most applications. You end up writing out much bigger blocks than necessary, a wasteful effect called “space amplification.” This also leads to inefficient read or write transactions, where the database application or filesystem driver requests much more data than necessary, or when it performs elaborate read-erase-modify-write cycles to store a small amount of data buried within a larger block. Finally, erase/write cycles are slow, and they block reading from the same device.
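To put rough numbers on that waste, here is a back-of-the-envelope sketch in Python. It assumes a toy model in which every update rewrites each fixed-size block it touches, starting at a block boundary, and it uses a 4 KB block purely for illustration. The block size and the function name are assumptions, not figures or APIs from Pliops.

```python
def amplification_factor(record_bytes: int, block_bytes: int) -> float:
    """Bytes physically written divided by bytes the application wanted to write.

    Toy model (an assumption, not a description of any real SSD): the record
    starts on a block boundary and every block it touches is rewritten whole.
    """
    if record_bytes <= 0 or block_bytes <= 0:
        raise ValueError("sizes must be positive")
    blocks_touched = -(-record_bytes // block_bytes)  # ceiling division
    return blocks_touched * block_bytes / record_bytes


# Hypothetical 4 KB block; real devices and filesystems vary.
BLOCK = 4096

print(amplification_factor(100, BLOCK))   # a 100-byte record costs a full block
print(amplification_factor(4096, BLOCK))  # a block-sized record wastes nothing
print(amplification_factor(5000, BLOCK))  # spilling into a second block
```

In this model, a 100-byte update costs roughly 41 times its own size in bytes written, while a record that exactly fills a block costs nothing extra. Real devices add erase cycles and garbage collection on top of that, which the sketch ignores.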
Pliops’s card takes over compression, packing, sorting, merging, indexing, garbage collection, and the low-level functions of reading and writing to SSDs. Its activities are (mostly) transparent to the operating system and to database programs running on the host. The card also has a higher-performance option that requires a bit more software cooperation.
Popular databases and storage systems like MySQL, MongoDB, or Ceph are all built upon an underlying layer of “storage engine” code such as InnoDB, RocksDB, WiredTiger, or LevelDB. It’s this lower-level code that talks to the storage medium (whether disks or SSDs or something else), and each one has its own ideas about how to trade off storage efficiency, speed, adaptability, memory usage, CPU workload, and so forth. A lot of academic research and open-source code has been dedicated to reevaluating and rebalancing those tradeoffs to get different performance profiles. Everyone has their favorite.
Pliops replaces that low-level code with its own hardware. The visible database (e.g., MySQL) uses the same APIs as before, but the InnoDB layer below it is replaced by Pliops’s own data structures, buffers, indexing, and other low-level functions. The database is none the wiser, and the applications above it are completely unaware, but the whole system runs faster and uses flash storage more efficiently, according to Pliops. The board also has a key-value (KV) API for storage engines like RocksDB and LevelDB that work in that mode.
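The general idea of swapping what sits beneath a database without the upper layers noticing can be sketched in a few lines of Python. Everything here is hypothetical: `StorageEngine`, `DictEngine`, `LogEngine`, and `UserTable` are made-up names for illustration, and none of it reflects Pliops’s actual API or the internals of InnoDB or RocksDB. It simply shows one key-value contract with two interchangeable implementations behind it.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple


class StorageEngine(ABC):
    """Hypothetical key-value contract between a database and its storage layer."""

    @abstractmethod
    def put(self, key: bytes, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]: ...

    @abstractmethod
    def delete(self, key: bytes) -> None: ...


class DictEngine(StorageEngine):
    """Stand-in engine that updates values in place in a dict."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)


class LogEngine(StorageEngine):
    """Stand-in engine with different internals: an append-only log plus an index."""

    def __init__(self) -> None:
        self._log: List[Tuple[bytes, Optional[bytes]]] = []
        self._index: Dict[bytes, int] = {}  # key -> position of latest entry

    def put(self, key: bytes, value: bytes) -> None:
        self._index[key] = len(self._log)
        self._log.append((key, value))

    def get(self, key: bytes) -> Optional[bytes]:
        pos = self._index.get(key)
        return None if pos is None else self._log[pos][1]

    def delete(self, key: bytes) -> None:
        self._log.append((key, None))  # tombstone; old entries are never rewritten
        self._index.pop(key, None)


class UserTable:
    """The 'visible database': same calls no matter which engine is underneath."""

    def __init__(self, engine: StorageEngine) -> None:
        self._engine = engine

    def save(self, user_id: int, name: str) -> None:
        self._engine.put(str(user_id).encode(), name.encode())

    def load(self, user_id: int) -> Optional[str]:
        raw = self._engine.get(str(user_id).encode())
        return None if raw is None else raw.decode()

    def remove(self, user_id: int) -> None:
        self._engine.delete(str(user_id).encode())
```

Code that calls `UserTable` behaves identically with either engine, which is the property that lets a different implementation, in software or in hardware, slot in underneath.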
There’s also an “analytics” API that is unique to Pliops. The idea here is that developers can ask the interface card for insight into how their database is performing, rather than tasking the host CPU with requesting the low-level data.
In the beginning, there was core memory. That got replaced by magnetic tape, then by big 14-inch platters spinning inside washing machines (or so it seemed), before those shrank to 8 inches, 5.25 inches, 3.5 inches, and even smaller. Now we’ve got flash memories masquerading as hard disks, mostly because we’ve developed a decades-long attachment to the interface. Backward compatibility is a wonderful thing, but it can also hold us back. Pliops says it’s time to stop pretending our SSDs are just fast hard disks and take advantage of the benefits they offer. They’ll even sell you a card to make it easy.