Part 2 …The Drive
Over time, the smarts in storage have migrated back and forth between the drive and the host system. Behind this shifting picture are two key factors. First, the capability of a drive's microcontroller determines what the drive can do; second, the need to correct media errors establishes what it must do.
Once SCSI hit the market, the functionality split between host and drive essentially froze, and it stayed that way for nearly three decades. The advent of new error-correction needs for SSDs, combined with the arrival of ARM CPUs that are both cheap and powerful, has made function-shifting interesting once again.
Certainly, some of the new compute power goes to sophisticated multi-tier error correction to compensate for the wear-out of QLC drives or the effects of media variations, but a 4-core or 8-core ARM still has a lot of unused capability. We've struggled to figure out how to use that power for meaningful storage functions, and that has led to a number of early initiatives.
First up to bat was Seagate's Kinetic drive. Making a play for storing "Big Data" in a more native form, Kinetic replaces traditional block access altogether with a key/data store interface. While the Kinetic interface is an open standard and free to emulate, no other vendor has yet jumped on the bandwagon and Seagate's sales are small.
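Conceptually, a key/data drive swaps LBA-addressed reads and writes for object operations: the host names the data, and the drive decides where it lands on media. Here is a minimal sketch of that interface shape; the class and method names are illustrative only, not Kinetic's actual wire protocol, which runs over TCP with protobuf messages.

```python
# Sketch of a key/value drive interface in the spirit of Kinetic's
# put/get/delete model. Purely illustrative: a dict stands in for the
# drive's on-media index.

class KeyValueDrive:
    def __init__(self):
        self._store = {}  # stand-in for the drive's internal index

    def put(self, key: bytes, value: bytes) -> None:
        # No block numbers: the drive chooses where the bytes land.
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes) -> None:
        del self._store[key]

drive = KeyValueDrive()
drive.put(b"user:42", b'{"name": "Ada"}')
assert drive.get(b"user:42") == b'{"name": "Ada"}'
```

The point of the shape is that placement, indexing and garbage collection all move inside the drive, which is exactly what raises the cross-drive questions discussed below.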
WD’s HGST unit then introduced an Ethernet drive with a built-in processor running Linux, capable of hosting storage apps such as NAS. Another unit, WD Labs, demonstrated a Ceph setup with 500+ drives, each kitted out with an Ethernet front-end and Ceph OSD software.
These are all three-year-old efforts and, frankly, they have quietly disappeared from the market. Why did that happen? For one thing, these were all rifle-shots … one type of drive, not a roadmap for a family over time. Next, they were all hard drives, hardly exciting in a world of blazingly fast SSDs.
The killer, though, is that each approach took functions typically performed across sets of drives and jammed them, many times over, into single drives. Where does data redundancy such as RAID or erasure coding live? What about replication in Ceph? How is data distributed in a key/data store?
The whole debate about where software functions reside led to the evolution of the Software-Defined Storage (SDS) concept. In this scheme, common storage services reside in containers on the servers in the cluster, while the drives themselves stay near bare metal. This means, for example, that Ceph OSDs would be server-side instances, while the drives would handle only media-level error correction and address virtualization, the functions particular to the drive on which they run.
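The drive-side remainder, address virtualization, is essentially a logical-to-physical remapping table. The toy sketch below illustrates the idea, assuming a flash-style device where an overwrite must go to a fresh physical block; the details are illustrative, not any vendor's actual flash translation layer.

```python
# Toy logical-to-physical remapping (address virtualization), the kind
# of function that stays on the drive under the SDS split. Flash can't
# overwrite in place, so each logical write lands on a fresh physical
# block and the old copy is marked stale for later garbage collection.

class AddressMap:
    def __init__(self, physical_blocks: int):
        self.l2p = {}                      # logical block -> physical block
        self.free = list(range(physical_blocks))
        self.stale = set()                 # blocks awaiting garbage collection

    def write(self, lba: int) -> int:
        phys = self.free.pop(0)            # always steer to a fresh block
        if lba in self.l2p:
            self.stale.add(self.l2p[lba])  # old copy becomes garbage
        self.l2p[lba] = phys
        return phys

    def read(self, lba: int) -> int:
        return self.l2p[lba]

m = AddressMap(physical_blocks=8)
m.write(0)                   # first write of LBA 0
m.write(0)                   # overwrite: remapped, old block now stale
assert m.read(0) == 1 and m.stale == {0}
```

Everything above this table, including replication, erasure coding, and placement across drives, moves to the server-side containers.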
Microsoft recently opened up about Project Denali, which implements this SDS architecture. It is still in its gestation phase, so don’t expect product yet; the drive infrastructure, switching and networking all have to evolve in support. Even so, the benefit of the SDS model is that integrating new drives will be rapid, since the interface is simple and standardized. Given the drift of SSD software stacks toward complexity and proprietary features (lock-ins), this is an important win.
The SDS model is a great leap forward, but the pace of core technology evolution in storage is incredible today. The advent this year of byte-addressable persistent storage throws a huge spanner in the works. The concept is revolutionary enough that all apps would need to be rewritten, as would compilers, linkers and OS code. These are major projects and will take several years to mature.
One dilemma is that byte-addressable storage fits the NVDIMM/memory-bus paradigm but is vastly different from today’s SSDs. Conceivably, SSDs could support byte-addressability through RDMA with NVMe over Fabrics while also supporting block IO by setting transfer lengths to 4KB or 512B, giving the best of all worlds.
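One way to picture this dual personality: the same media presents both a byte-granular window and a conventional block view, where a block read is simply a byte read at a fixed offset and transfer length. A rough sketch, with a plain buffer standing in for the media (in the scenario above, the byte path would actually arrive via RDMA over NVMe-oF, not a local call):

```python
# Sketch of one device exposing both byte-addressable and block access
# over a single backing store. The bytearray stands in for persistent
# media; names and structure are illustrative only.

BLOCK_SIZE = 4096

class DualModeDevice:
    def __init__(self, capacity: int):
        self.media = bytearray(capacity)

    # Byte-addressable path: arbitrary offset and length.
    def read_bytes(self, offset: int, length: int) -> bytes:
        return bytes(self.media[offset:offset + length])

    def write_bytes(self, offset: int, data: bytes) -> None:
        self.media[offset:offset + len(data)] = data

    # Block path: the byte path with a fixed 4KB transfer length.
    def read_block(self, lba: int) -> bytes:
        return self.read_bytes(lba * BLOCK_SIZE, BLOCK_SIZE)

    def write_block(self, lba: int, data: bytes) -> None:
        assert len(data) == BLOCK_SIZE
        self.write_bytes(lba * BLOCK_SIZE, data)

dev = DualModeDevice(capacity=4 * BLOCK_SIZE)
dev.write_bytes(10, b"persistent")          # byte-granular update
assert dev.read_block(0)[10:20] == b"persistent"
```

The block mode is just a constrained case of the byte mode, which is why one device could plausibly serve both worlds.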
A challenge for both NVDIMM and SSD modes is the efficient packing of data, especially with data compression, which will move us well beyond current file-system structures. Database-like file systems are an appealing alternative here. Other issues include data integrity in byte mode, efficient use of extents to recover empty space, and so on. No small challenge!
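Recovering empty space with extents usually comes down to coalescing adjacent free runs into larger ones so space can be reused in big pieces. A toy sketch of that merge step, purely illustrative (real allocators track free space in trees or bitmaps):

```python
# Toy free-extent coalescing: adjacent or overlapping free runs
# (start, length) are merged so the allocator sees large contiguous
# regions instead of fragments.

def coalesce(extents):
    """Merge a list of (start, length) free extents into maximal runs."""
    merged = []
    for start, length in sorted(extents):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_len = merged[-1]
            end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, end - prev_start)  # extend the run
        else:
            merged.append((start, length))               # gap: new run
    return merged

assert coalesce([(0, 4), (4, 4), (16, 8)]) == [(0, 8), (16, 8)]
```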
On the SDS data-services side, expect accelerators and GPUs to parallelize processing and so speed up erasure coding, encryption, compression and the like. Note that this SDS view works well with next-gen server architectures such as Gen-Z.
We are looking at an incredible period of innovation in storage over the next decade. It should be fun!