Enmotus Blog

A New Age Storage Stack

Posted by Jim O'Reilly on Jun 4, 2018 3:42:54 PM

For over three decades, we’ve lived with a boring truth. Disk drive performance was stuck in a rut, only doubling over all that time. One consequence was that storage architecture became frozen, with little real innovation. RAID added a boost, but at a high price. In fact, we didn’t get a break until SSDs arrived on the scene.

SSDs really upset the applecart. Per-drive performance increased 1000X in just a few years, and all bets were off at that point. Little did we realize that the potential of SSDs reached stratospheric levels of millions of IOPS per drive.

All of this performance broke the standard SCSI model of the storage stack in the operating system. An interrupt-driven, verbose stack with up to seven levels of address translation simply can’t deliver the I/O rate needed. The answer is the NVMe stack, which consolidates I/Os and interrupts efficiently and uses the power of RDMA to reduce round-trip counts and overhead dramatically. Rates in excess of 20M IOPS have been demonstrated, and there is still room to speed up the protocol.


Topics: NVMe, autotiering, hyperconverged, NVMe over Fibre, enmotus, data analytics, NVDIMM

A.I. For Storage

Posted by Jim O'Reilly on Dec 18, 2017 2:12:46 PM

As we saw in the previous part of this two-part series, “Storage for A.I.”, the performance demands of A.I. will combine with technical advances in non-volatile memory to dramatically increase performance and scale within the storage pool. They will also move addressing of data to a much finer granularity: the byte level rather than the 4KB block. This all creates a manageability challenge that must be resolved if we are to attain the potential of A.I. systems (and next-gen computing in general).

Simply put, storage is getting complex and will become ever more so as we expand the size and use of Big Data. Rapid and agile monetization of data will be the mantra of the next decade. Consequently, the IT industry is starting to look for ways to migrate from today’s essentially manual storage management paradigms to emulate and exceed the automation of control demonstrated in public clouds.


Topics: NVMe, Data Center, NVMe over Fibre, enmotus, data analytics, NVDIMM, artificial intelligence

Optimizing Dataflow in Next-Gen Clusters

Posted by Jim O'Reilly on Sep 6, 2017 10:57:55 AM

We are on the edge of some dramatic changes in computing infrastructure. New packaging methods, ultra-dense SSDs and high core counts will change what a cluster looks like. Can you imagine a 1U box having 60 cores and a raw SSD capacity of 1 petabyte? What about drives using 25GbE interfaces (with RDMA and NVMe over Fabrics), accessed by any server in the cluster?

Consider Intel’s new “ruler” drive, the P4500 (shown below with a concept server). It’s easy to see 32 to 40 TB of capacity per drive, which means that the 32 drives in their concept storage appliance give a petabyte of raw capacity (and over 5PB compressed). It’s a relatively easy step to see those two controllers replaced by ARM-based data movers which reduce system overhead dramatically and boost performance nearer to available drive performance, but the likely next step is to replace the ARM units with merchant class GbE switches and talk directly to the drives.
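The capacity figures above can be checked with a few lines of arithmetic. The sketch below uses only the drive count and per-drive capacities quoted in the text; the 5:1 compression ratio is an assumption inferred from the "over 5PB compressed" figure, not a published specification, and the function names are illustrative.

```python
# Back-of-envelope capacity check for the concept appliance described above.
DRIVES_PER_APPLIANCE = 32          # from the text
TB_PER_DRIVE_RANGE = (32, 40)      # from the text: "32 to 40 TB" per drive
ASSUMED_COMPRESSION_RATIO = 5.0    # assumption, implied by "over 5PB compressed"


def raw_capacity_pb(drives: int, tb_per_drive: float) -> float:
    """Raw capacity in decimal petabytes (1 PB = 1000 TB)."""
    return drives * tb_per_drive / 1000


def effective_capacity_pb(raw_pb: float,
                          ratio: float = ASSUMED_COMPRESSION_RATIO) -> float:
    """Effective capacity after compression at the assumed ratio."""
    return raw_pb * ratio


for tb in TB_PER_DRIVE_RANGE:
    raw = raw_capacity_pb(DRIVES_PER_APPLIANCE, tb)
    print(f"{tb} TB drives: {raw:.2f} PB raw, "
          f"{effective_capacity_pb(raw):.2f} PB compressed")
```

With 32 TB drives the appliance lands just over a petabyte raw, which matches the figure quoted above.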

I can imagine a few of these units at the top of each rack with a bunch of 25/50 GbE links to physically compact, but powerful, servers (2 or 4 per rack U) which use NVDIMM as close-in persistent memory.

The clear benefit is that admins can react to the changing needs of the cluster for performance and bulk storage independently of the compute horsepower deployed. This is very important as storage moves from low-capacity structured data to huge-capacity, unstructured big data.


Topics: All Flash Array, Intel Optane, Data Center, NVMe over Fibre, data analytics

Content driven tiering using storage analytics

Posted by Adam Zagorski on Aug 9, 2017 10:05:00 AM

IT has used auto-tiering for years as a way to move data from expensive fast storage to cheaper and slower secondary bulk storage. The approach was at best a crude approximation, being only able to distinguish between objects on the basis of age or lack of use. This meant, for instance, that documents and files stayed much longer in expensive storage than was warranted. There simply was no mechanism for sending such files automatically to cheap storage.

Now, to make life even more complicated, we’ve added a new tier of storage at each end of the food chain. At the fast end, we now have ultra-fast NVDIMM offering an even more expensive and, more importantly, space-limited way to boost access speed, while at the other end of the spectrum the cloud is reducing the need for in-house long-term storage even more. Simple auto-tiering doesn’t do enough to optimize the spectrum of storage in a four-tier system like this. We need to get much savvier about where we keep things.

The successor to auto-tiering has to take into account traffic patterns for objects and plan their lifecycle accordingly. For example, a Word document may be stored as a fully editable file in today’s solutions, but the reality is that most of these documents, once fully edited, become read-only objects that are moved in their entirety whenever they are read. If changes occur, a new, renamed version of the document is created and the old one is kept intact.
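To make the idea concrete, here is a minimal sketch of a placement rule driven by traffic patterns rather than age alone. It is not the algorithm of any particular product: the tier names follow the four tiers discussed above, while `ObjectStats`, `choose_tier` and every threshold are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    NVDIMM = "NVDIMM"
    FAST = "fast primary storage"
    BULK = "secondary bulk storage"
    CLOUD = "cloud"


@dataclass
class ObjectStats:
    writes_per_day: float
    reads_per_day: float
    days_since_last_write: int


# Hypothetical thresholds, for illustration only.
HOT_WRITES_PER_DAY = 100.0   # heavy write traffic earns the scarce NVDIMM tier
SETTLED_AFTER_DAYS = 30      # no writes for this long: treat as read-only
COLD_READS_PER_DAY = 0.1     # below this read rate the object is cold


def choose_tier(stats: ObjectStats) -> Tier:
    """Pick a tier from observed traffic instead of age alone."""
    if stats.writes_per_day >= HOT_WRITES_PER_DAY:
        return Tier.NVDIMM
    if stats.days_since_last_write < SETTLED_AFTER_DAYS:
        return Tier.FAST     # still being edited
    if stats.reads_per_day >= COLD_READS_PER_DAY:
        return Tier.BULK     # settled, read in its entirety now and then
    return Tier.CLOUD        # settled and essentially never read


# A finished document that is read a couple of times a day.
finished_doc = ObjectStats(writes_per_day=0, reads_per_day=2,
                           days_since_last_write=90)
print(choose_tier(finished_doc).value)
```

Under these assumed thresholds, a finished document leaves fast storage as soon as it stops being edited, even while it is still being read regularly.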


Topics: autotiering, big data, Data Center, NVMe over Fibre, enmotus, data analytics

How To Prevent Over-Provisioning - Dynamically Match Workloads With Storage Resources

Posted by Adam Zagorski on Jun 25, 2017 10:05:00 AM

The Greek philosopher Heraclitus said, “The only thing that is constant is change.” This adage rings true today in most modern datacenters. The demands on workloads tend to be unpredictable, which creates constant change. At any given point in time, an application can have very few demands placed on it, and at a moment’s notice the workload demands can spike. Satisfying the fluctuations in demand is a serious challenge for datacenters. Solving this challenge will translate to significant cost savings amounting to millions of dollars for data centers.

Traditionally, data centers have thrown more hardware at this problem. Ultimately, they over-provision to make sure they have enough performance to satisfy peak periods of demand. This includes scaling out with more and more servers filled with hard drives, quite often short-stroking the hard drives to minimize latency. While hard drive costs are reasonable, this massive scale-out increases power, cooling and management costs. The figure below shows an example of the disparity between capacity requirements and performance requirements. Achieving capacity goals with HDDs is quite easy, but given that individual high-performance HDDs are only able to achieve about 200 random IOPS, it takes quite a few HDDs to meet the performance goals of modern database applications.
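The gap between capacity-driven and performance-driven drive counts is easy to quantify. The sketch below uses the roughly 200 random IOPS per HDD quoted above; the 100 TB capacity target, the 100,000 IOPS target and the 10 TB drive size are illustrative assumptions, not figures from the post.

```python
import math

HDD_RANDOM_IOPS = 200  # from the text: about 200 random IOPS per HDD


def drives_needed(capacity_tb: float, target_iops: float,
                  tb_per_drive: float,
                  iops_per_drive: float = HDD_RANDOM_IOPS) -> tuple[int, int, int]:
    """Return (drives for capacity, drives for performance, drives to buy)."""
    for_capacity = math.ceil(capacity_tb / tb_per_drive)
    for_performance = math.ceil(target_iops / iops_per_drive)
    return for_capacity, for_performance, max(for_capacity, for_performance)


# Illustrative workload: 100 TB of data, 100,000 random IOPS, 10 TB HDDs.
cap, perf, total = drives_needed(capacity_tb=100, target_iops=100_000,
                                 tb_per_drive=10)
print(f"capacity needs {cap} drives, performance needs {perf}, buy {total}")
```

In this example performance, not capacity, sets the drive count by a factor of 50, which is the over-provisioning described above.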

Today, storage companies are pushing all-flash arrays as the solution to this challenge. This addresses the performance issue as well as power and cooling, but now massive amounts of non-active (cold) data are stored on your most expensive storage media. In addition, not all applications need flash performance. Adding all flash is just another form of over-provisioning with a significantly higher cost penalty.


Topics: NVMe, autotiering, big data, All Flash Array, SSD, Data Center, NVMe over Fibre, data analytics

Storage analytics impact performance and scaling

Posted by Jim O'Reilly on Jun 14, 2017 11:21:10 AM

For the last 3 decades of computer storage use, we’ve operated essentially blindfolded. What we’ve known about performance has been gleaned from artificial benchmarks such as IOMeter and guesstimates of IOPS requirements during operations that depend on a sense of how fast an application is running.

The result is something like steering a car without a speedometer ... it’s a mess of close calls and inefficient operations.

On the whole, though, we muddled through. That’s no longer adequate in the storage New Age. Storage performance is stellar in comparison to those early days, with SSDs changing the level of IOPS per drive by a factor of as much as 1000X. Wait, you say, tons of IOPS…why do we have problems?

The issue is that we share much of our data across clusters of systems, while the IO demand of any given server has jumped up in response to virtualization, containers and the horsepower of the latest CPUs. In fact, that huge jump in data moving around between nodes makes driving blind impossible even for small virtualized clusters, never mind scaled-out clouds.

All of this is happening against a background of application-based resilience. System uptime is no longer measured in how long a server runs. The key measurement is how long an app runs properly. Orchestrated virtual systems recover from server failures quite quickly. The app is restarted on another instance in a different server.


Topics: All Flash Array, Data Center, hyperconverged, NVMe over Fibre, data analytics

Storage Automation In Next Generation Data Centers

Posted by Adam Zagorski on Jan 31, 2017 1:04:37 PM

Automation of device management and performance monitoring analytics are necessary to control costs of web scale data centers, especially as most organizations continually ask for their employees to do more with fewer resources.

Big Data and massive data growth are at the forefront of datacenter growth. Imagine what it takes to manage the datacenters that provide us with this information.

 

According to research conducted by Seagate, time-consuming drive management activities represent the largest storage-related pain points for datacenter managers. In addition to trying to manage potential failures of all of the disk drives, managers must monitor the performance of multiple servers as well. As indicated by Seagate, there are tremendous opportunities for cost savings if the timing of retiring disk drives can be optimized. Significant savings can also result from streamlining the management process.

 

While there is no such thing as a typical datacenter, for the purpose of discussion, we will assume that a typical micro-datacenter contains about 10,000 servers while a large-scale data center contains on the order of 100,000 servers. In a webscale hyperconverged environment, if each server housed 15 devices (hard drives and/or flash drives), a datacenter contains anywhere from 150,000 to 1.5 million devices. That is an enormous number of servers and devices to manage. Even if we scaled back by an order of magnitude or two, to 50 servers and 750 drives for example, managing a data center is a daunting task.
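The device counts above follow directly from the stated assumptions, as this short sketch shows. The server counts and the 15 devices per server come from the text; the function name and labels are illustrative.

```python
DEVICES_PER_SERVER = 15  # from the text: hard drives and/or flash drives


def device_count(servers: int,
                 devices_per_server: int = DEVICES_PER_SERVER) -> int:
    """Total storage devices to manage across all servers."""
    return servers * devices_per_server


for label, servers in [("micro-datacenter", 10_000),
                       ("large-scale data center", 100_000),
                       ("scaled-back example", 50)]:
    print(f"{label}: {servers:,} servers, {device_count(servers):,} devices")
```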

 


Topics: NVMe, big data, All Flash Array, hyperconverged, NVMe over Fibre

Storage Visions 2017

Posted by Jim O'Reilly on Jan 18, 2017 2:22:42 PM

Here it is. A new year opens up in front of us. This one is going to be lively and storage is no exception. In fact, 2017 should see some real fireworks as we break away from old approaches and move on to some new technologies and software.


Topics: NVMe, SSD, Data Center, data analytics, NVMe over Fibre

Hot Trends In Storage

Posted by Adam Zagorski on Dec 13, 2016 2:02:41 PM

Storage continues to be a volatile segment of IT. Hot areas trending in the news this month include NVMe over Fibre Channel, which is being hyped heavily now that the Broadcom acquisition of Brocade is a done deal. Another hot segment is the hyper-converged space, complemented by activity in software-defined storage from several vendors.

Flash is now running ahead of enterprise hard drives in the market, and this demand, combined with foundry changeovers to 3D NAND, is temporarily putting upward pressure on SSD pricing. High-performance storage solutions built on COTS platforms have been announced, too, which will create more pressure to reduce appliance prices.

Let’s cover these topics and more in detail:

  1. NVMe over Fibre Channel is in full hype mode right now. This solution is a major step away from traditional FC insofar as it no longer encapsulates the SCSI block-IO protocol. Instead, it uses a now-standard direct-memory access approach to reduce overhead and speed up performance significantly.

Topics: NVMe, SSD, hyperconverged, NVMe over Fibre

Delivering Data Faster

Accelerating cloud, enterprise and high performance computing

Enmotus FuzeDrive accelerates your hot data when you need it, stores it on cost effective media when you don't, and does it all automatically so you don't have to.

 

  • Visual performance monitoring
  • Graphical management interface
  • Best in class performance/capacity
