Large storage systems such as data centers and cloud storage providers store and retrieve large quantities of data for many different customers and applications. Different applications and customers may have different requirements for how quickly their data is stored or retrieved. These requirements are usually expressed as Service Level Agreements (SLAs) between the customer and the storage provider. An SLA typically defines a minimum Quality of Service (QoS) for a particular customer, application, or individual data stream.
Quality of Service comprises the performance properties of data reads and writes. For isolated read or write traffic, the parameters are latency and throughput. For mixed workloads, QoS is measured in Input/Output Operations per Second (IOPS).
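As an illustration of how these metrics relate, throughput for a fixed transfer size is simply IOPS multiplied by the size of each operation. The sketch below shows that arithmetic; the function names and the choice of decimal megabytes (1 MB = 1,000,000 bytes) are assumptions made for the example, not part of any SLA definition in the text.

```python
def throughput_mb_per_s(iops: float, io_size_bytes: int) -> float:
    """Throughput in MB/s (decimal megabytes) for a fixed I/O size."""
    return iops * io_size_bytes / 1_000_000


def iops_for_throughput(mb_per_s: float, io_size_bytes: int) -> float:
    """IOPS needed to sustain a given throughput at a fixed I/O size."""
    return mb_per_s * 1_000_000 / io_size_bytes


if __name__ == "__main__":
    # 100,000 operations per second at 4 KiB each.
    print(throughput_mb_per_s(100_000, 4096))
    # IOPS required to move 1,000 MB/s in 128 KiB transfers.
    print(iops_for_throughput(1_000, 128 * 1024))
```

The same device can therefore report a high IOPS figure with small transfers and a high throughput figure with large transfers, which is why a storage SLA would normally state the transfer size alongside either number.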
Data networks use SLAs and QoS to provide a known, measured, and guaranteed level of service when handling multiple traffic streams. Data centers and cloud service providers may also benefit by applying SLAs and QoS to additional types of shared resources such as computing power using virtual servers and shared storage using multiple large storage devices and arrays.
Service Level Agreements for data networks typically include delay, jitter, and packet loss metrics on a per connection basis. This translates into a bandwidth and latency allocation per connection through the routers in the path. SLAs for storage devices include bandwidth, latency, and IOPS for write, read, and mixed write/read traffic.
To maintain acceptable service levels for customers who share the infrastructure, the provider must measure performance continuously. QoS provides the metrics for that measurement. By measuring QoS, service providers can ensure that customers receive adequate service and determine if and when their infrastructure may need to be upgraded to maintain desired service levels.
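A minimal sketch of such a measurement follows: it compares the QoS observed over one measurement window against a storage SLA that specifies IOPS, bandwidth, and latency. The class `StorageSla`, its field names, the use of a 99th-percentile latency bound, and the nearest-rank percentile method are all assumptions chosen for illustration; an actual SLA may define its terms differently.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class StorageSla:
    """Hypothetical SLA terms; names and units are illustrative only."""
    min_iops: float
    min_bandwidth_mb_per_s: float
    max_latency_ms: float             # bound on the chosen latency percentile
    latency_percentile: float = 99.0  # assumed: bound applies to the tail


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample set."""
    if not samples:
        raise ValueError("no samples in measurement window")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_sla(sla: StorageSla, latencies_ms: Sequence[float],
              bytes_moved: int, window_s: float) -> Dict[str, bool]:
    """Return, per metric, whether the measured window met the SLA."""
    iops = len(latencies_ms) / window_s
    bandwidth = bytes_moved / window_s / 1_000_000
    tail_latency = percentile(latencies_ms, sla.latency_percentile)
    return {
        "iops": iops >= sla.min_iops,
        "bandwidth": bandwidth >= sla.min_bandwidth_mb_per_s,
        "latency": tail_latency <= sla.max_latency_ms,
    }


if __name__ == "__main__":
    sla = StorageSla(min_iops=100, min_bandwidth_mb_per_s=1.0,
                     max_latency_ms=5.0)
    # One second of traffic: 100 operations of 4 KiB, one slow outlier.
    samples = [1.0] * 99 + [50.0]
    print(check_sla(sla, samples, bytes_moved=100 * 4096, window_s=1.0))
```

Reporting each metric separately, rather than a single pass or fail flag, lets a provider see which resource is approaching its limit and therefore what kind of upgrade may be needed.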
Newer data center and cloud service infrastructures rely increasingly on virtualized services. These data centers have large server farms in which each server runs multiple Virtual Machines (VMs), and each VM has data storage allocated to it. The storage can take many forms, from local (high-speed RAM, local-bus non-volatile storage such as a PCIe SSD, or a local HDD) to remote storage shared over a LAN or SAN (All Flash Array, hybrid flash and HDD array, or HDD array).
In this environment, an application or user is allocated network bandwidth, processor bandwidth, and storage size. Until the recent advent of higher-speed non-volatile memory such as flash, storage bandwidth was not considered important because of the large mismatch in performance between the CPU and the HDD. High-speed applications simply required more local cache.
Having an infrastructure that is capable of provisioning resources at different SLAs that are continuously measured against their QoS parameters is becoming very important in newer data center and cloud services architectures that support multi-tenant shared storage.
With the large variation in storage performance and cost that now exists, it is desirable to assign an SLA to the storage portion of the services.
All Flash Array (AFA) systems provide QoS and SLA features. These are normally provisioned at the array ingress across host IDs. Within a host ID, they can be further provisioned via namespaces and Logical Block Addressing (LBA) ranges associated with specific VMs, containers, and applications.
AFAs are able to segregate user traffic and manage it to meet SLA goals as long as bandwidth and latency are well under-subscribed. AFAs cannot deterministically meet SLA goals in nearly full or over-subscribed situations because they are built from Solid State Drives (SSDs).
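The provisioning hierarchy described above (host ID, then namespace, then LBA range) can be pictured as a nested lookup in which the most specific matching rule wins. The following is only a sketch under that assumption: the class names, the fallback to a default at each level, and the SLA class labels are hypothetical and do not represent the interface of any particular AFA.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LbaRangePolicy:
    first_lba: int
    last_lba: int    # inclusive upper bound
    sla_class: str   # hypothetical label, e.g. "gold"


@dataclass
class NamespacePolicy:
    default_class: str
    ranges: List[LbaRangePolicy] = field(default_factory=list)


@dataclass
class HostPolicy:
    default_class: str
    namespaces: Dict[int, NamespacePolicy] = field(default_factory=dict)


def resolve_sla_class(hosts: Dict[str, HostPolicy], host_id: str,
                      namespace_id: int, lba: int) -> Optional[str]:
    """Pick the most specific SLA class that applies to one I/O request."""
    host = hosts.get(host_id)
    if host is None:
        return None                      # unknown host: nothing provisioned
    namespace = host.namespaces.get(namespace_id)
    if namespace is None:
        return host.default_class        # fall back to the host-level class
    for rule in namespace.ranges:
        if rule.first_lba <= lba <= rule.last_lba:
            return rule.sla_class        # LBA range assigned to a VM or app
    return namespace.default_class


if __name__ == "__main__":
    hosts = {
        "host-a": HostPolicy(
            default_class="bronze",
            namespaces={
                1: NamespacePolicy(
                    default_class="silver",
                    ranges=[LbaRangePolicy(0, 999_999, "gold")],
                ),
            },
        ),
    }
    print(resolve_sla_class(hosts, "host-a", 1, 500))        # gold
    print(resolve_sla_class(hosts, "host-a", 1, 2_000_000))  # silver
    print(resolve_sla_class(hosts, "host-a", 7, 0))          # bronze
```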
SSDs require background operations such as garbage collection, wear leveling, and scrubbing that make their performance and latencies indeterminate. Attempts have been made to reduce these effects by having the host regulate the amount of background activities allowed in the SSD. However, if a system is reaching capacity and performance limits, the SSD must perform background tasks in order to free up space—which eliminates the determinism at the worst possible time.
There are many other use models that suffer from SSD performance variations. Any high-performance application that depends on consistent SSD storage performance has this issue. Also, any system that has multiple applications, VMs, and/or hosts sharing the same storage has this problem.