Storage systems may store information that may be used by an entity, such as an application or a user. Examples of such storage systems include computer hard disk drives (HDDs), solid state drives (SSDs), flash memory, random access memory (RAM), read only memory (ROM), magnetic storage, and other types of non-transitory media capable of storing information.
The entity may send input/output requests, such as requests to read data from, or write data to, the storage system. The storage system may receive a request and act upon it by, for example, reading the data associated with the request and returning that data to the requesting entity, or writing data to the storage system as requested by the entity.
However, not all requests have the same priority. For example, some entities may request high-priority data that they must access within a certain period of time (e.g., 10 milliseconds). Such entities can tolerate only a small amount of latency before the storage system returns the requested data. Other entities may be willing to tolerate a certain amount of waiting before the data is received. For example, an entity may submit a low-priority bulk request for a large amount of data, which can be serviced in any (reasonable) amount of time.
Requests for data that should be completed within a certain (usually relatively short) period of time are called “latency-sensitive requests.” Other requests for data which are not associated with a time limit (or which are associated with a relatively longer time limit) are called “non-latency-sensitive requests.”
Latency-sensitive requests and non-latency-sensitive requests may be received by the storage system in any order, at any time. Problematically, in most storage systems a request for data cannot be preempted once the storage system has begun to service the request and retrieve the data. Therefore, it can be difficult to schedule data requests. For example, if the storage system begins to service a non-latency-sensitive bulk request that takes 500 milliseconds to complete, and 10 milliseconds later receives a latency-sensitive request that must be completed within 15 milliseconds, the bulk request still has 490 milliseconds of service remaining. Because the bulk request cannot be interrupted, it will not be possible to complete the latency-sensitive request in time.
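The timing in the example above can be checked with a short sketch. It is only an illustration of non-preemptive service, not a description of any particular storage system. The function names are hypothetical, and the 1 millisecond service time for the latency-sensitive request is an assumption, since the example does not state one.

```python
def earliest_completion_ms(busy_until_ms, arrival_ms, service_ms):
    """Earliest finish time of a request on a device that cannot preempt.

    The request can start only once it has arrived AND the device has
    finished whatever it was already servicing.
    """
    start_ms = max(busy_until_ms, arrival_ms)
    return start_ms + service_ms


def meets_deadline(busy_until_ms, arrival_ms, service_ms, deadline_ms):
    """True if the request finishes within deadline_ms of its arrival."""
    finish_ms = earliest_completion_ms(busy_until_ms, arrival_ms, service_ms)
    return finish_ms <= arrival_ms + deadline_ms


# Bulk request starts at t = 0 and occupies the device for 500 ms.
BULK_BUSY_UNTIL_MS = 0 + 500

# Latency-sensitive request arrives at t = 10 ms with a 15 ms limit.
ARRIVAL_MS = 10
DEADLINE_MS = 15
SERVICE_MS = 1  # assumed value; not given in the example

# Behind the bulk request, the deadline (t = 25 ms) cannot be met.
blocked = meets_deadline(BULK_BUSY_UNTIL_MS, ARRIVAL_MS, SERVICE_MS, DEADLINE_MS)

# On an idle device, the same request would finish in time.
idle = meets_deadline(0, ARRIVAL_MS, SERVICE_MS, DEADLINE_MS)

print(blocked, idle)
```

Under these assumptions the latency-sensitive request cannot start until the bulk request finishes at 500 milliseconds, long after its deadline at 25 milliseconds, whereas the same request on an idle device would complete in time.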