For years, data has typically been stored in fixed-size sectors, or disk blocks, on hard disk drives (HDDs) whose spinning platters are divided into a number of concentric tracks. When data is grouped together in contiguous blocks and tracks, it can be accessed relatively quickly. If the data is instead accessed from blocks or tracks at different locations on the disk, access is much slower because of the relatively large head movements required. This can reduce read and write performance significantly (e.g., by a factor of a hundred).
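The gap between contiguous and scattered access can be illustrated with a rough back-of-the-envelope model. The sketch below is only an illustration: the function name `read_time_ms` and every drive parameter (8 ms average seek, 7200 rpm, 150 MiB/s sustained transfer, 4 KiB blocks) are assumed values, not figures taken from any particular drive.

```python
def read_time_ms(num_blocks, block_kib, seeks,
                 seek_ms=8.0, rpm=7200, transfer_mib_s=150.0):
    """Estimate the time to read num_blocks blocks from an HDD.

    All defaults are assumed, illustrative values. Each seek is charged
    an average seek time plus half a platter rotation.
    """
    rotational_ms = 0.5 * 60_000.0 / rpm  # average wait: half a rotation
    transfer_ms = num_blocks * block_kib / 1024.0 / transfer_mib_s * 1000.0
    return seeks * (seek_ms + rotational_ms) + transfer_ms


# 1000 contiguous blocks: one head movement, then streaming transfer.
sequential_ms = read_time_ms(1000, 4, seeks=1)

# 1000 scattered blocks: one head movement per block.
random_ms = read_time_ms(1000, 4, seeks=1000)

slowdown = random_ms / sequential_ms
print(f"sequential: {sequential_ms:.1f} ms, random: {random_ms:.1f} ms, "
      f"slowdown: {slowdown:.0f}x")
```

Under these assumed parameters the scattered read is slower by a factor of a few hundred, in line with the order of magnitude stated above. Real drives vary, and command queuing or on-drive caches can narrow the gap.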
Traditionally, application software ran on physical servers, often with a single application per server. Each server had its own set of disks, so a single application often had a set of disks to itself. More recently, however, many servers have been virtualized, with clusters of physical servers running multiple virtual servers, each of which runs applications that share disk drives or arrays of disk drives.
When multiple applications access multiple files, those files are likely to be scattered across the disk drives. In a virtualized environment, application performance therefore suffers, because the resulting head movements degrade disk performance.
While disk capacity continues to grow exponentially and costs continue to decline, the random-access performance of disks has changed very little. Therefore, although large arrays of HDDs may not be required for capacity, they are still necessary to deliver the read performance required by applications or desired by users.
Solid-state storage devices (SSDs) are capable of delivering much higher performance than HDDs, but cost considerably more. In addition, the amount of memory available in servers has increased significantly, so working datasets can be read once from the disk drive and then stored in memory, reducing the number of disk reads.
The falling cost of SSDs and memory allows for hybrid storage systems that use HDDs and the like for capacity but store frequently accessed data in memory or on SSDs. Identifying which data should be stored where, however, is challenging.
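To make the placement problem concrete, the following is a minimal sketch of one naive heuristic: rank blocks by how often they were accessed and fill the fastest tier first. It is offered purely as an illustration, not as the approach of any particular system. The function `assign_tiers`, the tier labels, and the slot counts are all hypothetical.

```python
from collections import Counter


def assign_tiers(access_log, memory_slots, ssd_slots):
    """Place the most frequently accessed blocks in the fastest tier.

    access_log is an iterable of block identifiers, one per access.
    memory_slots and ssd_slots are assumed per-tier capacities, counted
    in blocks. Everything that does not fit stays on HDD.
    """
    counts = Counter(access_log)
    placement = {}
    # most_common() yields blocks from most to least frequently accessed.
    for rank, (block, _count) in enumerate(counts.most_common()):
        if rank < memory_slots:
            placement[block] = "memory"
        elif rank < memory_slots + ssd_slots:
            placement[block] = "ssd"
        else:
            placement[block] = "hdd"
    return placement


# Hypothetical access log: block "a" is hottest, "d" is coldest.
log = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + ["d"]
tiers = assign_tiers(log, memory_slots=1, ssd_slots=2)
print(tiers)
```

A pure frequency count like this ignores recency, access size, and the cost of moving data between tiers, which is part of why the placement decision is difficult in practice.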
Accordingly, there is a need for systems and/or devices with more efficient, accurate, and effective methods for storing data on shared storage systems. Such systems, devices, and methods optionally complement or replace conventional systems, devices, and methods for caching data.