In computing architectures that use externally attached storage such as Network Attached Storage (NAS) or Storage Area Networks (SANs), there is a growing mismatch between the increasing speed of compute servers and the ability of these storage systems to deliver data in a timely fashion. This inability of storage systems to keep up with fast compute servers causes applications to stall and overall system throughput to plateau or even regress under significant load.
When looking more closely at the root causes of this scalability problem, one common factor is the latency of fetching data from a disk drive, in particular the rotation and seek time. While drives can deliver large contiguous amounts of data with an initial latency of 1-5 ms in seek time (moving the drive heads to the correct location on disk), frequent access to non-contiguous data can cost on the order of 40 ms per access. For datasets that involve a lot of randomly accessed data (such as relational databases), the drive seek time becomes a major bottleneck in delivering data in a timely fashion.
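The effect of these latencies on delivered throughput can be illustrated with a rough calculation. The following Python sketch is illustrative only: the 40 ms and 5 ms figures come from the discussion above, while the 8 Kbyte block size, the 100 Mbyte contiguous read, and the 50 Mbytes/s sustained transfer rate are assumptions chosen for the example and are not taken from any particular drive.

```python
# Rough throughput comparison: random versus contiguous disk access.
# BLOCK_BYTES, CONTIGUOUS_BYTES and TRANSFER_RATE_BPS are assumed values.

def random_access_throughput(block_bytes, access_latency_s):
    """Bytes per second when every block pays the full access latency."""
    return block_bytes / access_latency_s

def sequential_throughput(total_bytes, initial_latency_s, transfer_rate_bps):
    """Bytes per second for one contiguous read after a single initial seek."""
    return total_bytes / (initial_latency_s + total_bytes / transfer_rate_bps)

BLOCK_BYTES = 8 * 1024        # assumed block size for random accesses
RANDOM_LATENCY_S = 0.040      # about 40 ms per non-contiguous access (from the text)
INITIAL_SEEK_S = 0.005        # upper end of the 1-5 ms initial latency (from the text)
CONTIGUOUS_BYTES = 100e6      # assumed size of one contiguous read
TRANSFER_RATE_BPS = 50e6      # assumed sustained transfer rate

random_bps = random_access_throughput(BLOCK_BYTES, RANDOM_LATENCY_S)
seq_bps = sequential_throughput(CONTIGUOUS_BYTES, INITIAL_SEEK_S, TRANSFER_RATE_BPS)

print(f"random:     {random_bps / 1e6:.2f} Mbytes/s")
print(f"contiguous: {seq_bps / 1e6:.2f} Mbytes/s")
```

Under these assumed figures, random access delivers roughly 0.2 Mbytes/s while the contiguous read approaches the assumed transfer rate, a gap of more than two orders of magnitude.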
Traditional attempts to solve this problem include adding a hierarchy of RAM-based data caches in the data path. This conventional approach is illustrated in FIG. 1. As shown in FIG. 1, when a compute server 110 attempts to access data from storage system 102 via a network 120, there are typically at least three different caches in the overall data path. A hard drive data cache 108 provides about 8 Mbytes of cache, a storage system cache 106 provides between about 128 Mbytes and 16 Gbytes, and a compute server data cache 112 provides between about 100 Mbytes and 2 Gbytes (typical for a lightly loaded system).
While such caches are generally beneficial, certain drawbacks remain. For example, the performance problems mentioned above still occur when the active data set is being accessed randomly or is too large to fit into the caches normally present.
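A simplified model suggests why a cache that is small relative to the active data set gives little benefit under random access. The sketch below assumes uniformly random accesses, so that the hit rate is approximately the ratio of cache size to data set size; this model, the 1 Tbyte data set, and the 0.1 ms cache latency are assumptions made for illustration, while the 16 Gbyte cache size and 40 ms disk latency are the figures mentioned above.

```python
# Simplified average-latency model for a single cache in front of a disk.
# Assumes uniformly random accesses; DATASET_BYTES and CACHE_LATENCY_S are assumed.

def hit_rate(cache_bytes, dataset_bytes):
    """Fraction of accesses served from cache under uniform random access."""
    return min(1.0, cache_bytes / dataset_bytes)

def average_latency(hit, cache_latency_s, disk_latency_s):
    """Expected latency per access given a cache hit fraction."""
    return hit * cache_latency_s + (1.0 - hit) * disk_latency_s

GIB = 1024 ** 3
CACHE_BYTES = 16 * GIB        # upper end of the storage system cache (from the text)
DATASET_BYTES = 1024 * GIB    # assumed active data set of about 1 Tbyte
CACHE_LATENCY_S = 0.0001      # assumed latency of a cache hit
DISK_LATENCY_S = 0.040        # about 40 ms per random disk access (from the text)

h = hit_rate(CACHE_BYTES, DATASET_BYTES)
avg_latency_s = average_latency(h, CACHE_LATENCY_S, DISK_LATENCY_S)

print(f"hit rate:        {h:.4f}")
print(f"average latency: {avg_latency_s * 1000:.1f} ms")
```

With these assumed values fewer than 2% of accesses hit the cache, and the average latency remains close to the 40 ms disk access time.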
A number of companies have created caching products that attempt to address this problem through custom hardware solutions. Examples include RAMSAN from Texas Memory Systems (http://www.superssd.com/default.asp) and the e- and n-series products from Solid Data (http://www.soliddata.com/). These products are inadequate because they rely on solid-state disk technology, which tends to be both expensive and limited in maximum storage size.