Information is critical to the success of nearly every kind of business imaginable. Until recently, direct-attached storage typically provided capacity to applications running on a server. This usually meant one or more disk drives, located inside the server or attached to it externally, connected via a Small Computer System Interface (SCSI). Today, businesses are finding that these legacy storage architectures no longer meet their needs. In addition to a dramatic increase in the need for capacity, today's businesses may require data sharing, high performance, high availability and cost control.
Storage consolidation is one way in which the expanding needs of businesses are being addressed. Storage consolidation means centralizing and sharing storage resources among a number of application servers. Storage consolidation is often enabled by a Storage Area Network (SAN). A SAN provides high-speed connections between servers and storage units so that many servers can share capacity residing on a single storage subsystem. One drawback, however, is the cost: these storage subsystems are expensive.
Another approach to solving capacity problems is to improve performance. Disk drives are notoriously slow because they are mechanical devices, i.e., the disk has to spin, and the read/write heads have to move across the disk. The resulting latencies are enormous in comparison to the speeds at which memory accesses can occur. To address these performance issues, caching is frequently employed.
Caching is a way of speeding up access to frequently used information. A cache can be a reserved section of main memory or it can be an independent high-speed storage device.
A memory cache is a small block of high-speed memory located between the CPU and the main memory. By keeping as much frequently-accessed information as possible in high-speed memory, the processor avoids the need to access slower memory. A disk cache is a mechanism for reducing the time it takes to read from or write to a hard disk. The disk cache may be part of the hard disk or it can be a specified portion of memory.
Disk caching works on the same principle as memory caching, that is, the most recently accessed data from the disk (as well as a certain number of sectors adjacent thereto) is stored in a memory buffer. When an application needs to access data from the disk, the disk cache is checked first to see if the data is there. Disk caching can dramatically improve the performance of applications, because accessing a byte of data in memory can be thousands of times faster than accessing a byte of data on a hard disk. Disk caching is typically done on a physical disk or file basis.
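The disk caching behavior described above can be illustrated with a minimal sketch. It is not taken from any particular storage subsystem: the class name `DiskCache`, the parameters `capacity` and `read_ahead`, and the least-recently-used eviction policy are all assumptions made for illustration, and the "disk" is simply a Python list of sector payloads.

```python
from collections import OrderedDict


class DiskCache:
    """Illustrative read cache: keeps recently read sectors plus a few
    adjacent sectors in a memory buffer (hypothetical names throughout)."""

    def __init__(self, disk, capacity=8, read_ahead=2):
        self.disk = disk              # backing store: list of sector payloads
        self.capacity = capacity      # max sectors held in the buffer (assumed)
        self.read_ahead = read_ahead  # adjacent sectors fetched on a miss (assumed)
        self.buffer = OrderedDict()   # sector number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, sector):
        # Check the cache first, as the text describes.
        if sector in self.buffer:
            self.hits += 1
            self.buffer.move_to_end(sector)  # mark as most recently used
            return self.buffer[sector]

        # Miss: go to the (slow) disk and also pull in adjacent sectors.
        self.misses += 1
        last = min(sector + self.read_ahead, len(self.disk) - 1)
        # Load in reverse so the requested sector ends up most recent.
        for s in range(last, sector - 1, -1):
            self.buffer[s] = self.disk[s]
            self.buffer.move_to_end(s)

        # Evict least recently used sectors if the buffer is over capacity.
        while len(self.buffer) > self.capacity:
            self.buffer.popitem(last=False)
        return self.disk[sector]


disk = [f"sector-{i}" for i in range(16)]
cache = DiskCache(disk, capacity=4, read_ahead=2)
for n in (5, 6, 7):
    cache.read(n)
print(cache.hits, cache.misses)  # prints: 2 1
```

In this sketch, the read of sector 5 misses and brings sectors 6 and 7 into the buffer, so the next two reads are served from memory. Note that `capacity` and `read_ahead` apply to the whole cache, which mirrors the one-size-fits-all behavior discussed in the following paragraphs.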
Typically, either all the disks in a storage subsystem are cached or none are, and all are cached in the same way. Hence, if one client does not require caching, but another client does, client one's disk will have to be cached so that client two's disk can be cached. As another example, perhaps each of 100 clients is assigned 1/100 of the available memory space for caching, but client one's data is accessed in much larger segments than is client two's data. Regardless, both clients' data will be accessed in the same way (typically by accessing a certain number of sectors or clusters).
Because disk caching is done on a physical disk or file basis, inefficient use of storage resources may result. For example, one client may be assigned disk drive one for data storage and another client may be assigned disk drive two for data storage. If client one needs more space for storage, client one cannot use part of client two's available disk space, because all of disk two is assigned to client two. Another disk drive (e.g., disk three) must be assigned to client one, leading to potentially inefficient use of storage resources: client two might be using only 20% of its storage capacity and client one might be using only 50% of its storage capacity (disks one and three).
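The inefficiency in this example can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that all three disk drives have the same capacity; the function names and dictionaries are hypothetical and are not part of any described system.

```python
# Assumption: disks one, two and three all have equal capacity.
DISKS_ASSIGNED = {"client one": 2, "client two": 1}   # disks one and three; disk two
PERCENT_USED = {"client one": 50, "client two": 20}   # figures from the example


def overall_utilization_percent(disks_assigned, percent_used):
    """Percentage of all assigned disk capacity that actually holds data."""
    total_disks = sum(disks_assigned.values())
    used = sum(disks_assigned[c] * percent_used[c] for c in disks_assigned)
    return used / total_disks


def disks_needed_if_pooled(disks_assigned, percent_used):
    """Disks required if free space could be shared between clients."""
    used = sum(disks_assigned[c] * percent_used[c] for c in disks_assigned)
    return -(-used // 100)  # ceiling division on whole disks


print(overall_utilization_percent(DISKS_ASSIGNED, PERCENT_USED))  # prints: 40.0
print(disks_needed_if_pooled(DISKS_ASSIGNED, PERCENT_USED))       # prints: 2
```

Under the equal-capacity assumption, three disks are assigned but only 40% of their combined capacity is in use, and the same data would fit on two disks if capacity could be shared across clients.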
Finally, caching systems are not readily tunable. In order to change the caching characteristics, the system typically must be taken down and re-initialized. Hence, disk caching can be inflexible and inefficient.
It would be helpful if caching could be based on client needs rather than on physical devices. It would also be helpful if storage usage and caching characteristics could be dynamically tuned based on client data usage patterns. The present invention addresses these needs.