The present invention relates to host caching in data storage systems.
Computer networks, distributed systems, inexpensive and powerful computers, and databases have all contributed to the general need for data storage systems having fast I/O rates. Online transaction processing (OLTP) is particularly concerned with achieving fast I/O rates. Examples of OLTP today include e-commerce, web sites, automated teller machines, and online financial services, all of which typically must support many users making many I/O requests on a shared pool of information.
Data storage systems store data in an array of storage devices such as magnetic disks. However, the access time of a magnetic disk, even in an array, does not come close to matching the access time of the volatile memories used today in data storage systems. The average access time of an individual magnetic disk is 5 ms, while that of a typical dynamic random access memory (DRAM) is 30–100 ns. Unlimited amounts of a fast memory such as DRAM would help achieve faster I/O rates, but cannot be provided economically. Instead, computer architects must exploit the principle of locality, both temporal and spatial, to create the illusion of having unlimited amounts of inexpensive fast memory. Temporal locality means that recently accessed data and instructions are likely to be accessed again in the near future. Spatial locality means that data and instructions with addresses close to each other tend to be accessed together.
To increase I/O rates, data must be retrieved from fast memory whenever possible. DRAM is such a memory, but it is much more expensive than magnetic disk on a per-byte basis, so not all of the data can be stored in DRAM. One solution is to provide DRAM as a cache that stores the most recently used (or most likely to be used) data. When a processor finds the requested data in the cache (a cache hit), the data is read from the DRAM. When the processor does not find the data in the cache (a cache miss), the data must be read from the magnetic disk. As long as the cache hit ratio is high and the miss penalty is small, data storage systems benefit from a caching system.
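The hit and miss behavior described above can be illustrated with a minimal sketch. It assumes a least-recently-used replacement policy and the standard average-access-time estimate (hit time plus miss ratio times miss penalty); neither is specified in the text, and the names `ReadCache` and `average_access_time_ns` are hypothetical. The 100 ns and 5 ms figures are taken from the access times quoted earlier.

```python
from collections import OrderedDict


class ReadCache:
    """Small LRU read cache in front of a slower backing store (illustrative only)."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store  # stands in for the magnetic disk
        self.blocks = OrderedDict()         # stands in for the DRAM cache
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            # Cache hit: serve from fast memory and mark as most recently used.
            self.hits += 1
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Cache miss: fetch from the slow store, then keep a copy in the cache.
        self.misses += 1
        data = self.backing_store[block_id]
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def average_access_time_ns(hit_ratio, hit_time_ns, miss_penalty_ns):
    """Estimate: every access pays the hit time; misses also pay the penalty."""
    return hit_time_ns + (1.0 - hit_ratio) * miss_penalty_ns


disk = {n: f"block-{n}" for n in range(100)}
cache = ReadCache(capacity=4, backing_store=disk)
for block_id in [1, 2, 1, 3, 1, 2, 4, 1]:
    cache.read(block_id)

# Assumed figures: 100 ns DRAM hit time, 5 ms (5,000,000 ns) disk miss penalty.
estimate = average_access_time_ns(cache.hit_ratio(), 100.0, 5_000_000.0)
print(f"hit ratio {cache.hit_ratio():.2f}, estimated access time {estimate:.0f} ns")
```

Because the miss penalty is several orders of magnitude larger than the hit time, even a small drop in hit ratio dominates the estimate, which is why the hit ratio matters so much in practice.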
Caches can be implemented in different components of a data storage system. In many systems, caching is associated with a disk array controller. However, caching in the disk array controller is slow compared to host caching. In some host caching designs, the main memory includes a cache allocation and resides on the CPU-memory bus with the processor(s), permitting fast communication. However, if the host processor acts as the cache controller, cache management consumes host processor resources that may already be busy handling I/O requests. If the host is too busy with activities other than cache management, the memory and CPU time available for caching will be overly constrained, resulting in suboptimal host system performance.
Although reads dominate processor cache access, writes distinguish cache designs. Because a write-through cache requires that the data be written to both the cache and the storage devices before the write is acknowledged, most data storage systems primarily employ a write-back cache, in which the data is written to the cache and a write acknowledgment is returned before the data is written to the storage devices, thus improving system performance. However, the write data in a DRAM cache will be lost if there is a power interruption or failure before the modified data is written to the storage devices. A battery can preserve the data if external power is interrupted. However, as memories have become large, it is prohibitively expensive to provide battery backup for the entire volatile memory. Today, 4 GB to 128 GB caches are commonly used in systems.
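The difference between the two write policies can be sketched as follows. This is an illustrative model only: the class names, the `destage` and `power_loss` methods, and the use of plain dictionaries for the cache and the storage devices are all assumptions made for the example.

```python
class WriteThroughCache:
    """Acknowledges a write only after both the cache and the disk hold the data."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def write(self, block_id, data):
        self.cache[block_id] = data
        self.disk[block_id] = data  # slow step happens before the acknowledgment
        return "ack"


class WriteBackCache:
    """Acknowledges a write once the cache holds the data; the disk is updated later."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}
        self.dirty = set()  # blocks modified in cache but not yet on disk

    def write(self, block_id, data):
        self.cache[block_id] = data
        self.dirty.add(block_id)
        return "ack"  # returned before any disk write

    def destage(self):
        """Copy all modified blocks to disk and mark them clean."""
        for block_id in sorted(self.dirty):
            self.disk[block_id] = self.cache[block_id]
        self.dirty.clear()

    def power_loss(self):
        """Simulate losing volatile DRAM; return the block ids whose data was lost."""
        lost = set(self.dirty)
        self.cache.clear()
        self.dirty.clear()
        return lost


disk = {}
write_back = WriteBackCache(disk)
write_back.write(1, "first")
write_back.destage()             # block 1 is now safe on disk
write_back.write(2, "second")    # acknowledged, but only in volatile cache
print("lost on power failure:", write_back.power_loss())
```

In this model the block that was acknowledged but not yet destaged is the one lost when power fails, which is the exposure that battery backup or non-volatile memory is meant to cover.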
One disk storage subsystem includes a disk controller with a microprocessor coupled to a cache memory. A cache memory control circuit is coupled to volatile and non-volatile memory modules. In response to a write command received from a host computer, the microprocessor allocates memory blocks in the non-volatile cache memory modules. After this allocation, the disk controller selects and allocates corresponding memory blocks in the volatile memory modules. The host-supplied write data is then stored in the allocated memory blocks of the non-volatile memory module. Immediately thereafter, the subsystem sends an acknowledgment signal to the host computer indicating that the write operation is complete. The cache memory control circuit then performs a direct memory access (DMA) operation to copy the write data from the allocated memory blocks of the non-volatile memory module to the corresponding allocated memory blocks of the volatile memory module. The write data is then stored on a disk drive, at which point the allocated memory blocks of the non-volatile memory are de-allocated and thus made available for further use.
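The order of operations in this controller-based write path can be traced with a short sketch. The class name, the step labels, and the dictionary-based memories are hypothetical, and the choice to destage from the volatile copy is an assumption, since the description does not say which copy is written to the disk drive.

```python
class ControllerCacheSketch:
    """Illustrative trace of the controller write path described above."""

    def __init__(self):
        self.nonvolatile = {}  # non-volatile cache memory module
        self.volatile = {}     # volatile cache memory module
        self.disk = {}         # disk drive
        self.steps = []        # ordered record of what happened

    def write(self, block_id, data):
        # 1. Allocate a block in non-volatile memory.
        self.nonvolatile[block_id] = None
        self.steps.append("allocate_nonvolatile")
        # 2. Allocate the corresponding block in volatile memory.
        self.volatile[block_id] = None
        self.steps.append("allocate_volatile")
        # 3. Store the host-supplied write data in non-volatile memory.
        self.nonvolatile[block_id] = data
        self.steps.append("store_nonvolatile")
        # 4. Acknowledge the host; the data is not yet on disk.
        self.steps.append("acknowledge_host")
        # 5. DMA copy from non-volatile to volatile memory.
        self.volatile[block_id] = self.nonvolatile[block_id]
        self.steps.append("dma_copy_to_volatile")
        # 6. Destage to disk (assumption: written from the volatile copy).
        self.disk[block_id] = self.volatile[block_id]
        self.steps.append("write_to_disk")
        # 7. De-allocate the non-volatile block so it can be reused.
        del self.nonvolatile[block_id]
        self.steps.append("deallocate_nonvolatile")
        return list(self.steps)


controller = ControllerCacheSketch()
for step in controller.write(42, b"payload"):
    print(step)
```

The trace makes the two allocation steps and the final de-allocation visible as per-write overhead, and shows that between the acknowledgment and the disk write the non-volatile module holds the only durable copy.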
There are several problems with this disk-controller-based cache design. First, there is the overhead of allocating and de-allocating addresses in the volatile and non-volatile memory blocks during the write operation. Second, if the disk controller fails between the write acknowledgment and destaging, the only copy of the data is in the non-volatile memory of the failed disk controller. Thus, the disk controller must be removed from service so that the non-volatile memory holding the only copy of the data can be physically transferred to a different disk controller, assuming one is available. Third, the disk-controller-based cache cannot retrieve data as rapidly as a host cache.