1. Field of the Invention
The present invention relates generally to storage systems and in particular relates to methods and structures for assuring reliable posting of data to disk drives in a storage system by limiting write caching in the disk drive.
2. Discussion of Related Art
As computing applications have evolved to require higher capacity, performance, and reliability, so too have data storage systems evolved to provide ever increasing capacity, performance, and reliability. One well established and popular configuration for storage systems provides improved capacity, performance, and reliability. Such storage systems typically utilize a plurality of independent disk drives and provide a storage controller coupled to all the disk drives. The storage controller presents an interface to a host system making the multiple disk drives appear as a single, high capacity, high performance, high reliability storage device.
Numerous storage system architectures may utilize multiple disk drives logically organized into myriad different topologies. Some storage systems hide the entirety of the storage topology from the attached host systems. Some so-called network storage appliances provide a hierarchical mapping of different levels of storage capacity and performance. Transparent to the attached host systems, data may be migrated by the storage system among the various layers of storage capacity and performance. Some storage network appliances, referred to as near line storage devices, provide archival backup storage for large capacities of data. Often the archival backup storage devices are relatively slow devices such as tape drives or optical media disk drives. To an attached host system, the near line storage system may present an interface similar to that of a standard disk drive—a large disk with significant reliability requirements for writing data.
Still other storage systems are often referred to by the acronym RAID (redundant array of independent disks). RAID storage management techniques operable within the storage controller of the storage system map logical locations or addresses provided by host systems into corresponding physical locations in the plurality of disk drives. Further techniques within the storage controller may distribute data provided by the host system over multiple ones of the plurality of disk drives to improve performance and in such a manner that failure of any single disk drive in the storage system will not cause loss of data. Rather, the storage system may continue operation (though potentially in a slower, degraded mode of operation) until such time as the failed disk drive can be replaced. Such techniques are often referred to as striping and redundancy. Several configurations and techniques are well known for RAID storage systems and are often referred to by a “level” number. RAID levels may include, for example, RAID level 0 (striped), RAID level 1 (mirrored), RAID level 5 (striped with distributed redundancy information), etc. The various RAID levels are well known to provide different approaches for improving performance and/or reliability of the storage system. The use of multiple disks enhances performance through striping and also serves to increase the total capacity of the storage system.
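The striped mapping of logical to physical locations described above can be sketched in simplified form. The function below is an illustrative model of RAID level 0 round-robin striping only; the stripe size, drive count, and function name are assumptions for the sketch and do not reflect any particular controller implementation.

```python
# Illustrative sketch of RAID level 0 striped address mapping.
# STRIPE_BLOCKS and NUM_DRIVES are assumed example parameters.

STRIPE_BLOCKS = 128   # blocks per stripe unit (assumed)
NUM_DRIVES = 5        # disk drives in the array (assumed)

def map_logical_block(lba: int) -> tuple[int, int]:
    """Map a host logical block address to (drive index, physical block)."""
    stripe_unit = lba // STRIPE_BLOCKS       # which stripe unit overall
    offset_in_unit = lba % STRIPE_BLOCKS     # offset within that unit
    drive = stripe_unit % NUM_DRIVES         # round-robin across drives
    physical_block = (stripe_unit // NUM_DRIVES) * STRIPE_BLOCKS + offset_in_unit
    return drive, physical_block
```

Because consecutive stripe units land on different drives, large sequential transfers engage several spindles at once, which is the performance benefit striping provides; redundancy (as in RAID levels 1 and 5) would add mirrored or parity locations to this mapping.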
For simplicity of the discussion here, storage devices, whether disk drives, optical disk devices, tape storage devices, or otherwise, may all be referred to synonymously as “disk drives” or “storage devices” regardless of the particular storage technique utilized within. All such devices may be used in storage subsystems having a storage system controller controlling overall operation of multiple storage devices such that the storage system controller presents a simpler interface to attached host systems. The simplified interface to attached host systems generally presents the collection of multiple disk drives (multiple storage devices) as a single, high reliability storage system. Data logically addressed to the storage system is mapped to physical locations distributed in some fashion over the multiple storage devices to improve reliability, performance, or both.
In all such storage systems, a well-known technique for improving storage system performance is the utilization of cache memory. Cache memory may be provided within the storage system controller coupled to all of the plurality of disk drives or near line storage devices in the storage system. In addition, local cache memory may be provided within the local controller of each of the plurality of storage devices within the storage system. As data is written to the storage system, the host supplied data may be entered into the cache memory of the storage system controller to allow for rapid completion of the host I/O request. Such data recorded in the storage system controller's cache memory and not yet written to the disk drive (often referred to as “dirty data”) is eventually written or posted to appropriate locations in the plurality of disk drives for persistent retention within the storage system. Use of the storage system controller's cache memory for temporarily recording information to be recorded is often referred to as “write caching” and the memory used for such purposes may be referred to as a “write cache”. In addition, as data is read from the storage devices of the storage system it often is stored within the storage system controller's cache memory such that subsequent use of the same data may be more rapidly retrieved from cache memory rather than retrieved again from the storage devices. Such use of storage system controller's cache memory is often referred to as “read caching” and the memory used for read caching may be referred to as a “read cache”. Data recorded in the write cache memory is also available for reading to more rapidly satisfy subsequent host requests for the same data.
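The "dirty data" bookkeeping described above can be sketched as follows. This is a minimal illustrative model, not any particular controller's design; the class and method names are assumptions, and real controllers would place this structure in nonvolatile memory and handle capacity limits, eviction, and error recovery.

```python
# Minimal sketch of write caching with "dirty data" tracking in a
# storage system controller. All names are illustrative assumptions.

class WriteCache:
    def __init__(self):
        self._entries = {}   # lba -> (data, dirty flag)

    def write(self, lba, data):
        # Host write completes as soon as the data lands in the cache;
        # it is "dirty" until posted to the persistent media.
        self._entries[lba] = (data, True)

    def read(self, lba):
        # Read hits are served from cache, whether dirty or clean.
        entry = self._entries.get(lba)
        return entry[0] if entry else None

    def dirty_lbas(self):
        # Blocks still awaiting posting to the disk drives.
        return [lba for lba, (_, dirty) in self._entries.items() if dirty]

    def mark_posted(self, lba):
        # Called once a drive confirms the data reached persistent media.
        data, _ = self._entries[lba]
        self._entries[lba] = (data, False)
```

Note that the same entries serve both roles described above: retained dirty data satisfies the write-caching function, while `read` satisfies subsequent host reads of recently written data.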
As noted, modern disk drives also include substantial local cache memory local to the disk drive itself in addition to the above discussed cache memory in the storage system controller. The local cache memory in the disk drive (typically within the disk drive controller of the disk drive) is used in communicating with a host device such as the storage controller or a host system having a storage controller integrated therein. As data is written by host system I/O requests directed to a storage system, the storage controller may store the supplied write data in its cache memory and then eventually post the information to the local cache memory of each disk drive storing a portion of the data to be written. Since the storage controller may write information received from the host system onto multiple disk drives, each of which may include its own local cache memory used for write caching, the storage controller must provide sufficient write cache memory of its own to retain information posted to each of the multiple disk drives until such time as each disk drive has verified that the information has been flushed from its own local cache memory to the corresponding locations on the persistent storage media. For example, if each of five disk drives used in a storage system has 2 MB of local cache memory that may be used to store write data, the storage controller must provide sufficient space in its cache memory for at least 10 MB of write data to be retained by the storage controller until such time as the written data is known to be permanently recorded on the persistent storage media of each of the five disk drives.
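The sizing relationship in the example above reduces to simple arithmetic: the controller's write cache must be at least as large as the aggregate of the drives' local write caches. The helper below merely restates that lower bound; the function name and figures are illustrative.

```python
# Back-of-envelope sizing of the controller's nonvolatile write cache:
# it must retain at least as much dirty data as the drives' local
# caches can hold in aggregate. Names and figures are illustrative.

def min_controller_write_cache(num_drives: int, drive_cache_bytes: int) -> int:
    """Lower bound on controller NVRAM write cache size, in bytes."""
    return num_drives * drive_cache_bytes

MB = 1024 * 1024
# Five drives with 2 MB of local cache each -> at least 10 MB of NVRAM.
required = min_controller_write_cache(5, 2 * MB)
assert required == 10 * MB
```

Because this bound scales with both drive count and per-drive cache size, and because controller cache is nonvolatile (and thus costly) memory, growth in drive-local caches directly drives up controller cost, which is the design problem the following paragraphs elaborate.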
As new disk drive technology has evolved, and as memory capacity and price have evolved, disk drives now possess larger and larger local cache memory structures within the disk controller of each individual disk drive. As disk drive local cache memory capacities increase, so too must the capacity for write cache memory within an associated storage controller. The memory used for local cache memory in a disk drive's disk controller is typically volatile RAM due to the lower cost associated with a single storage device. By contrast, a storage controller in a storage system typically utilizes nonvolatile memory (“NVRAM”) for its storage controller cache memory so as to prevent data loss due to power failure or other interruption. Nonvolatile memory can be substantially more costly than volatile memory of the same capacity. It is therefore a problem for storage controller designers to effectively manage ever increasing local cache memory sizes associated with individual disk drives controlled by the storage controller. If data is forwarded from the storage system controller to multiple disk drives in a write operation, the storage controller must retain the data in its nonvolatile write cache memory until it verifies that the written data is posted to the persistent storage media of all related disk drives. The write cache of the storage controller would therefore have to be at least as large as the total size of all the write cache portions within each of the disk drive controllers. Otherwise, data may be sent to a disk drive with no room remaining in the storage controller's write cache to retain the write data and thereby assure that it is properly posted to the persistent media of the disk drives.
Since the size of the local cache memory in disk drives may vary dramatically over time as disk drives evolve and may vary between vendors, storage controller designers are confronted with a constant problem of assuring that write data is never lost—especially in high reliability RAID applications.
One common technique to generally address the above discussed problem is to force frequent flushing of all write cache memories (within the storage controller and within each local cache memory of each disk drive coupled thereto) so as to assure all data is posted to the persistent storage media of the disk drives. Another common approach utilizes special commands to force a bypass of the local cache memory in the disk drive (e.g., a forced unit access or FUA SCSI command as typically supported by SCSI disk drives to bypass cache memory within the disk drive unit). Both of these known approaches impose performance penalties on the storage system, either by never using disk drive cache memory in the case of bypass commands or by frequently forcing flushes of disk drive cache memory. In either case the disk drive persistent media may be more frequently accessed and thereby degrade performance of the storage subsystem.
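The two known workarounds above can be contrasted with a toy model. The class below is purely conceptual: `ToyDrive`, its dictionaries, and its method names are assumptions standing in for a drive's volatile write-back cache, its persistent media, the SCSI FUA bit, and the SCSI SYNCHRONIZE CACHE operation; it is not a model of any real drive's command set.

```python
# Conceptual sketch contrasting FUA-style cache bypass with forced
# cache flushing, using a toy drive model. All names are illustrative.

class ToyDrive:
    def __init__(self):
        self.local_cache = {}    # volatile write-back cache (lost on power fail)
        self.media = {}          # persistent storage media

    def write(self, lba, data, fua=False):
        if fua:
            # Forced unit access: bypass the local cache, write media directly.
            self.media[lba] = data
        else:
            # Normal write-back: fast completion, but data is volatile.
            self.local_cache[lba] = data

    def synchronize_cache(self):
        # Forced flush: post all cached data to persistent media.
        self.media.update(self.local_cache)
        self.local_cache.clear()
```

Both options trade performance for safety, as the paragraph above notes: FUA pays the media-access cost on every write, while forced flushing pays it in periodic bursts; either way the persistent media is accessed more frequently than write-back caching alone would require.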
To ensure that data is reliably recorded on the persistent media of the storage devices, the storage system controller keeps data stored in its nonvolatile write cache memory until it can verify that the data has been flushed from the local cache memory of the individual storage devices. However, as the size of local cache memories in the disk drives increases, so too must the size of the corresponding write cache memory in the storage system controller (or other host device). The storage controller designer is therefore faced with an ever changing design problem: either attempt to match the size of the storage controller's cache to that of the various disk drive local cache memories, or degrade performance of the storage system by assuring that data is rapidly posted to the persistent storage media of the disk drives.
It is evident from the above discussion that a need exists for improved cache memory management in the context of storage controllers coupled to disk drives having a local cache memory.