1. Field of the Invention
This invention generally relates to digital data storage systems and more specifically to digital data storage systems that provide redundant storage by mirroring data.
2. Description of Related Art
Many approaches have been developed for protecting critical data stored in a digital data system against loss resulting from power failures or transients, equipment malfunctions and other causes. In one approach, normal operations on a data processing system stop so that all or selected portions of the stored data can be transferred to tape or other backup media, thereby backing up the memory system by providing a "snapshot" of the memory system at the time of the backup. Each successive backup may then copy onto the backup media either the data in the entire system or only the data or files that changed since a prior backup. This approach is still used in many data processing systems. However, even in personal computer systems, such a backup may require an hour or more to complete. It may also take a significant time to restore the information, particularly if a storage system, such as a disk drive, fails completely.
While such approaches may be acceptable for providing redundancy in home and small office systems, in recent years there has arisen another category of data processing system that requires essentially full-time availability and that incorporates large memory systems. Such data storage systems often include plural disk controllers, each of which controls multiple disk drives or other storage systems. Conventional backup procedures simply cannot be used with such systems without significant interruptions that can lead to unacceptable intervals during which the data processing system is not available for its normal operations.
In some such prior art systems files are written to a specific disk drive, as a primary disk drive, through its corresponding disk controller. Additionally the writing controls are modified to write the file to another disk, as a secondary disk drive, connected to the same or another disk controller. This provides full redundancy. However, the data processing system must perform two writing operations sequentially. Sequential writing operations can affect the operation of the data processing system. Each copy is stored randomly on each disk and can even become fragmented. This can produce intolerably long retrieval times. Moreover, in such systems all normal reading operations involve the primary disk drive. No attempt is made to read from the secondary disk drive unless a problem occurs in the primary disk drive.
U.S. Pat. No. 5,390,313 issued to Yanai et al., and assigned to the assignee of this invention, discloses an approach for providing data redundancy. The system includes at least one pair of disk storage devices. Each device has a plurality of generally identical data records. These are "mirrored" disks or storage media. Each medium includes position indicators for providing one or more indications of rotational position of each of the rotating data storage media with respect to its associated fixed position read/write mechanism. A position monitor receives the rotational position indications from each rotating data storage medium and computes and monitors the rotational position of each rotating storage medium with respect to its associated read/write mechanism. After receiving a request for access to one or more data records stored on the mirrored pair of rotating data storage media, the system computes projected data access times for retrieving the requested data record on each of the rotating data storage media and directs retrieval of the requested data record to the rotating data storage medium having the shortest projected data access time based upon the rotational position and state of the respective data storage medium. Consequently, unlike the previously discussed file copy systems, data can be read from either of the mirrored memories.
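The selection by shortest projected access time can be illustrated with a minimal sketch. It is not taken from the cited patent: it considers rotational delay only, expresses positions as fractions of one revolution, and uses a hypothetical revolution time and hypothetical function names.

```python
# Illustrative only: pick the mirrored disk whose platter will bring the
# requested record under its read/write mechanism soonest.
REVOLUTION_MS = 10.0  # hypothetical time for one full revolution


def rotational_delay_ms(current_pos: float, record_pos: float) -> float:
    """Time until record_pos rotates under the head.

    Both positions are fractions of a revolution in the range [0, 1).
    """
    return ((record_pos - current_pos) % 1.0) * REVOLUTION_MS


def pick_mirror(positions: dict, record_pos: float) -> str:
    """Return the name of the disk with the shortest projected delay.

    positions maps a disk name to its current rotational position.
    """
    return min(
        positions,
        key=lambda name: rotational_delay_ms(positions[name], record_pos),
    )


# The two mirrors hold the same record at the same angular position but
# are at different points in their rotation.
positions = {"A": 0.25, "B": 0.75}
choice = pick_mirror(positions, 0.5)  # A waits 2.5 ms, B waits 7.5 ms
```

A fuller model would add seek time and the operating state of each device to the projected access time, as the description above indicates.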
U.S. Pat. No. 5,212,784 issued to Sparks discloses another type of automated backup system in which separate logical buses couple a primary controller to a set of paired mirrored memories or shadowed primary data storage devices. A backup device controller attaches to one of the logical buses and a backup device. The primary controller writes data to both the primary data storage devices to produce mirrored copies. In a backup mode, the backup device controller transfers data that it reads from a designated one of the primary data storage devices to the backup storage device. After the backup is complete, the primary controller re-synchronizes the primary data storage devices so that data that has been written on the continuously operational data storage device is copied onto the designated data storage device. In an alternative embodiment separate logical buses couple the primary controller to at least a set of triplet or quadruplet mirrored or shadowed primary data storage devices. Triplet devices permit backup operation while retaining the redundancy characteristic of the mirrored storage devices. Quadruplet devices permit continuous backup operations of two alternating storage devices retaining the redundancy characteristic of mirrored storage devices.
U.S. Pat. No. 5,423,046 issued to Nunnelley et al. discloses a high capacity data storage system with a large array of small disk files. Three storage managers control (1) the allocation of data to the array, (2) access to the data and (3) the power status of disk files within the disk array. More specifically, the allocation manager controls, inter alia, the type of protection desired to include redundancy by mirroring. The access manager interprets incoming read requests to determine the location of the stored data. That is, the access manager determines which cluster or clusters in the data memories contain the requested data set and then passes that cluster list to the power manager. The power manager determines which disk files must be activated to fulfill the request.
U.S. Pat. No. 5,392,244 issued to Jacobson et al. discloses memory systems with data storage redundancy utilizing both mirroring and parity redundancy. The memory system places more critical data in the mirrored areas and less frequently accessed data in the parity area. Consequently the system effectively tunes the storage resources of the memory system according to the application or user requirements. Alternatively the tuning can be made on the basis of accesses to the data such that the mirrored areas store recently accessed data while the parity RAID area stores the remaining data.
U.S. Pat. No. 5,432,922 issued to Polyzois et al. discloses a storage system using a process of alternating deferred updating of mirrored storage disks. Data blocks or pages to be written are accumulated and sorted into an order for writing on the disk efficiently. The individual disks of a mirrored pair are operated out of phase with each other so that while one disk is in the read mode the other is in the write mode. Updated blocks are written out in sorted order to the disk that is in the write mode. Read performance is maintained by directing all read operations to the other disk, which is in the read mode. When a batch of updates has been applied to one disk of a mirrored pair, the mirrored disks switch their modes and the other disk, which had been in the read mode, is updated.
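The alternating scheme can be pictured with the following minimal sketch. It is an illustration under stated assumptions, not the method of the cited patent: disks are modeled as dictionaries, the class and method names are hypothetical, and the sketch does not address how a read of a block whose update is still deferred would be served.

```python
# Illustrative only: alternating deferred updates on a mirrored pair.
class AlternatingMirror:
    def __init__(self):
        self.disks = [{}, {}]    # block number -> data, one dict per disk
        self.pending = [[], []]  # deferred updates queued for each disk
        self.write_disk = 0      # index of the disk currently in write mode

    @property
    def read_disk(self) -> int:
        return 1 - self.write_disk

    def write(self, block: int, data: str) -> None:
        # Defer the update; each disk applies it during its own write phase.
        for queue in self.pending:
            queue.append((block, data))

    def read(self, block: int):
        # All reads go to the disk that is in read mode.
        return self.disks[self.read_disk].get(block)

    def apply_batch_and_switch(self) -> list:
        # Sort the accumulated updates by block number, apply them to the
        # write-mode disk, then swap the roles of the two disks.
        batch = sorted(self.pending[self.write_disk], key=lambda u: u[0])
        for block, data in batch:
            self.disks[self.write_disk][block] = data
        self.pending[self.write_disk] = []
        self.write_disk = self.read_disk
        return [block for block, _ in batch]


pair = AlternatingMirror()
pair.write(7, "c")
pair.write(2, "a")
first_order = pair.apply_batch_and_switch()   # disk 0 updated in sorted order
second_order = pair.apply_batch_and_switch()  # disk 1 catches up
```

Because Python's sort is stable, two deferred updates to the same block are applied in arrival order, so the later value wins.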
U.S. Pat. No. 5,435,004 issued to Cox et al. discloses yet another redundant storage variant. A computerized data backup system dynamically preserves a consistent state of primary data stored in a logical volume of a disk volume management system. A file system command invokes a cloning of the logical volume thereby reserving a portion for shadow-paged blocks. A read/write translation map establishes a correspondence between unshadowed and shadowed pages in a reserved portion. Upon generating a read command for a page in a logical volume, a map search detects that a shadowed page is allocated to the shadowed page blocks corresponding to the page and effects the read. Backup occurs while the system is operating thereby facilitating reading from the non-shadow page blocks during such a backup.
In still another system, which has been utilized by the assignee of this invention, each of two mirrored individual disk drives, as physical disk volumes, is divided into blocks of consecutive tracks in order. Typically the number of tracks in each block is fixed and is not dependent upon any boundary with respect to any file or data stored on the blocks. A typical block size might include four tracks. Assuming for explanation that the blocks are numbered consecutively (i.e., 0, 1, 2, . . . ), block 0 would comprise tracks 0 through 3; block 1, tracks 4 through 7; etc. During each reading operation, the data system reads all data in odd-numbered blocks (i.e., blocks 1, 3, . . . ) from the first mirrored physical disk drive and all data in even-numbered blocks (i.e., blocks 0, 2, 4, . . . ) from the second mirrored physical disk drive.
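The track-to-block mapping and the odd/even drive assignment just described can be expressed as a short sketch. The function names and the labels "first" and "second" are illustrative only; the four-track block size is the example value given above.

```python
# Illustrative only: fixed-size track blocks interleaved across two mirrors.
TRACKS_PER_BLOCK = 4  # example block size from the description


def block_of(track: int) -> int:
    """Block number containing the given track (tracks 0-3 form block 0)."""
    return track // TRACKS_PER_BLOCK


def drive_for_track(track: int) -> str:
    """Odd-numbered blocks are read from the first mirrored drive and
    even-numbered blocks from the second."""
    return "first" if block_of(track) % 2 == 1 else "second"


# Tracks 4 through 7 form block 1 (first drive); track 8 starts block 2.
assignments = [(t, block_of(t), drive_for_track(t)) for t in (0, 3, 4, 7, 8)]
```

The assignment depends only on the track number, not on which file or data set occupies the track.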
However, when a read operation in the foregoing system recovers a data block that resides on consecutive blocks of tracks, for example, track blocks 1 and 2, the reading operation from the first physical disk drive must stop at track 7. Then the second disk drive must move its head to the appropriate track, track 8 in this example, to retrieve the next block. This interval, or "seek time", and a corresponding "latency", which represents the time required for the beginning of a track to reach a read/write head, together determine the total access time. Whereas continuing the reading operation with the first disk drive might introduce a one-track seek time and one-revolution latency, switching to the second drive could involve an increase to a full maximum seek time and up to a one-revolution latency. Such a total access time will interrupt the transfer and can significantly affect the overall rate at which data transfers from the physical disk drives in some applications.
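To make the handoff concrete, the following sketch splits a read over a range of tracks into the per-drive segments that the interleaved scheme would produce. It assumes the four-track blocks and odd/even assignment described earlier; the function name and the tuple layout are hypothetical.

```python
# Illustrative only: show where a multi-track read changes drives.
def split_read(first_track: int, last_track: int, tracks_per_block: int = 4):
    """Return a list of (drive, start_track, end_track) segments."""
    segments = []
    track = first_track
    while track <= last_track:
        block = track // tracks_per_block
        block_end = (block + 1) * tracks_per_block - 1
        end = min(block_end, last_track)
        # Odd-numbered blocks come from the first drive, even from the second.
        drive = "first" if block % 2 == 1 else "second"
        segments.append((drive, track, end))
        track = end + 1
    return segments


# Reading track blocks 1 and 2 (tracks 4 through 11) forces one handoff:
# the first drive stops at track 7 and the second must seek to track 8.
segments = split_read(4, 11)
handoffs = len(segments) - 1
```

Each handoff is a point at which the second drive's seek time and latency, rather than a short continuation on the same drive, are added to the transfer.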
Other mirrored disk systems have used a "nearest server" algorithm to select one of the mirrored drives. In such a system, each read command initiates a process that determines which of two mirrored drives will be available first to begin a reading operation. The process can use any or all of several parameters, such as current head position to determine relative seek times, or whether one of the mirrored storage systems is then involved in some operation. This process is repeated for each read command.
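A "nearest server" selection of this kind might be sketched as follows. The sketch is a simplified assumption rather than any particular product's algorithm: it estimates the start time from only two parameters, the remaining busy time and a seek time proportional to head travel, and the constant and names are hypothetical.

```python
# Illustrative only: choose the mirror that can begin reading first.
from dataclasses import dataclass

SEEK_MS_PER_TRACK = 0.01  # hypothetical seek cost per track of head travel


@dataclass
class DriveState:
    name: str
    head_track: int       # current head position
    busy_ms: float = 0.0  # time until the drive's current operation ends


def estimated_start_ms(drive: DriveState, target_track: int) -> float:
    """Estimated delay before the drive could begin the requested read."""
    seek_ms = abs(drive.head_track - target_track) * SEEK_MS_PER_TRACK
    return drive.busy_ms + seek_ms


def nearest_server(drives, target_track: int) -> DriveState:
    """Evaluated anew for every read command."""
    return min(drives, key=lambda d: estimated_start_ms(d, target_track))


idle_pair = [DriveState("first", 100), DriveState("second", 900)]
chosen_idle = nearest_server(idle_pair, 200).name   # shorter seek wins

busy_pair = [DriveState("first", 100, busy_ms=50.0), DriveState("second", 900)]
chosen_busy = nearest_server(busy_pair, 200).name   # busy drive loses
```

The choice reflects only the momentary physical state of each drive and takes no account of the data being read.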
Collectively the foregoing prior art discloses various approaches for minimizing the risk of data loss in a data processing system, particularly through the use of mirrored memory devices. This prior art also discloses various approaches for enabling reading operations from both physical disk drives in a mirrored pair. However, in these systems the decision on which of the mirrored pair will be used during a reading operation rests generally on the physical attributes of the disk drive rather than the data content of the drive. For example, the assignee's prior art system divides the physical drive into arbitrary blocks of contiguous disk tracks and then interleaves the reading operations according to the location of the data on a particular track. Another of the assignee's systems selects a particular one of the mirrored physical disk drives based upon the time it will take to initiate an actual transfer. Still others make a determination based upon whether one or the other of the mirrored disk pair is involved in a backup operation or in a writing operation, such that a writing or backup operation with one physical disk in the mirrored pair causes the reading operation to occur from the other physical disk drive.
While these approaches generally provide some improvement in overall operations, experience demonstrates that these approaches can actually slow the effective transfer rate of a particular block of data as defined in a file or in a like block in other environments that are now becoming more prevalent in commercial applications. Moreover, once a particular approach is adopted for a physical disk drive, it applies to all the data on that physical disk drive. Consequently if a particular approach were selected based upon anticipated conditions associated with a particular application and the application subsequently were to change, it is likely that performance for the new conditions would suffer. Further, a physical disk drive storing different data sets or files could have different requirements. For example, reading operations for one application might retrieve data for a large block of successive locations whereas another application might read data from small or incremental blocks taken from random locations. In such a situation, a particular reading process optimized for one application might not be optimal for the other. Nevertheless, the approach was still to select one process for the entire physical disk drive even though less than optimal performance might be realized.