The invention relates generally to the field of storing data on computer storage devices such as disks. More particularly, the invention provides a technique for storing direct images of data on asymmetrically sized disks by mirroring data contained on a disk or disks of one capacity onto a disk or disks of another capacity.
Data stored on storage media such as disks must be protected from adverse events including human errors, equipment failures and adverse environmental conditions. Additionally, the ability of a disk system to maintain immediate on-line access to data contained on the disk despite a failure has become important with the proliferation of on-line, interactive computing.
One of the methods of storing data developed to address these needs is RAID (Redundant Arrays of Independent Disk drives). Typically, RAID employs a number of homogeneous drives to derive lost or corrupted data from other members of a set. Various schemes describing how data and redundant data are mapped across the multiple disks of an array to provide data availability and protection are classified as RAID Levels 1-6.
RAID can provide redundancy in the form of mirroring (RAID 1) or in the form of parity (RAID 3, 4, 5 and 6). Mirroring of data in a RAID 1 implementation involves writing an exact image of the data on a second disk. Typically, implementations of RAID 3, 4, 5, or 6 involve the use of at least three disks of identical capacity, where at least two disks are used for writing data and one disk is used to store parity data. In other words, parity data resides on a disk other than the two or more disks containing the data from which the parity was generated. With parity-based RAID implementations, redundancy of data (overhead) can be reduced from 100 percent (the case for mirroring) to between 10 and 33 percent. Parity-based RAID implementations may suffer from poor performance, even during normal (non-failure) conditions, because of the need to generate and write parity during a write operation. During abnormal conditions, poor performance is exacerbated by the need to regenerate or recalculate data using parity data. Performance in mirrored systems is typically better than in parity systems because data does not need to be regenerated; it just needs to be read from a different disk. The disadvantage of mirroring is that for each disk mirrored, a second identical disk must be purchased.
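The difference in recovery cost between parity and mirroring can be illustrated with a minimal sketch. It assumes simple byte-wise XOR parity over two data disks and one parity disk, as is common for single-parity RAID levels; the block contents and the helper name xor_blocks are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Parity-based layout: two data disks plus one parity disk (hypothetical contents).
data_disks = [b"\x0f\xa0\x33", b"\xf0\x0a\x55"]
parity = xor_blocks(data_disks)

# Disk 0 fails: its contents must be recomputed from every surviving disk.
rebuilt_from_parity = xor_blocks([data_disks[1], parity])

# Mirrored layout: the lost block is simply read back from the mirror copy.
mirror = [bytes(d) for d in data_disks]
rebuilt_from_mirror = mirror[0]
```

In the parity case, the rebuild reads all surviving members and performs a calculation; in the mirrored case, it is a single read with no calculation.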
In view of the foregoing, there is a need for a storage system that overcomes the drawbacks of the prior art.
The present invention is directed to improved systems and methods for storing data, wherein data stored on one or more disks of a first capacity is mirrored to one or more disks of a second, different capacity. The invention effectively provides a new level of RAID.
According to the invention, one or more disk drives of a first capacity may be coupled to create a virtual volume. In one embodiment, one or more disks of a second, larger capacity are then used to provide a single larger volume or multiple larger volumes that serve as the mirroring drive or drives for the array of smaller drives. Data from the smaller drives is stacked in stripes sequentially across the larger drive(s). Alternatively, data from the smaller drives may be striped in zones across the larger drive(s). In another embodiment, the asymmetric nature of the mirroring technique of the present invention can be used in reverse, wherein an array of smaller-capacity drives serves as the mirror for one or more larger-capacity drives.
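The two layouts may be pictured as address mappings from a block on a smaller drive to a block on the larger mirroring drive. The following is a minimal sketch under stated assumptions, not the mapping of any particular embodiment: the function names, the block-granular addressing, and the reading of "striped in zones" as a round-robin interleave of fixed-size stripes are all hypothetical.

```python
def stacked_offset(drive, block, blocks_per_drive):
    """Stacked layout: each small drive's image fills one contiguous region."""
    if not 0 <= block < blocks_per_drive:
        raise ValueError("block lies outside the small drive")
    return drive * blocks_per_drive + block


def zoned_offset(drive, block, n_drives, stripe_blocks):
    """Zoned layout (one possible reading): stripes from the small drives
    are interleaved round-robin on the larger drive."""
    stripe, within = divmod(block, stripe_blocks)
    return (stripe * n_drives + drive) * stripe_blocks + within


# Hypothetical example: three small drives of 8 blocks each,
# mirrored onto one larger drive of at least 24 blocks.
N_DRIVES, BLOCKS_PER_DRIVE, STRIPE_BLOCKS = 3, 8, 4

stacked = [stacked_offset(d, b, BLOCKS_PER_DRIVE)
           for d in range(N_DRIVES) for b in range(BLOCKS_PER_DRIVE)]
zoned = [zoned_offset(d, b, N_DRIVES, STRIPE_BLOCKS)
         for d in range(N_DRIVES) for b in range(BLOCKS_PER_DRIVE)]
```

Under either mapping, every block of every smaller drive lands on a distinct block of the larger drive, so the mirror holds a direct image from which any one smaller drive can be restored.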
Because no parity calculations are required, the present invention increases performance both in normal and in fault conditions. In the case of a failure of a single drive, access to any of the remaining data drives is not required for data recovery, hence improving performance and reducing required resources for recovery. Also, the time to rebuild a failed data disk is reduced, minimizing the period of time in which the system is running under fault conditions. Multiple failures of the data drives do not impact the mirroring drive(s). Data is restored to a failed drive as a direct image and therefore does not need to be reconstituted from the remaining data drives. If third-party or active-drive capabilities exist, data can be restored without consuming host processor resources and bandwidth. If the larger drives serve as the mirroring drives, overhead is reduced from 100 percent for normal mirroring systems to some fraction of the total number of data drives. Alternatively, making the larger drive(s) the data drive(s) and the smaller drives the mirroring drives facilitates the breaking off of a portion of the database to give to a user. The drives do not have to be initialized in order to maintain data coherency. The present invention supports multiple simultaneous read/write operations.
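The overhead reduction can be made concrete by counting drives. The sketch below is illustrative only: the function names and the capacities are hypothetical, overhead is measured as the number of mirroring drives per data drive, and the mirror images are assumed to be free to span mirroring drives.

```python
def mirror_drives_needed(n_data, data_capacity, mirror_capacity):
    """Number of mirroring drives needed to hold images of all data drives."""
    total = n_data * data_capacity
    return -(-total // mirror_capacity)  # integer ceiling division


def drive_overhead(n_data, data_capacity, mirror_capacity):
    """Mirroring drives per data drive (1.0 means one mirror per data drive)."""
    return mirror_drives_needed(n_data, data_capacity, mirror_capacity) / n_data


# Conventional mirroring: four 9 GB data drives, each with an identical mirror.
conventional = drive_overhead(4, 9, 9)

# Asymmetric mirroring: the same four drives mirrored onto one 36 GB drive.
asymmetric = drive_overhead(4, 9, 36)
```

With these hypothetical figures, conventional mirroring needs four additional drives (an overhead of 1.0, or 100 percent), while the asymmetric arrangement needs one (an overhead of 0.25).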
Other features of the invention are described below.