The present invention relates in general to a storage apparatus system. More particularly, the present invention relates to a storage apparatus system which has redundancy in each configuration element thereof for increasing the availability of the storage apparatus system.
Storage systems using redundant disk arrays are well known, as evidenced by "A Case for Redundant Arrays of Inexpensive Disks (RAID)", by D. A. Patterson, et al., ACM SIGMOD Conference Proceedings, Chicago, Ill., Jun. 1, 1988, pp. 109 to 116 (Document 1).
A disk array is a disk system designed to increase performance and reliability. In order to enhance performance, a plurality of physical disk drives are arranged in a disk array so that they appear to an information processing unit as a single logical disk drive. In order to improve reliability, redundant data is stored in a separate disk drive so that data existing prior to the occurrence of a failure can be recovered.
In general, the unit in which data is read from and written to a disk drive is called a record. When a disk array is used, however, a record seen from an information processing unit as a read/write unit may have a data length different from that of a record actually stored in a disk drive. Hereafter, the former and the latter are referred to as a logical record and a physical record respectively. The following is a description of some techniques of arranging records which are proposed in Document 1.
According to a first technique of arranging records, a logical record, that is, a record seen from the information processing unit side, is stored in disk drives by dividing the logical record into m physical records where m ≥ 1. This technique is referred to hereafter as a division arrangement technique. By adopting the division arrangement technique, one logical record can be stored in m units of disk drives. Thus, the data transfer speed appears to the information processing unit to be m times faster than if the logical record were stored in a single disk drive.
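The division of a logical record into m physical records can be sketched as follows. This is a minimal illustration only: the function names `divide_record` and `join_records`, the equal-length split, and the zero-byte padding are assumptions made for the example and are not taken from Document 1.

```python
def divide_record(logical_record: bytes, m: int) -> list[bytes]:
    """Split one logical record into m equal-length physical records."""
    if m < 1:
        raise ValueError("m must be at least 1")
    # Ceiling division gives the length of each physical record.
    size = -(-len(logical_record) // m)
    # Zero-byte padding to a multiple of m is an assumption of this sketch.
    padded = logical_record.ljust(size * m, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(m)]


def join_records(physical_records: list[bytes], length: int) -> bytes:
    """Reassemble the logical record and drop the padding."""
    return b"".join(physical_records)[:length]


record = b"ABCDEFGHIJ"
parts = divide_record(record, 4)  # one physical record per disk drive
restored = join_records(parts, len(record))
```

Because each of the m physical records resides on a different disk drive, the m transfers can proceed in parallel, which is the source of the apparent m-fold transfer speed.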
Next, a method of creating redundant data according to the division arrangement technique is explained. According to the division arrangement technique, for m physical records resulting from the division of a logical record, n pieces of redundant data are created where n ≥ 1. Each piece of redundant data is stored in a disk drive as a physical record, so that the redundant data occupies a total of n physical records. In order to distinguish a physical record of data directly read or written by an information processing unit from a physical record of redundant data, the former and the latter are referred to hereafter as a data record and a parity record respectively. A group including m data records and n parity records is known as a parity group. In general, if there are n parity records in a parity group, data in the parity group can be recovered when failures occur in up to n units of disk drives. It should be noted that since a disk drive is normally used for storing a plurality of records, a disk drive contains records pertaining to a plurality of parity groups. Here, (m+n) units of disk drives containing a plurality of parity groups each including (m+n) records are referred to hereafter as a disk parity group. That is to say, a disk parity group is a set of disk drives sharing common redundant data.
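A parity group can be illustrated with the simplest case, n = 1, in which the single parity record is the bytewise exclusive OR of the m data records. The choice of exclusive OR, the equal record lengths, and the names `make_parity` and `recover` are assumptions of this sketch; the text above does not prescribe a particular redundancy code.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two records of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(data_records: List[bytes]) -> bytes:
    """Create the single parity record (n = 1) for m data records."""
    return reduce(xor_bytes, data_records)


def recover(parity_group: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one lost record (marked None) from the survivors."""
    missing = [i for i, r in enumerate(parity_group) if r is None]
    if len(missing) > 1:
        raise ValueError("n = 1 parity can recover only one lost record")
    group = list(parity_group)
    if missing:
        survivors = [r for r in group if r is not None]
        group[missing[0]] = reduce(xor_bytes, survivors)
    return group


data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]  # m = 3 data records
group = data + [make_parity(data)]              # parity group of m + n records
damaged = [group[0], None, group[2], group[3]]  # one disk drive has failed
repaired = recover(damaged)
```

With n = 1, a second simultaneous loss in the same parity group cannot be repaired, which matches the general statement that up to n failures are tolerated.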
According to a second technique of arranging records, one logical record, that is, a read/write unit seen from the information processing unit side, is stored in a disk drive as one physical record, that is, as one data record. The second technique is referred to hereafter as a non-division arrangement technique. Thus, one logical record is equivalent to one data record. Also in this case, since a physical record can be a data record or a parity record, a physical record is not necessarily equivalent to a logical record. That is to say, a logical record is always stored as a physical record, but a physical record is not always a logical record because it may be a parity record. The non-division arrangement technique offers the feature that read/write processing can be carried out individually on each of the disk drives constituting a disk array.
When the division arrangement technique is adopted, on the other hand, a plurality of disk drives must be occupied to execute a single read/write operation. As a result, adopting the non-division arrangement technique increases the number of read/write operations that can be executed concurrently in a disk array. Also in the case of the non-division arrangement technique, n parity records are created from m data records and stored in disk drives. In the case of the division arrangement technique, however, the set of data records pertaining to a parity group appears to an information processing unit as one logical record. In the case of the non-division arrangement technique, on the other hand, each data record appears to an information processing unit as a completely independent logical record.
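The difference in concurrency between the two techniques can be illustrated by counting how many service rounds a batch of read requests needs when no disk drive can serve two requests at once. The sketch below is a simplified model: parity drives are ignored, the modulo placement of records on drives is an assumption, and the names `drives_touched_by_read` and `rounds_needed` are hypothetical.

```python
def drives_touched_by_read(record_index: int, m: int, divided: bool) -> set[int]:
    """Return the data drives one logical-record read must occupy."""
    if divided:
        # Division arrangement: every read occupies all m data drives.
        return set(range(m))
    # Non-division arrangement: one drive per read (modulo placement assumed).
    return {record_index % m}


def rounds_needed(requests: list[int], m: int, divided: bool) -> int:
    """Greedily pack reads into rounds in which no drive is used twice."""
    rounds: list[set[int]] = []  # each entry is the set of busy drives
    for record_index in requests:
        drives = drives_touched_by_read(record_index, m, divided)
        for busy in rounds:
            if busy.isdisjoint(drives):
                busy |= drives  # this read fits into an existing round
                break
        else:
            rounds.append(set(drives))  # no round had room; open a new one
    return len(rounds)


requests = [0, 1, 2, 3]
divided_rounds = rounds_needed(requests, m=4, divided=True)
independent_rounds = rounds_needed(requests, m=4, divided=False)
```

Under these assumptions the four reads are served one after another with the division arrangement, but in a single round with the non-division arrangement, provided they fall on different disk drives.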
U.S. Pat. No. 4,914,656 (Document 2) discloses a technology in which a spare disk is provided in a disk array. In a disk array, data can be recovered by using redundant data in the event of a failure occurring in a disk drive. However, a new disk drive is required for storing the recovered data. A spare disk is an unused disk drive provided in a disk array in advance. Thus, data recovered by using redundant data in the event of a failure occurring in a disk drive can be stored immediately in the spare disk.
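Recovery onto a spare disk can be sketched at the level of whole disk drives as follows. The sketch assumes a single exclusive-OR parity record per parity group (n = 1) and models each disk drive as a list holding one record per parity group; this layout and the name `rebuild_to_spare` are illustrative assumptions, not the method of Document 2.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two records of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def rebuild_to_spare(drives: list, failed: int) -> list:
    """Regenerate every record of the failed drive from the surviving drives.

    drives[d][g] is the record of parity group g held on drive d;
    drives[failed] is never read.
    """
    survivors = [d for i, d in enumerate(drives) if i != failed]
    group_count = len(survivors[0])
    spare = []
    for g in range(group_count):
        # XOR of all surviving records of group g yields the lost record.
        spare.append(reduce(xor_bytes, (d[g] for d in survivors)))
    return spare


# Three data drives and one parity drive, two parity groups each.
data = [[b"\x01", b"\x02"], [b"\x03", b"\x04"], [b"\x05", b"\x06"]]
parity = [reduce(xor_bytes, (d[g] for d in data)) for g in range(2)]
drives = data + [parity]

lost = drives[2]
drives[2] = None                       # drive 2 fails
spare = rebuild_to_spare(drives, failed=2)
```

Because the spare is already installed, this rebuild can start as soon as the failure is detected, without waiting for a replacement drive.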
In many computer systems, storage apparatuses other than the disk drive are employed. Examples of such storage apparatuses are a magnetic tape and an optical storage device. Recently, much attention has been given to the DVD (Digital Versatile Disk). A feature of these storage apparatuses is that, in each case, the storage medium is a component separate from the R/W (Read/Write) unit. Data is read from or written to a storage medium which is mounted on the R/W unit. In general, such a storage medium is known as a portable medium. In order to control a very large number of portable storage media with ease in a large-scale computer system, the concept of a library is introduced. A library usually includes not only a large number of storage media and an R/W unit, but also equipment such as a robot for transporting a storage medium back and forth between the R/W unit and a rack accommodating the storage media.
Since the amount of data handled in a computer system is becoming larger and larger with time, the need to enhance the availability of the data is also extremely high. High data availability can therefore be realized by applying a concept like the one proposed in Document 1 to a storage apparatus system including portable storage media as described above.
The application of such a concept to portable storage media is disclosed, for example, in "DVD Applications," by A. E. Bell of IBM Research Division, Comdex 96, Nov. 20, 1996 (Document 3). Document 3 proposes RAIL (Redundant Arrays of Inexpensive Libraries), which has redundancy and includes a plurality of ordinary libraries each composed of DVDs, an R/W unit and a robot.
As disclosed in Document 2, in the event of a failure occurring in a disk drive of a disk array, data recovery processing is performed immediately: recovered data is generated from the redundant data and stored in a spare disk. The recovery is carried out immediately, using the spare disk, so that the disk array does not operate for a long period with degraded redundancy.
Thereafter, a person in charge of the storage apparatus system, such as a maintenance technician, replaces the failed disk with a new disk.
In order to make the operation of the disk array simple, it is desirable to have a capability for easily recognizing a set of disk drives pertaining to the same disk parity group, that is, a capability of easily identifying a set of disk drives sharing common redundant data. This is because if, for example, more disk drives than the number indicated by the degree of redundancy are removed by mistake from the set of disk drives pertaining to the same disk parity group, the data stored in the disk parity group can no longer be accessed.
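The condition described above can be expressed as a simple check: a removal is unsafe for any disk parity group that loses more disk drives than its degree of redundancy n. The drive names, the group labels, and the function `unsafe_groups` below are hypothetical and serve only to illustrate the rule.

```python
from collections import Counter


def unsafe_groups(membership: dict[str, str], removed: list[str], n: int) -> list[str]:
    """Return the disk parity groups that would lose more than n drives."""
    # Count how many of the removed drives belong to each disk parity group.
    removed_per_group = Counter(membership[drive] for drive in removed)
    return sorted(g for g, count in removed_per_group.items() if count > n)


# Two disk parity groups, A and B, each with degree of redundancy n = 1.
membership = {
    "d0": "A", "d1": "A", "d2": "A",
    "d3": "B", "d4": "B", "d5": "B",
}
spread_removal = unsafe_groups(membership, ["d0", "d3"], n=1)
same_group_removal = unsafe_groups(membership, ["d0", "d1"], n=1)
```

Removing one drive from each group stays within the redundancy of both groups, whereas removing two drives from group A exceeds it. Such a check is only possible when the membership of each disk parity group is easy to identify.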
In order to avoid the above-described problem, it is necessary to carry out a copy-back operation for restoring the data stored in the spare disk to the new disk drive used as a replacement for the one incurring the failure. Since the copy-back operation requires the entire contents of the disk to be transferred, the amount of data involved can be very large, and the copy-back operation can therefore be very time consuming. A similar problem is also encountered in a storage apparatus system based on portable storage media.
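The cost of a copy-back operation can be estimated as the drive capacity divided by the sustained transfer rate. The figures used below are purely hypothetical and are not taken from this document; they serve only to show the order of magnitude involved.

```python
def copy_back_seconds(capacity_bytes: int, rate_bytes_per_second: int) -> float:
    """Lower bound on copy-back time: the whole drive must be transferred."""
    return capacity_bytes / rate_bytes_per_second


# Hypothetical figures: a 1 TB drive copied at a sustained 100 MB/s.
seconds = copy_back_seconds(10**12, 10**8)
hours = seconds / 3600
```

Under these assumed figures the transfer alone takes 10,000 seconds, that is, nearly three hours, before any contention with ordinary read/write processing is taken into account.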