1. Field of the Invention
The present invention relates to an information storage system and, more particularly, to an information storage system including a plurality of information storage devices each of which stores a part of information so that the entire information is stored as a whole.
2. Description of the Related Art
In recent years, demand for high-speed transfer of large amounts of sound or image data has increased with the rapid growth of internet communications via personal computers. However, peripheral systems, such as networks for smoothly transferring and storing a large amount of data, have not been sufficiently developed. Additionally, network communication protocol technology for transmitting such a large amount of data in real time has not been sufficiently developed. Thus, it is desirable to urgently develop a system that can transfer a large amount of data at high speed by using existing techniques.
A system referred to as a redundant array of inexpensive disks (RAID) has been developed for retrieving and reproducing image data from an information storage apparatus having a high-speed transfer function, for use in applications such as video on demand (VOD) or non-linear editing in video image processing. The RAID system was introduced in the paper titled "A Case for Redundant Arrays of Inexpensive Disks" written by David A. Patterson, Garth Gibson and Randy H. Katz, which was published in 1987 by the University of California. The RAID system uses an array of a plurality of hard disk drives so as to store a set of data by dividing the data into a plurality of data blocks. The data blocks are stored concurrently in the hard disk drives. Thereby, the RAID system can achieve a high-speed data transfer.
A description will now be given of the RAID system in detail. FIG. 1 is an illustration for explaining a system structure of the RAID system.
The RAID system 1 comprises a host computer 2 and a plurality of external information-storage devices 3-1 to 3-n.
When data is stored by the RAID system, there is a method in which the same contents are written in the plurality of external information-storage devices. Additionally, there is a method in which a set of data to be stored is divided into a plurality of data blocks (referred to as stripping), and data blocks are stored in the external information-storage devices on an individual data block basis. In this method, an error correction code is produced from the plurality of data blocks and stored in the external information-storage devices so as to improve data reliability. Accordingly, if one of the data blocks is lost, correct data can be restored according to the remaining data blocks and the error correction code.
It should be noted that there are six types of the RAID system, RAID-0 to RAID-5, which differ in the method of dividing a set of data and in the method of managing the error correction code.
A description will now be given of RAID-0. In RAID-0, a set of data is divided or stripped into a plurality of data blocks. The data blocks are stored in different external information-storage devices, respectively. It should be noted that since the error correction code is not produced in RAID-0, there is no fault tolerance.
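The division performed in RAID-0 can be illustrated by the following minimal sketch. It assumes fixed-size blocks dealt to the devices in round-robin order; the function names `stripe` and `unstripe`, the block size and the in-memory lists standing in for the external information-storage devices are all hypothetical and are not taken from the description above.

```python
def stripe(data: bytes, num_devices: int, block_size: int):
    """Split data into fixed-size blocks and deal them round-robin to devices."""
    devices = [[] for _ in range(num_devices)]
    for start in range(0, len(data), block_size):
        block_index = start // block_size
        devices[block_index % num_devices].append(data[start:start + block_size])
    return devices


def unstripe(devices):
    """Rebuild the original data by reading the blocks back in round-robin order."""
    out = []
    depth = max(len(dev) for dev in devices)
    for row in range(depth):
        for dev in devices:
            if row < len(dev):      # the last row may not reach every device
                out.append(dev[row])
    return b"".join(out)


devices = stripe(b"ABCDEFGHIJ", num_devices=3, block_size=2)
restored = unstripe(devices)
```

Because no error correction code is produced in this sketch, the loss of any one of the lists makes the data unrecoverable, which corresponds to the absence of fault tolerance noted for RAID-0.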
In RAID-1, two external information-storage devices are provided. A set of data is stored in one of the external information-storage devices, and the same contents are stored in the other. Accordingly, if one of the external information-storage devices fails and input and output operations cannot be performed, an operation with respect to the set of data can be continued by using the other one of the external information-storage devices.
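The mirroring behavior of RAID-1 may be sketched as follows. This is only an illustration under stated assumptions: the class name `MirroredPair`, the key-based addressing and the `failed` flags are hypothetical, and each device is modeled as an in-memory dictionary.

```python
class MirroredPair:
    """Two devices that hold the same contents (illustrative model only)."""

    def __init__(self):
        self.devices = [{}, {}]
        self.failed = [False, False]

    def write(self, key, data):
        # The same contents are written to every device that is still working.
        for index, device in enumerate(self.devices):
            if not self.failed[index]:
                device[key] = data

    def read(self, key):
        # Read from the first working device; fall back to the other one.
        for index, device in enumerate(self.devices):
            if not self.failed[index]:
                return device[key]
        raise IOError("both devices have failed")


pair = MirroredPair()
pair.write("A", b"A1A2A3A4")
pair.failed[0] = True            # simulate a failure of the first device
survivor_copy = pair.read("A")   # the operation continues on the other device
```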
In RAID-2, a set of data is stripped on an individual bit basis, and a Hamming code is produced for the stripped data as an error correction code. The error correction code is dispersed and stored in a plurality of external information-storage devices.
In RAID-3, a set of data is stripped on an individual bit or byte basis. Parity is used as an error correction code. The parity is stored in an external information-storage device that is assigned to exclusively store the error correction code.
In RAID-4, a set of data is stripped on a plurality of bytes basis. Parity is used as an error correction code. The parity is stored in an external information-storage device that is assigned to exclusively store the error correction code.
In RAID-5, a set of data is stripped on a plurality of bytes basis. Parity is used as an error correction code. Parities are dispersed and stored in a plurality of external information-storage devices.
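The use of parity as an error correction code in RAID-3 to RAID-5 can be illustrated by the sketch below. It assumes that the parity is the byte-wise exclusive OR of equally sized data blocks, as described for RAID-5 in this specification; the function names `xor_parity` and `rebuild_missing` and the sample block contents are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise exclusive OR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single lost block from the surviving blocks and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


blocks = [b"A1A2", b"A3A4", b"A5A6"]
parity = xor_parity(blocks)

# Suppose the device holding the second block fails.
recovered = rebuild_missing([blocks[0], blocks[2]], parity)
```

The recovery works because the exclusive OR of a value with itself is zero, so combining the parity with all surviving blocks leaves only the lost block. Only one lost block per parity group can be recovered in this way.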
Generally, RAID-0, RAID-1 and RAID-5 are used from among the above-mentioned RAID-0 to RAID-5, in consideration of a data transfer speed and an overhead necessary for various processes.
A detailed description will now be given of RAID-5, which is frequently used. FIG. 2 is an illustration for explaining a process performed in RAID-5. FIG. 2-(A) indicates a string of input data; and FIG. 2-(B) indicates a plurality of data blocks produced by stripping the input data and storing the stripped input data in the plurality of external information-storage devices.
In RAID-5, a data block obtained by stripping a set of data on a several-byte basis and a corresponding error correction code are dispersed and stored in a plurality of external information-storage devices. For example, when the input data string A1 to A28 shown in FIG. 2-(A) is input, a plurality of single data blocks are formed by dividing the input data string by each predetermined number of sets of individual data. Accordingly, the set of input data is divided into data blocks BL1 to BL7.
The data blocks BL1 to BL7 are stored in a plurality of external information-storage devices 3-1 to 3-4. At this time, an exclusive OR operation is performed with respect to the data blocks BL1 to BL7 so as to produce parities P1 to P3. The parities P1 to P3 are dispersed and stored in the external information-storage devices 3-1 to 3-4 so that parities are not concentrated in a particular one of the external information-storage devices.
When the set of data A1 to A28 is read, the data blocks BL1 to BL7 and the parities P1 to P3 are read concurrently so that the original set of data can be restored in a short time. That is, a data transfer speed can be increased by concurrently accessing the external information-storage devices 3-1 to 3-4 so as to concurrently read a plurality of data blocks and parities.
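The dispersed arrangement of the data blocks BL1 to BL7 and the parities P1 to P3 over four devices can be sketched as follows. The rotation rule used here (the parity position moves one device to the left on each successive stripe) is an assumption chosen for illustration; the actual placement shown in FIG. 2-(B) may differ. The function name `raid5_layout` is hypothetical.

```python
def raid5_layout(num_blocks, num_devices):
    """Return, for each device, the labels of the blocks and parities it holds."""
    data_per_stripe = num_devices - 1      # one slot per stripe holds the parity
    devices = [[] for _ in range(num_devices)]
    labels = [f"BL{k + 1}" for k in range(num_blocks)]
    for stripe, start in enumerate(range(0, num_blocks, data_per_stripe)):
        group = iter(labels[start:start + data_per_stripe])
        # Assumed rotation: the parity moves one device to the left per stripe.
        parity_device = (num_devices - 1 - stripe) % num_devices
        for dev in range(num_devices):
            if dev == parity_device:
                devices[dev].append(f"P{stripe + 1}")
            else:
                label = next(group, None)
                if label is not None:      # the last stripe may be short
                    devices[dev].append(label)
    return devices


layout = raid5_layout(num_blocks=7, num_devices=4)
for number, contents in enumerate(layout, start=1):
    print(f"device 3-{number}: {contents}")
```

Under this assumed rotation, seven data blocks on four devices yield three stripes and therefore three parities, which agrees with the parities P1 to P3 mentioned above, and no single device holds all of the parities.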
In order to establish the RAID system, one of a hardware method and a software method can be selected. When the hardware method is selected, the calculation of parities, the stripping of the set of data and the restoration of the original set of data are performed by an exclusive circuit such as a RAID controller. According to the hardware method, a high-speed processing can be achieved but a complex operation is required for introducing new hardware and such hardware is expensive. When the software method is selected, new hardware is not required and an installation cost is low. However, the software method has a lower performance than the hardware method.
The RAID system has, on one hand, an advantage that the improvements in data transfer speed and reliability can be achieved but, on the other hand, there is a disadvantage that a storage capacity and a method of RAID cannot be easily changed once the system is put into practical use. For example, in order to increase a storage capacity, it is required to perform formatting of recording media, backup of existing data and restoration of data. This causes a stop of service or a decrease in the data transfer speed during an operation for increasing the storage capacity.
FIG. 3 is a flowchart of an operation for changing a structure of a conventional RAID system.
When a structure of the RAID system is changed, data stored in all of the external information-storage devices of the RAID system is read in step S1-1, and the read data (backup data) is stored in an information storage apparatus other than the RAID system. In step S1-2, the operation of the RAID system is stopped. Thereafter, in step S1-3, a desired change of a structure of the RAID system such as addition or deletion of external information-storage devices is performed. Then, in step S1-4, the structure of the RAID system is recognized. In step S1-5, formatting is performed in accordance with the recognized structure of the RAID system. In step S1-6, the backup data stored in step S1-1 is returned to the RAID system by performing stripping and calculations of parities in accordance with the newly set format. Then, in step S1-7, an operation of the RAID system is started, and the routine is ended.
As mentioned above, in order to change the system structure of the conventional RAID system, a stop of an operation of the system, backup of data stored in the RAID system and formatting of the RAID system must be performed, which requires time and labor.
In order to eliminate the above-mentioned problem, a method for increasing a storage capacity without stopping system operations or decreasing a data transfer speed has been suggested. In the suggested method, stripping information such as information regarding a unit of a data block, a number of data blocks or stored addresses is stored together with the data blocks so that different sets of data can be stored in the same RAID system according to different stripping methods.
Specifically, the host computer is connected to the external information-storage devices via an interface that can detect connection of an apparatus without turning off the power. When a write request is made, data to be stored is stripped and data blocks and stripping information are stored in the external information-storage devices. If additional external information-storage devices are added so as to increase a storage capacity, the data to be written is stored with the stripping information in accordance with a new structure of the RAID system. When the data stored in the RAID system is read, the stripping information is retrieved first. Then, the data blocks are read in accordance with the retrieved stripping information so as to restore the original set of data.
According to the above-mentioned method, a stripping method corresponding to a present structure of the RAID system can be used. Thus, the RAID system can be restructured without formatting or backup of data.
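The suggested method can be illustrated by the following sketch, in which each set of data is written according to the structure present at the time of writing and is read back by first retrieving its stripping information. The class name `StripedStore` and the fields `block_size`, `num_devices` and `locations` are hypothetical. For brevity the stripping information is kept in a separate dictionary rather than on a device together with a data block, and parity is omitted.

```python
class StripedStore:
    """Stores each data set with the stripping information used to write it."""

    def __init__(self, num_devices):
        self.devices = [[] for _ in range(num_devices)]
        self.stripping_info = {}   # assumption: kept off-device for brevity

    def add_device(self):
        # Models a device being connected without stopping the system.
        self.devices.append([])

    def write(self, name, data, block_size):
        n = len(self.devices)      # the structure at the time of writing
        locations = []
        for k, start in enumerate(range(0, len(data), block_size)):
            dev = k % n
            self.devices[dev].append(data[start:start + block_size])
            locations.append((dev, len(self.devices[dev]) - 1))
        self.stripping_info[name] = {
            "block_size": block_size,
            "num_devices": n,
            "locations": locations,
        }

    def read(self, name):
        info = self.stripping_info[name]   # retrieved first, as in the text
        return b"".join(self.devices[dev][slot] for dev, slot in info["locations"])


store = StripedStore(num_devices=3)
store.write("A", b"A1A2A3A4A5A6", block_size=4)
store.add_device()
store.write("D", b"D1D2D3D4D5D6D7D8", block_size=4)
```

Both data sets remain readable after the device is added, because each read follows the stripping information recorded for that particular data set rather than the present structure.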
A description will now be given of the above-mentioned RAID system in detail.
FIG. 4 is an illustration for explaining a state, before a change is made, of the RAID system whose structure is changeable without stopping an operation. FIG. 4-(A) indicates an input data string before a change is made; FIG. 4-(B) indicates a state of data stored in the external information-storage devices 3-1 to 3-3 before the change is made. FIG. 5 is an illustration for explaining a state of the RAID system shown in FIG. 4 after the change is made. FIG. 5-(A) indicates an input data string after the change is made; FIG. 5-(B) indicates a state of data stored in the external information-storage devices 3-1 to 3-4 after the change is made.
In the RAID system shown in FIG. 4, the external information-storage devices 3-1 to 3-3 are provided. When the data string A1-A8, B1-B8 and C1-C8 is supplied to the RAID system as shown in FIG. 4-(A), the data A1-A8 is stripped into data blocks BL1 and BL2. The data block BL1 includes data A1 to A4, and the data block BL2 includes data A5 to A8. At this time, a parity 1 is produced for performing an error correction and stripping information S1 representing a method for stripping the data A1-A8 is also produced. The data blocks BL1 and BL2, the parity 1 and the stripping information S1 are dispersed and stored in the external information-storage devices 3-1 to 3-3. That is, the data block BL1 is stored in the external information-storage device 3-1 together with the stripping information S1. The data block BL2 is stored in the external information-storage device 3-2. The parity 1 is stored in the external information-storage device 3-3.
The data B1-B8 is stripped into data blocks BL3 and BL4. The data block BL3 includes data B1 to B4, and the data block BL4 includes data B5 to B8. At this time, a parity 2 is produced for performing an error correction and stripping information S2 representing a method for stripping the data B1-B8 is also produced. The data blocks BL3 and BL4, the parity 2 and the stripping information S2 are dispersed and stored in the external information-storage devices 3-1 to 3-3. That is, the data block BL3 is stored in the external information-storage device 3-1 together with the stripping information S2. The data block BL4 is stored in the external information-storage device 3-3. The parity 2 is stored in the external information-storage device 3-2.
The data C1-C8 is stripped into data blocks BL5 and BL6. The data block BL5 includes data C1 to C4, and the data block BL6 includes data C5 to C8. At this time, a parity 3 is produced for performing an error correction and stripping information S3 representing a method for stripping the data C1-C8 is also produced. The data blocks BL5 and BL6, the parity 3 and the stripping information S3 are dispersed and stored in the external information-storage devices 3-1 to 3-3. That is, the data block BL5 is stored in the external information-storage device 3-2 together with the stripping information S3. The data block BL6 is stored in the external information-storage device 3-3. The parity 3 is stored in the external information-storage device 3-1.
If the additional external information-storage device 3-4 is added to the RAID system as shown in FIG. 5, the system detects the addition of the new external information-storage device 3-4, and data subsequently input to the changed RAID system is stored in the four external information-storage devices 3-1 to 3-4.
When a set of data D1-D8, E1-E8 and F1-F8 is supplied to the RAID system as shown in FIG. 5-(A), the data D1-D8 is stripped into data blocks BL7 and BL8. The data block BL7 includes data D1 to D4, and the data block BL8 includes data D5 to D8. The data E1-E8 is stripped into data blocks BL9 and BL10. The data block BL9 includes data E1 to E4, and the data block BL10 includes data E5 to E8. The data F1-F8 is stripped into data blocks BL11 and BL12. The data block BL11 includes data F1 to F4, and the data block BL12 includes data F5 to F8. At this time, a parity 4 is produced for performing an error correction with respect to the data D1 to D8 and E1 to E4, and a parity 5 is produced for performing an error correction with respect to the data E5 to E8 and F1 to F8. Additionally, stripping information S4 representing a method for stripping the data D1-D8 and the data E1 to E4 is produced, and stripping information S5 representing a method for stripping the data E5-E8 and data F1 to F8 is produced.
The data blocks BL7, BL8 and BL9, the parity 4 and the stripping information S4 are dispersed and stored in the external information-storage devices 3-1 to 3-4. That is, the data block BL7 is stored in the external information-storage device 3-1 together with the stripping information S4. The data block BL8 is stored in the external information-storage device 3-2. The data block BL9 is stored in the external information-storage device 3-3. The parity 4 is stored in the external information-storage device 3-4.
Additionally, the data blocks BL10, BL11 and BL12, the parity 5 and the stripping information S5 are dispersed and stored in the external information-storage devices 3-1 to 3-4. That is, the data block BL10 is stored in the external information-storage device 3-1 together with the stripping information S5. The data block BL11 is stored in the external information-storage device 3-2. The data block BL12 is stored in the external information-storage device 3-4. The parity 5 is stored in the external information-storage device 3-3.
In the method described with reference to FIGS. 4 and 5, an amount of data stored in the added external information-storage device 3-4 is not equal to an amount of data stored in each of the external information-storage devices 3-1 to 3-3.
FIG. 6 is an illustration for explaining a problem that occurs when a structure of the RAID system is changed.
Suppose that the external information-storage device 3-4 is added after data blocks DB11, DB12, DB21, DB22, DB31, DB32, DB41, DB42, DB51, DB52, DB61, DB62, DB71, DB72, DB81 and DB82 are stored in the external information-storage devices 3-1 to 3-3 together with parities P1 to P8. In this state, when new data blocks DB91, DB92, DB93, DB01, DB02 and DB03 are supplied to the RAID system, the data blocks DB91 and DB01 are stored in the external information-storage device 3-1; the data blocks DB92 and DB02 are stored in the external information-storage device 3-2; the data blocks DB93 and DB03 are stored in the external information-storage device 3-3; and a parity P9 with respect to the data blocks DB91, DB92 and DB93 and a parity P10 with respect to data blocks DB01, DB02 and DB03 are stored in the external information-storage device 3-4.
Accordingly, the newly added external information-storage device 3-4 stores only the data blocks and parities that are supplied after the device is added. On the other hand, each of the external information-storage devices 3-1 to 3-3 stores the data blocks that are supplied both before and after the external information-storage device 3-4 is added. As a result, an amount of data stored in the external information-storage device 3-4 is less than an amount of data stored in each of the external information-storage devices 3-1 to 3-3. That is, there is a deviation in an amount of data stored in the external information-storage devices. Accordingly, input and output operations cannot be dispersed evenly to each of the external information-storage devices. This causes deterioration in efficiency of storage devices and interfaces. Thus, the maximum obtainable data-transfer speed cannot be achieved.
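The deviation described with reference to FIG. 6 can be checked with a short count. The sketch assumes that every stripe places exactly one data block or parity on each device present at the time of writing, which matches the arrangement described above; the function name `blocks_per_device` is hypothetical.

```python
def blocks_per_device(phases):
    """Count stored blocks per device.

    phases is a list of (num_devices, num_stripes) pairs in chronological
    order; each stripe places one block or parity on every device present.
    """
    total_devices = max(n for n, _ in phases)
    counts = [0] * total_devices
    for num_devices, num_stripes in phases:
        for dev in range(num_devices):
            counts[dev] += num_stripes
    return counts


# Eight stripes (16 data blocks and parities P1 to P8) on three devices,
# then two stripes (DB91 to DB03 and parities P9, P10) on four devices.
counts = blocks_per_device([(3, 8), (4, 2)])
```

In this example the devices 3-1 to 3-3 each hold ten blocks while the device 3-4 holds two, so a read of the older data engages only three of the four devices.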
Additionally, in the RAID system, it may be desirable to decrease the number of external information-storage devices so as to decrease a management cost such as a cost for power supply (power consumption). However, the above-mentioned conventional RAID system does not have a function to store again the data, which was stored before a change is made, in accordance with the data structure after the change of the system structure. Thus, in the conventional RAID system, the external information-storage devices cannot be removed.
Further, if an amount of data stored in one of the external information-storage devices reaches its maximum storage capacity, the RAID system cannot store any new data even if the external information-storage devices other than the one which stores the maximum data amount can still store additional data.
A description will now be given of another problem in the conventional RAID system. FIG. 7 is an illustration for explaining the problem which may occur in the conventional RAID system.
Suppose that the RAID system shown in FIG. 7 is constructed by sequentially adding the external information-storage devices 3-3, 3-4 and 3-5, in that order, to the RAID system provided with the external information-storage devices 3-1 and 3-2. When data blocks D11, D21, D31 and D41 are supplied to the RAID system in a state in which only the external information-storage devices 3-1 and 3-2 are provided, the data blocks D11 and D21 and parities P3 and P4 with respect to the data blocks D31 and D41 are stored in the external information-storage device 3-1, and the data blocks D31 and D41 and parities P1 and P2 with respect to the data blocks D11 and D21 are stored in the external information-storage device 3-2.
Thereafter, when data blocks D51, D52, D61, D62, D71, D72, D81 and D82 are supplied to the RAID system after the external information-storage device 3-3 is added to the RAID system, the data blocks D51, D61, D71 and D81 are stored in the external information-storage device 3-1. The data blocks D52 and D62 and parities P7 and P8 with respect to the data blocks D71, D72, D81 and D82 are stored in the external information-storage device 3-2. The data blocks D72 and D82 and parities P5 and P6 with respect to the data blocks D51, D52, D61 and D62 are stored in the external information-storage device 3-3.
When the external information-storage device 3-4 is added and data blocks D91, D92, D93, D01, D02 and D03 are supplied to the RAID system, the data blocks D91 and D01 are stored in the external information-storage device 3-1, the data blocks D92 and D02 are stored in the external information-storage device 3-2 and the data blocks D93 and D03 are stored in the external information-storage device 3-3. Additionally, parities P9 and P0 with respect to the data blocks D91, D92, D93, D01, D02 and D03 are stored in the external information-storage device 3-4.
If the external information storage devices 3-1 and 3-2 become full after the data blocks D91, D92, D93, D01, D02 and D03 are stored, data blocks cannot be stored any more in the RAID system even if the external information-storage device 3-5 is added. Accordingly, as indicated by dotted circles in FIG. 7, unusable empty storage areas remain in the external information-storage devices 3-3, 3-4 and 3-5.
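The unusable empty storage areas of FIG. 7 can be quantified with the following sketch. It assumes that every new stripe requires one free slot on each device and that each device has a capacity of ten block slots; the capacity value and the function name `stranded_slots` are hypothetical and chosen only so that the devices 3-1 and 3-2 are full after the writes described above.

```python
def stranded_slots(capacity, used):
    """Return (writable_stripes, stranded) for devices of equal capacity.

    A stripe needs one free slot on every device, so the fullest device
    limits how many more stripes can be written.
    """
    free = [capacity - u for u in used]
    writable_stripes = min(free)
    stranded = sum(free) - writable_stripes * len(free)
    return writable_stripes, stranded


# Slots used on devices 3-1 to 3-5 after the writes described for FIG. 7:
# 3-1 and 3-2 hold ten items each, 3-3 holds six, 3-4 holds two, 3-5 is empty.
writable, stranded = stranded_slots(capacity=10, used=[10, 10, 6, 2, 0])
```

Under these assumptions no further stripe can be written, and 22 slots on the devices 3-3, 3-4 and 3-5 remain empty but unusable.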
It is a general object of the present invention to provide an improved and useful information storage system in which the above-mentioned problems are eliminated.
A more specific object of the present invention is to provide an information-storage device which can store data evenly in each of a plurality of external information-storage devices even if the number of external information-storage devices provided in the information storage system is changed.
In order to achieve the above-mentioned object, there is provided according to one aspect of the present invention an information storage system for storing information, comprising:
a plurality of information storage devices;
a division processing unit dividing the information to be stored in the information storage devices and distributing the divided information to the information storage devices; and
a division control unit controlling the division processing unit so as to divide the information in accordance with a number of the information storage devices,
wherein, when the number of the information storage devices is changed, the division control unit controls the division processing unit so that the information stored in the information storage devices is divided by a dividing method suitable for the number of the information storage devices after being changed.
According to the present invention, when a structure of the information storage devices is changed, that is, when the number of the information storage devices provided in the information storage system is increased or decreased, the previously stored information is divided in accordance with the new structure of the information storage devices. Accordingly, the information previously stored in the information storage devices can be automatically divided according to the dividing method suitable for the new structure and redistributed to the information storage devices of an increased or decreased number. Additionally, the information stored after the structure is changed can be divided by the dividing method suitable for the new structure of the information storage devices. Thus, a storage area of each of the information storage devices can be efficiently used.
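The effect of dividing the previously stored information again after the number of devices changes can be illustrated by the sketch below. It is not the claimed implementation: it assumes a simple round-robin dividing method, omits parity, and gathers all blocks in memory before redistributing them. The function names `distribute`, `collect` and `redistribute` are hypothetical.

```python
def distribute(blocks, num_devices):
    """Deal blocks round-robin to the given number of devices."""
    devices = [[] for _ in range(num_devices)]
    for k, block in enumerate(blocks):
        devices[k % num_devices].append(block)
    return devices


def collect(devices):
    """Read the blocks back in their original logical order."""
    out = []
    depth = max((len(dev) for dev in devices), default=0)
    for row in range(depth):
        for dev in devices:
            if row < len(dev):
                out.append(dev[row])
    return out


def redistribute(devices, new_count):
    """Divide the stored blocks again for the new number of devices."""
    return distribute(collect(devices), new_count)


blocks = [f"B{k}" for k in range(1, 13)]
three = distribute(blocks, 3)        # four blocks on each of three devices
four = redistribute(three, 4)        # a device is added: three blocks each
two = redistribute(four, 2)          # devices are removed: six blocks each
```

After each change the blocks are spread evenly over the devices then present, and the logical order of the data is preserved, which is the property that the conventional systems described above lack.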
Additionally, there is provided according to another aspect of the present invention a processor readable medium storing program code for causing a computer to store information in a plurality of information storage devices in accordance with the dividing method mentioned above.
Other objects, features and advantages of the present invention will become more apparent from the following detailed description when read in conjunction with the accompanying drawings.