The present invention relates to a memory that performs access (read/write) in parallel on a plurality of independent storage units treated as a set, and more particularly to a data reconstruction system, and a method used therein, for use when a failure occurs.
The technology for controlling discs arranged in parallel is disclosed in Japanese Kokai 1-250128 corresponding to U.S. patent application Ser. No. 07/118,785 filed on Nov. 6, 1987, now U.S. Pat. No. 4,870,643, and Japanese Kokai 2-135555.
As a technology for achieving a large memory capacity and high-speed data transfer, there is known a method in which a plurality of storage units are treated as a set, the data is divided into a plurality of pieces in bit units, byte units or arbitrary units, and the pieces are stored in the respective storage units; when the data is to be read out, the pieces are read out simultaneously from the respective storage units. Moreover, in this method, data to be used for a parity check is produced from the data divided among the storage units and is stored in another storage unit. When a failure occurs in any of the storage units, the data stored in the remaining normal storage units and the data for the parity check are used to reconstruct the faulty data, thereby improving the reliability of the memory.
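The division of data among the storage units and the reconstruction from parity described above can be illustrated by the following minimal sketch. It assumes byte-unit division and a single parity unit computed by exclusive OR; the function names `xor_blocks`, `stripe` and `reconstruct` are hypothetical and are not taken from the cited references.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise exclusive OR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe(data, n_units):
    """Divide data in byte units among n_units storage units and append
    one parity unit (assumption: single XOR parity, zero padding)."""
    padded = data + bytes(-len(data) % n_units)
    units = [padded[i::n_units] for i in range(n_units)]
    return units + [xor_blocks(units)]


def reconstruct(units, failed):
    """Rebuild the contents of the unit at index `failed` from the
    remaining normal units, including the parity unit."""
    survivors = [u for i, u in enumerate(units) if i != failed]
    return xor_blocks(survivors)


units = stripe(b"ABCDEFGH", 4)    # four data units plus one parity unit
recovered = reconstruct(units, 2)  # treat data unit 2 as faulty
```

Because the exclusive OR of all units, parity included, is zero, the contents of any single faulty unit equal the exclusive OR of the remaining units; this sketch does not cover the failure of two or more units.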
Further, there is known a technology in which, when a failure occurs in any of the storage units, not only is the data reconstructed for the normal read operation, but the data stored in the faulty storage unit is also reconstructed and stored in an additionally provided normal storage unit (a spare storage unit). With this technology, the reconstructed data is stored in the spare storage unit and is read out from the spare storage unit for subsequent accesses, whereby the availability of the memory can be improved.
The failure of a certain number of storage units can be repaired by providing the parity data, and the data can also be reconstructed by providing the spare storage unit. However, to repair the failure, it is necessary to read out all of the data stored in the normal storage units together with the data for the parity check, reconstruct the faulty data, and write the reconstructed data to the spare storage unit. Therefore, during the repair of the failure, the storage units are occupied, so that requests for normal access (read/write) issued from a host unit are kept waiting. This degrades the performance of the memory. As error check methods for reconstructing the faulty data, parity data, Reed-Solomon codes and error check code (ECC) methods are known.
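The cost of the repair operation described above can be seen in the following minimal sketch, which rebuilds a faulty unit block by block onto a spare and counts the block reads issued to the normal units. It assumes a single XOR parity unit; the function name `rebuild_to_spare` and the `block_size` parameter are hypothetical and serve only for illustration.

```python
def rebuild_to_spare(units, failed, block_size):
    """Rebuild the unit at index `failed` onto a spare, block by block.

    Returns (spare_contents, blocks_read). Assumes single XOR parity,
    so every remaining unit must be read in full.
    """
    spare = bytearray()
    blocks_read = 0
    # All units have the same length; take it from any normal unit.
    length = len(units[(failed + 1) % len(units)])
    for offset in range(0, length, block_size):
        block = bytearray(min(block_size, length - offset))
        for i, unit in enumerate(units):
            if i == failed:
                continue  # the faulty unit cannot be read
            chunk = unit[offset:offset + block_size]
            blocks_read += 1  # this read occupies a normal unit
            for j, value in enumerate(chunk):
                block[j] ^= value
        spare += block  # write the reconstructed block to the spare
    return bytes(spare), blocks_read


# Two data units and one parity unit (parity = XOR of the data units).
units = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\x11\x22\x33\x44"]
spare, blocks_read = rebuild_to_spare(units, failed=1, block_size=2)
```

In this sketch the number of block reads equals the number of blocks per unit multiplied by the number of normal units, which is why requests from the host unit are kept waiting if the repair runs without interruption.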
Although redundancy is provided against the failure of a plurality of storage units, the repair of a failure of one storage unit and the repair of a failure of a plurality of storage units are managed without any distinction being drawn between them. Therefore, if emphasis is put on the repair of the failure, normal access (read/write) cannot be processed even when only one storage unit has failed, so that the efficiency of processing normal access is reduced. On the other hand, if emphasis is put on normal access (read/write), the time required for the repair cannot be secured when a plurality of storage units have failed, and as a result the possibility that the whole system will break down increases.