This invention relates to a storage system that has a redundant configuration and uses a semiconductor memory such as a flash memory, and more particularly, to a technique for improving processing performance and reliability.
In recent years, non-volatile memory, of which the flash memory is representative, has been gaining attention. The flash memory consumes less power than a magnetic storage system and is therefore well suited to reduction in size and weight. For that reason, the flash memory can serve as an external storage system substituted for the magnetic disk drive.
The flash memory is characterized in that its idle power consumption is low as compared with a dynamic random access memory (DRAM). This is because the DRAM requires periodic refresh operations to retain stored data. Also, the flash memory is low in power consumption because it has no actuator, unlike a magnetic storage system such as a hard disk drive (HDD).
The flash memory is low in cost as compared with a static random access memory (SRAM) that is generally used as a main memory of a computer device. The SRAM does not require the refresh operation that the DRAM requires. However, its circuit is complicated as compared with those of the DRAM and the flash memory, which makes it difficult to increase the degree of integration.
The flash memory is small in size as compared with the magnetic storage system, and has low power consumption as described above. Therefore, the flash memory is advantageous in that high-density mounting can be achieved when it is used as a main memory of a storage system.
Accordingly, it is expected that a flash memory drive having a plurality of flash memories will be substituted for the magnetic storage system as the high-capacity main memory.
However, the flash memory has problems different from those of the SRAM, the DRAM, and the magnetic storage system. More specifically, the flash memory requires an erasing operation before data is overwritten, and the erasing operation takes a long time. As a result, the throughput when overwriting data recorded in the flash memory is inferior to the throughput when reading the data.
Also, the erasing operation before the data is overwritten cannot be performed in units of a block, which is the minimum unit of reading and writing. The flash memory conducts the erasing operation in units of a page, which contains a plurality of the blocks described above.
In addition, the number of times data stored in the flash memory can be erased is limited to about 10^5 to 10^6. Because the number of rewrites is limited in this way, the number of erasing operations is made uniform across the areas of the flash memory so that rewriting does not concentrate on one area, thereby extending the lifetime (refer to JP H05-27924 A and JP 3534585 B).
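The leveling of erase counts described above can be illustrated with a minimal sketch. It is not the method of the cited references; the class name `WearLeveler`, the constant `ERASE_LIMIT`, and the least-erased-first policy are assumptions chosen only for illustration.

```python
# Hypothetical sketch: always erase (and reuse) the area with the lowest erase count.
ERASE_LIMIT = 100_000  # assumed endurance, the low end of the 10^5 to 10^6 range


class WearLeveler:
    def __init__(self, num_areas):
        # One erase counter per area of the flash memory.
        self.erase_counts = [0] * num_areas

    def pick_area(self):
        """Return the index of the least-erased area that is still usable."""
        usable = [i for i, c in enumerate(self.erase_counts) if c < ERASE_LIMIT]
        if not usable:
            raise RuntimeError("all areas have reached the erase limit")
        return min(usable, key=lambda i: self.erase_counts[i])

    def erase(self, area):
        """Record one erasing operation on the given area."""
        self.erase_counts[area] += 1


leveler = WearLeveler(num_areas=4)
for _ in range(10):
    leveler.erase(leveler.pick_area())
```

With this policy the erase counts of any two areas never differ by more than one, so no single area reaches the limit far ahead of the others.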
As described above, the flash memory stores, in a page that is the unit of erasing, a plurality of blocks each of which is the unit of reading and writing, so the access unit for erasing differs from the access unit for reading and writing. For that reason, when data is overwritten at the same address, the flash memory must write the data to an already erased block having a different address. Hence, a logical block address (LBA) used in the reading and writing operation and a physical block address (PBA) managed inside the flash memory drive are not always in the same order.
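The relation between the LBA and the PBA can be sketched as a small mapping table. The class `FlashMap`, the three block states, and the first-erased-block allocation policy are hypothetical and serve only to show why an overwrite moves data to a different physical block.

```python
class FlashMap:
    """Toy logical-to-physical map. Block states: 'erased', 'valid', 'invalid'."""

    def __init__(self, num_blocks):
        self.state = ["erased"] * num_blocks
        self.data = [None] * num_blocks
        self.lba_to_pba = {}

    def write(self, lba, value):
        # Data can only go into an erased block (raises ValueError if none is left).
        pba = self.state.index("erased")
        old = self.lba_to_pba.get(lba)
        if old is not None:
            # The previous copy cannot be overwritten in place; it waits for erasure.
            self.state[old] = "invalid"
        self.state[pba] = "valid"
        self.data[pba] = value
        self.lba_to_pba[lba] = pba

    def read(self, lba):
        return self.data[self.lba_to_pba[lba]]


fm = FlashMap(num_blocks=8)
fm.write(0, "a")
fm.write(1, "b")
fm.write(0, "a2")  # the overwrite of LBA 0 lands in a new physical block
```

After these three writes, LBA 0 maps to PBA 2 while LBA 1 maps to PBA 1, so the logical and physical orders no longer agree.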
Accordingly, when small random overwriting operations are repeated, fragmentation occurs. As a result, a page can arise in which blocks that are waiting for erasure and cannot be overwritten are mixed with readable blocks. In order to erase a page that includes both erasure-waiting blocks and readable blocks, it is necessary to migrate the readable data to another area. If the erasing operation is not conducted, the area available for the next data to be written is depleted.
Thus, in order to secure the write area, the flash memory migrates the blocks in use from a page in which erasure-waiting blocks and readable blocks are mixed to other writable blocks, so that the page can be erased. This series of operations is generally called “reclamation”.
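Reclamation as described above can be sketched as follows. The function `reclaim`, the tuple layout of a block, the constant `BLOCKS_PER_PAGE`, and the choice of the page with the most erasure-waiting blocks as the victim are all assumptions made for illustration; an actual drive may select and migrate differently.

```python
BLOCKS_PER_PAGE = 4  # assumed; the text only says a page holds a plurality of blocks


def reclaim(pages, lba_to_loc):
    """Free one page: move its readable blocks elsewhere, then erase it.

    pages: list of pages, each a list of (state, lba, data) tuples, where
    state is 'erased', 'valid' (readable), or 'invalid' (waiting for erasure).
    lba_to_loc: dict mapping an LBA to its (page index, block index).
    Returns the index of the erased page.
    """
    # Victim selection (assumed policy): the page with the most invalid blocks.
    victim = max(range(len(pages)),
                 key=lambda p: sum(1 for b in pages[p] if b[0] == "invalid"))
    for state, lba, data in pages[victim]:
        if state != "valid":
            continue
        # First erased block outside the victim page
        # (next() raises StopIteration if no erased block exists).
        dest = next((p, i)
                    for p, page in enumerate(pages) if p != victim
                    for i, b in enumerate(page) if b[0] == "erased")
        pages[dest[0]][dest[1]] = ("valid", lba, data)
        lba_to_loc[lba] = dest
    # The whole page is erased at once, since the page is the unit of erasing.
    pages[victim] = [("erased", None, None)] * BLOCKS_PER_PAGE
    return victim


pages = [
    [("invalid", None, None), ("valid", 7, "x"),
     ("invalid", None, None), ("valid", 9, "y")],
    [("valid", 1, "p"), ("erased", None, None),
     ("erased", None, None), ("erased", None, None)],
]
lba_to_loc = {7: (0, 1), 9: (0, 3), 1: (1, 0)}
freed = reclaim(pages, lba_to_loc)
```

In this example page 0 is freed: the data at LBA 7 and LBA 9 is migrated into page 1, and page 0 then consists solely of erased blocks ready for new writes.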
There is an external storage system (memory drive) that has a plurality of non-volatile memories and serves as a substitute for a magnetic storage system such as the HDD. In particular, an external storage system whose non-volatile memories are flash memories is hereinafter called a “flash memory drive (FMD)”. Also, a storage system using a plurality of flash memory drives is controlled by the technique described below, as in the conventional control method for a storage system having a plurality of magnetic storage systems, to thereby enhance the reliability of the storage system.
Further, some storage systems are required to be robust and have duplicated components, so that processing can continue even when a failure occurs in one of the components. In addition, in order to enhance the reliability of data and the processing performance, a plurality of storage systems are managed as one RAID (Redundant Array of Independent Disks) group through the RAID technique, and the data is made redundant and stored. The RAID group forms one or more logical storage areas. When data is stored in the storage area, redundant data is stored in the storage systems that constitute the RAID group. Owing to the redundant data, the data can be restored even when one of the storage systems fails. The RAID configuration is categorized into a plurality of levels that differ in redundancy. Hereinafter, RAID 1, RAID 4, and RAID 5 will be described as typical RAID configurations.
According to the RAID 1 configuration, all of the data stored in a drive is copied onto another drive. The capacity efficiency of the RAID 1 configuration, that is, the usable capacity relative to the total physical capacity of the disk drives, is one half.
The RAID 4 configuration and the RAID 5 configuration store an error correcting code (ECC) calculated from a plurality of pieces of data in an ECC drive, and are capable of restoring the data stored in a failed drive from the remaining data and the ECC even if a failure occurs in one of the drives.
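As a minimal sketch of this restoration, the following assumes that the ECC is a bytewise XOR parity, which is the redundancy code conventionally used by RAID 4 and RAID 5; the helper name `xor_blocks` and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block from each of three data drives, plus the parity computed from them.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails: XOR the surviving blocks with the parity.
restored = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, combining the parity with every surviving block cancels their contributions and leaves exactly the lost block.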
However, according to the RAID 4 configuration, the ECC data must be updated every time data is written, and writing into the drive that stores only the ECC data becomes the bottleneck of the write performance of the entire RAID group.
According to the RAID 4 configuration, redundant data (ECC) is always stored in the same drive (parity drive). According to the RAID 5 configuration, on the other hand, redundant data is stored in every drive included in the RAID group (data drives and the parity drive are not separated). Therefore, the RAID 5 configuration can achieve higher write performance than the RAID 4 configuration, because redundant data is distributed over the plural drives included in the RAID group when data is written. The capacity efficiency is determined according to the ratio of the number of data drives to the number of parity drives.
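The difference in parity placement and the resulting capacity efficiency can be sketched as below. The function names, the choice of the last drive as the RAID 4 parity drive, and the particular RAID 5 rotation are assumptions; real implementations use various layouts.

```python
def parity_drive(stripe, num_drives, level):
    """Index of the drive that holds the redundant data for a given stripe."""
    if level == 4:
        # Fixed parity drive (assumed here to be the last drive).
        return num_drives - 1
    if level == 5:
        # One possible rotation: the parity moves to a different drive each stripe.
        return (num_drives - 1 - stripe) % num_drives
    raise ValueError("only RAID 4 and RAID 5 are modelled")


def capacity_efficiency(data_drives, parity_drives):
    """Fraction of the raw capacity that is usable for data."""
    return data_drives / (data_drives + parity_drives)


raid4 = [parity_drive(s, 4, 4) for s in range(4)]  # [3, 3, 3, 3]
raid5 = [parity_drive(s, 4, 5) for s in range(4)]  # [3, 2, 1, 0]
efficiency = capacity_efficiency(data_drives=3, parity_drives=1)  # 0.75
```

In the RAID 4 list every stripe's parity update hits drive 3, which is the bottleneck noted above, while in the RAID 5 list the updates are spread over all four drives. With three data drives per parity drive, three quarters of the raw capacity is usable.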
The storage system that constitutes the RAID is incapable of restoring the data when a failure occurs in a given number of drives or more. For this reason, the storage system provides a so-called “spare drive” that does not store data.
Then, in the case where a failure occurs in one of the drives that constitute the RAID, the storage system restores the data of the failed drive by using the data of the remaining drives that constitute the RAID, and stores the restored data in the spare drive. Because the spare drive is prepared in advance in this way, the degenerate state can be returned to a redundant state quickly. The above-mentioned operation, in which data stored in the failed drive is restored and stored in a normal drive, is hereinafter called “collection copy”.
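A collection copy onto a spare drive can be sketched as a stripe-by-stripe rebuild. This sketch assumes XOR parity, so that any block equals the XOR of the other blocks in its stripe; the names `collection_copy` and `xor_bytes` and the in-memory drive representation are hypothetical.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def collection_copy(drives, failed, spare):
    """Rebuild every stripe of the failed drive onto the spare drive.

    drives: list of drives, each a list of equal-length byte blocks, one per stripe.
    failed: index of the failed drive (its contents are never read).
    spare:  empty list that receives the restored blocks.
    """
    survivors = [d for i, d in enumerate(drives) if i != failed]
    for stripe in range(len(survivors[0])):
        block = survivors[0][stripe]
        for drive in survivors[1:]:
            block = xor_bytes(block, drive[stripe])
        spare.append(block)
    drives[failed] = spare  # the spare takes the failed drive's place


d0 = [b"\x01", b"\x02"]
d1 = [b"\x04", b"\x08"]
parity = [xor_bytes(a, b) for a, b in zip(d0, d1)]

drives = [d0, None, parity]  # drive 1 has failed and is unreadable
spare = []
collection_copy(drives, failed=1, spare=spare)
```

After the call, the spare holds the same blocks that the failed drive held, and the RAID group is redundant again.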