The present invention relates generally to placing and reconstructing data in a storage system. More particularly, the present invention relates to a method, apparatus and computer program for placing and reconstructing data in an object-based storage system.
Recent storage systems used in high-end or mid-range computer systems usually adopt the Redundant Array of Inexpensive Disks (RAID) architecture. The RAID architecture provides various levels including, for example, RAID levels 1-5 defined as follows:
RAID 1: uses “mirroring” techniques.
RAID 2: stores each bit of each word of data, plus Error Detection and Correction (EDC) bits for each word, such as a Hamming Code, on separate disk drives. This is also known as “bit striping”.
RAID 3: is based on the concept that each disk drive storage unit has internal means for detecting a fault or data error. Therefore, it is not necessary to store extra information to detect the location of an error. Thus, a simpler form of parity-based error correction can be used. In this approach, the contents of all storage units subject to failure are “Exclusively OR-ed” (XOR) to generate parity information. The resulting parity information is stored in a single redundant storage unit. If a storage unit fails, the data on that unit can be reconstructed onto a replacement storage unit by XOR-ing the data from the remaining storage units with the parity information.
RAID 4: uses the same parity error correction concept of the RAID 3 architecture, but improves on the performance of the RAID 3 architecture with respect to random reading of small files by using block-level interleaving. The RAID 4 architecture reads and writes a larger minimum amount of data, typically a disk sector, to each disk. This is also known as block striping. A further aspect of the RAID 4 architecture is that a single storage unit is designated as the parity unit.
RAID 5: uses the same parity error correction concept of the RAID 4 architecture and independent actuators, but improves on the writing performance of the RAID 4 architecture by distributing the data and parity information across all of the available disk drives. Typically, “N+1” storage units in a set, also known as a “redundancy group,” are divided into a plurality of equally sized address areas referred to as blocks. Each storage unit generally contains the same number of blocks. Blocks from each storage unit in a redundancy group having the same unit address ranges are referred to as a “stripe group”. Each stripe group has N blocks of data, plus one parity block on one storage unit containing parity for the remainder of the blocks. The parity blocks of different stripe groups are distributed across different storage units. Parity updating activity associated with every modification of data in a redundancy group is therefore distributed over the different storage units, so that no single storage unit is burdened with all of the parity update activity.
As described above, the current RAID techniques generate redundant information on a bit, byte, or block basis.
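The parity scheme shared by RAID levels 3 to 5 can be illustrated with a minimal sketch. The helper names `xor_blocks` and `parity_unit`, the block contents, and the modulo rotation rule are assumptions made for illustration only; actual RAID implementations may place parity differently.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_unit(stripe, num_units):
    """Hypothetical rotation rule: stripe i keeps its parity on unit i mod (N+1)."""
    return stripe % num_units


# One stripe group with N = 3 data blocks (illustrative values).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# Suppose the unit holding data[1] fails: XOR the surviving data with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])

# RAID 5 style placement: the parity unit changes from stripe to stripe
# (RAID 3 and RAID 4 would instead use the same unit for every stripe).
placement = [parity_unit(s, num_units=4) for s in range(8)]
```

In this sketch `rebuilt` equals the lost block `data[1]`, and `placement` cycles through all four units, which is why no single unit absorbs every parity update.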
Recently, another technological trend has begun, in which data is stored in storage devices such as magnetic disks on an object basis, not a block basis. A storage device or system that stores data as objects is called an Object-Based Storage Device (OSD).
Each object as defined includes data and an attribute. The attribute includes information such as the size of the data, user identification (ID) information, etc. Since the attribute maintains the size of the data in the object, the object size is variable, unlike current storage devices such as Hard Disk Drives (HDDs), which use fixed-length blocks. These objects can be used to store a variety of data, including files, database records and e-mail. The combination of data and attribute allows an OSD to make decisions on data layout or quality of service on a per-object basis, thereby improving performance, flexibility and manageability.
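The pairing of data and attribute described above can be pictured with a small data-structure sketch. The class and field names (`Attribute`, `StoredObject`, `make_object`, `user_id`) are hypothetical and are not taken from any OSD standard; only the two attribute fields named in the text are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    size: int      # size of the object's data in bytes
    user_id: str   # user identification (ID) information


@dataclass(frozen=True)
class StoredObject:
    data: bytes
    attribute: Attribute


def make_object(data: bytes, user_id: str) -> StoredObject:
    """Build an object whose attribute records the (variable) data size."""
    return StoredObject(data=data, attribute=Attribute(size=len(data), user_id=user_id))


# Objects of different sizes, in contrast to fixed-length blocks.
mail = make_object(b"Subject: hello\r\n\r\nbody", user_id="user-17")
record = make_object(b"\x00" * 4096, user_id="user-17")
```

Because each object carries its own size in its attribute, the two objects above can differ in length without any fixed block size being assumed.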
A disadvantage of RAID architectures is that they take a large amount of time to reconstruct the data of a failed disk drive, because a storage system adopting the RAID architecture reads all data blocks in the disk drives that are associated with the failed disk drive. The data reconstruction process of the RAID architecture has to be performed on all data blocks of the disk drives even if only a small amount of the data stored in the disk drives requires reconstruction. This is partly because traditional RAID systems do not maintain information regarding the location of data or files in the disk drives.
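The cost described above can be made concrete with a short arithmetic sketch. The group size, the number of blocks per drive, and the number of stripe groups holding live data are all hypothetical figures chosen for illustration.

```python
def blocks_read_for_rebuild(blocks_per_drive: int, surviving_drives: int) -> int:
    """Blocks read by a rebuild that has no knowledge of where data resides:
    every block on every surviving drive of the redundancy group."""
    return blocks_per_drive * surviving_drives


# Hypothetical 4+1 redundancy group with 1,000,000 blocks per drive.
total_read = blocks_read_for_rebuild(blocks_per_drive=1_000_000, surviving_drives=4)

# Suppose only 1,000 stripe groups actually hold stored data.
live_stripes = 1_000
needed = live_stripes * 4                  # blocks that would have to be read
unneeded_fraction = 1 - needed / total_read
```

Under these assumed figures the rebuild reads 4,000,000 blocks although only 4,000 of them contribute to recovering stored data.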
An OSD maintains information indicating where each object is stored in the OSD. However, current OSD standards do not take into consideration the functionalities necessary to implement a RAID architecture in an OSD. In particular, the current OSD standards do not provide a method for the placement and reconstruction of data in an OSD which implements a RAID architecture.
U.S. Pat. No. 5,208,813 discloses a method for reconstructing data of failed storage devices in redundant arrays. However, the method disclosed in U.S. Pat. No. 5,208,813 is only applied to data reconstruction of all data blocks in the failed storage devices implementing a RAID architecture.
The ANSI T-10 Working Draft discloses the concept of OSD. However, this draft does not consider the functionalities or features necessary to apply OSD technology to RAID storage systems.
Therefore, a need exists for a method and apparatus for placing objects on a RAID storage system and reconstructing data in the RAID storage system.