The present invention relates to data storage systems and more particularly to a method and system for improving performance of a RAID (redundant array of inexpensive disks) data storage subsystem.
In order to store data, some computer systems use a redundant array of inexpensive disks (“RAID”) data storage subsystem. For example, a RAID subsystem may be coupled with a host or server that services clients on a network. The RAID subsystem typically includes a controller and a plurality of disk drives. The controller generally controls operations of the RAID subsystem. Information is physically stored on the drives.
There are many conventional techniques for using the RAID data storage subsystem. RAID levels are typically used to determine how data will be stored in a RAID subsystem. Each technique has different performance characteristics and a different ability to provide redundancy. Redundancy is desirable in order to be able to recover data on a drive that becomes defective. However, it is also desirable to provide sufficiently rapid access times in order to ensure that performance does not suffer.
RAID-0 stores the data on the drives of the conventional RAID subsystem. Data are interleaved by striping the blocks of data across the drives. Typically, a block is from 8K through 64K bytes. However, RAID-0 does not have any redundancy.
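The block striping described above can be illustrated with a minimal sketch. It assumes the simplest round-robin layout, in which consecutive logical blocks are placed on consecutive drives; the function name `stripe_location` is hypothetical and is not taken from any particular RAID implementation.

```python
def stripe_location(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes plain round-robin striping with no redundancy, as in RAID-0.
    """
    if num_drives < 1:
        raise ValueError("at least one drive is required")
    if logical_block < 0:
        raise ValueError("logical block number must be non-negative")
    drive = logical_block % num_drives    # which drive holds the block
    offset = logical_block // num_drives  # position of the block on that drive
    return drive, offset
```

With four drives, logical blocks 0 through 3 land on drives 0 through 3 at offset 0, and logical block 5 lands on drive 1 at offset 1. Because no copy or parity is kept, the loss of any one drive loses every block mapped to it.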
RAID-1 uses a technique called “mirroring,” in which data are stored on a first drive and a copy of the data is stored on a second drive. Because data are stored on two disks, a request for data can be serviced by one drive, while the other drive is servicing a second request. However, a great deal of space is consumed because two copies of the data are stored.
RAID-2 stripes bits of data across multiple drives and uses error-correcting codes in addition to data striping. RAID-3 also stripes bits of data across multiple drives. Parity bits are stored on a separate drive or drives to provide redundancy. In addition, the drives storing data can be operated in unison and in parallel. Thus, a request can be simultaneously serviced by all data drives. RAID-4 is similar to RAID-3 except that the data drives are operated independently and blocks of data are striped across the data drives.
RAID-5 is the most widely used RAID level. RAID-5 stripes blocks of data over multiple drives and uses parity bits. However, unlike RAID-4, RAID-5 does not use separate dedicated parity disks. Instead, RAID-5 distributes the parity bits over multiple drives. RAID-6 stripes data and parity bits across multiple drives. However, RAID-6 uses an additional parity calculation.
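As an illustration of parity-based redundancy with distributed parity, the following sketch computes a parity block as the bytewise XOR of the data blocks in a stripe and rebuilds a lost block from the survivors. The rotation rule in `parity_drive` is only one possible placement, chosen here as an assumption for illustration; actual RAID-5 layouts vary. Both function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be the same non-zero count and size")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_drive(stripe: int, num_drives: int) -> int:
    """Pick the drive holding parity for a stripe.

    Assumed rotation: parity moves one drive to the left per stripe,
    so no single drive holds all of the parity.
    """
    return (num_drives - 1 - stripe) % num_drives


# Parity for one stripe of three data blocks.
data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_blocks(data)

# If the drive holding data[1] fails, XOR of the survivors and the
# parity block recovers its contents.
recovered = xor_blocks([data[0], data[2], parity])
```

Here `recovered` equals `data[1]`, which is the property that lets a single failed drive be rebuilt.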
Although RAID-5 is widely used in RAID subsystems because of its cost efficiency, RAID-5 uses a read-modify-write operation to calculate parity. The read-modify-write calculation is performed for each write and is time consuming. Thus, RAID-5 has relatively slow write performance. Although sequential workloads and write-back cache designs can improve performance, other workloads are also used in RAID subsystems. For example, some applications have a large number of reads and a relatively small number of writes. Such applications may not have sequential workloads or utilize write-back caches. Consequently, the write performance for these other workloads still suffers.
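The cost of the read-modify-write can be seen in a short sketch. It assumes XOR parity, under which the new parity is the old parity XORed with the old data and the new data. The function name `rmw_parity` is hypothetical, and the in-memory byte strings stand in for blocks that a controller would have to read from and write to the drives.

```python
def rmw_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Compute the updated parity block for a single-block write.

    A controller would need two reads (old data, old parity) before
    calling this, then two writes (new data, new parity) afterwards.
    """
    if not (len(old_data) == len(new_data) == len(old_parity)):
        raise ValueError("blocks must be the same size")
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# One stripe: three data blocks and their XOR parity.
d0, d1, d2 = b"\x01\x02", b"\x04\x08", b"\x10\x20"
old_parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))

# Overwrite d1 and update the parity without touching d0 or d2.
new_d1 = b"\xff\x00"
new_parity = rmw_parity(d1, new_d1, old_parity)
```

The result matches a full recomputation of parity over `d0`, `new_d1`, and `d2`, but a small write still costs four drive operations under these assumptions, which is why small random writes are slow.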
Accordingly, what is needed is a system and method for more efficiently providing writes in a RAID subsystem, particularly for applications having a large number of read requests as compared to write requests. The present invention addresses such a need.
The present invention provides a method and system for storing data in a redundant array of inexpensive disks (RAID) data storage subsystem. The RAID data storage subsystem includes a plurality of drives. The method and system comprise temporarily storing data in a first portion of the plurality of drives using a first RAID level and relatively permanently storing the data in a second portion of the plurality of drives using a second RAID level. The step of relatively permanently storing the data is performed at a time when performance of the system is not adversely affected by storage using the second RAID level. Furthermore, the temporary storing step and the step of storing the data using the second RAID level can be performed throughout operation of the RAID data storage subsystem.
According to the system and method disclosed herein, the present invention essentially allows RAID data to be cached using a first RAID level. Later, the data may be stored using the second RAID level. The second RAID level requires more time for storing data than the first RAID level but may have other benefits, such as less wasted space on the drives being used.
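The two-stage storage described above can be pictured with a toy in-memory model. This is only a sketch under stated assumptions: the first RAID level is modeled as a mirrored pair and the second as a separate, slower but more space-efficient region, and the class name `TwoLevelStore`, its methods, and the `is_idle` flag are all hypothetical and are not taken from the description.

```python
class TwoLevelStore:
    """Toy model: writes land in a mirrored cache, then move later."""

    def __init__(self):
        self.mirror_a = {}   # first copy in the temporary (first-level) portion
        self.mirror_b = {}   # second copy in the temporary portion
        self.permanent = {}  # stand-in for the second-level portion

    def write(self, block_id, data):
        # Fast path: store two copies, with no parity calculation.
        self.mirror_a[block_id] = data
        self.mirror_b[block_id] = data

    def read(self, block_id):
        # The temporary portion holds the newest copy if one exists.
        if block_id in self.mirror_a:
            return self.mirror_a[block_id]
        return self.permanent[block_id]

    def destage(self, is_idle):
        """Move cached blocks to the permanent portion when idle.

        Returns the number of blocks moved.
        """
        if not is_idle:
            return 0
        moved = 0
        for block_id in list(self.mirror_a):
            self.permanent[block_id] = self.mirror_a.pop(block_id)
            del self.mirror_b[block_id]
            moved += 1
        return moved
```

In this model a write returns after the two mirrored copies are made, and the slower second-level store is deferred until `destage` is called with `is_idle` set, which corresponds to a time when performance is not adversely affected.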