The present invention relates to drive arrays, and more particularly to a method and apparatus for zeroing transfer buffer memory as a background task to improve the performance of read and write operations to a drive array.
Personal computer systems continue to develop at a rapid pace due to substantial improvements in the performance and density of semiconductor technology. The continual increase in computer system performance and use has led to a corresponding need for improvements in the performance, capacity and reliability of secondary storage systems. Disk or drive arrays were proposed as an alternative to large and expensive magnetic disk drives. Several different levels of redundant arrays were introduced and analyzed in a landmark paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by D. Patterson, G. Gibson and R. Katz, Report No. UCB/CSD 87/391, December, 1987, Computer Science Division, University of California, Berkeley, Calif. As described in a later article by P. Chen, E. Lee, G. Gibson, R. Katz and D. Patterson, "RAID: High-Performance, Reliable Secondary Storage", ACM Computing Surveys, Vol. 26, No. 2, June 1994, RAID technology has grown substantially and provides a natural solution to the continually growing demands for larger and improved storage systems.
A drive array is a collection of hard disk drives, otherwise referred to as physical drives, which are grouped together to create an array of physical drives. A drive array includes one or more subsets called logical drives or logical volumes which are typically spread across all of the physical drives in the drive array. An operating system views a logical drive as a single, contiguous storage space, even though the storage space may be made up of portions of several physical drives. One reason for building a drive array subsystem is to create a logical device that has a relatively high data transfer rate. A higher transfer rate may be accomplished by "ganging" multiple physical drives together and transferring data to or from the drives in parallel. For example, striping techniques are often used to distribute the data in a drive array. In striping, data is broken into segments of a unit length and sequential segments are written to several disk drives rather than to sequential locations on a single physical drive. The combination of corresponding sequential data segments across each of the disks in the disk array is a stripe. The stripe size affects data transfer characteristics and access times and is generally chosen to optimize data transfers to and from the disk array. The unit length is referred to as a block or segment or strip and usually includes one or more sectors, where each sector is 512 bytes. The first RAID level, RAID level 0, uses data striping to achieve greater performance but does not use any fault tolerance techniques for data protection.
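The striping arrangement described above can be illustrated with a minimal sketch. The function name locate_strip and its parameters are hypothetical and chosen only for illustration, and the round-robin placement shown is one common layout assumed here rather than a layout required by any particular array.

```python
SECTOR_BYTES = 512  # sector size stated in the text


def locate_strip(logical_sector, num_drives, sectors_per_strip):
    """Map a logical sector to (drive, sector on that drive).

    Assumes plain round-robin striping with no parity (RAID level 0):
    consecutive strips go to consecutive drives, then wrap around.
    """
    strip_index = logical_sector // sectors_per_strip      # which strip overall
    offset_in_strip = logical_sector % sectors_per_strip   # position inside it
    drive = strip_index % num_drives                       # round-robin drive
    stripe = strip_index // num_drives                     # row across all drives
    physical_sector = stripe * sectors_per_strip + offset_in_strip
    return drive, physical_sector
```

With four drives and 16 sectors per strip, for instance, logical sectors 0 to 15 land on the first drive, sectors 16 to 31 on the second, and sector 64 wraps back to the first drive at physical sector 16.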
Data protection is another reason for using drive arrays, where fault tolerance methods are implemented within the array to protect the data against hardware failures. A popular solution is called mirroring or shadowing and is the technique used for RAID level 1. A drive array incorporating RAID level 1 includes a mirrored segment for each data segment, where the data is copied to both a data drive and a mirrored drive, resulting in two copies of the same information. Alternatively, for odd drive mirroring, data and mirrored segments are distributed among an odd number of drives greater than or equal to three. Mirroring provides the advantages of high reliability and relatively fast transfer rate but at a cost of storage efficiency, since the storage space is utilized at a maximum of 50%. The higher RAID levels 2-5 use a parity scheme to achieve data redundancy. In the parity schemes, a controller writing data blocks to various drives within the array uses the EXCLUSIVE-OR (XOR) function to create parity information, which is then written to a parity drive or parity segment within the array. For example, in a block-interleaved parity drive array according to RAID level 4, data is interleaved or striped across the disks and the corresponding parity information is stored in a corresponding block of a parity drive. A block-interleaved distributed parity drive array according to RAID level 5 is similar to RAID level 4, except that the parity information and the data are uniformly distributed across the drive array. The RAID levels 4 and 5 provide greater storage space efficiency than mirroring, although typically at lower performance.
A computer system implementing RAID level 4 or 5 achieves fault tolerance by calculating parity across drives of the array. The XOR operation is performed on each segment of data from each data drive in a drive array at a given offset and the result is stored (normally at the same offset) in a parity disk drive or a parity segment. Updating parity is time-consuming because several read and write operations are needed to update the data and parity information. Existing or old data is read and combined with new data, and the results are written back to appropriate locations of the drive array. Various methods are known. In read-modify-write (RMW) operations, for example, old data and parity blocks are XOR'd with corresponding blocks of new data to be written to generate new parity data blocks, and the new data and parity blocks are written back to the array. In a regenerative write operation, remaining valid data is read from corresponding sectors of a stripe of data, XOR'd with new data to be written to generate a new parity block, and the new data and parity blocks are written back to the drive array.
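The two parity update methods can be compared with a short sketch. The helper names xor_blocks, rmw_parity and regenerative_parity are hypothetical, and blocks are modeled as equal-length byte strings purely for illustration.

```python
def xor_blocks(*blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


def rmw_parity(old_data, old_parity, new_data):
    """Read-modify-write: XOR the old data and old parity with the new data."""
    return xor_blocks(old_data, old_parity, new_data)


def regenerative_parity(remaining_data, new_data):
    """Regenerative write: XOR the new data with the untouched data blocks."""
    return xor_blocks(new_data, *remaining_data)
```

Both routes produce the same new parity block. The RMW form reads the old data and old parity, whereas the regenerative form reads the other data blocks of the stripe, so which one needs fewer reads depends on how much of the stripe is being rewritten.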
In some drive array architectures, such as the SMART and SMART-2 Array Controllers by Compaq Computer Corporation, the XOR operations are performed in a region of memory called a "transfer buffer". The transfer buffer is a bank of memory within a drive controller that may include a multi-threaded interface. Control logic accepts CDB-based (command descriptor block) requests to XOR/zero/DMA regions of the transfer buffer and also accepts read/write slave requests. The CDB-based requests to perform certain operations are queued up and an interrupt is generated upon completion. In many operations, such as the regenerative or RMW operations discussed above, multiple blocks of data are combined in one or more XOR operations to obtain a block of parity data. The operations may be performed in a serial manner where each block is combined one at a time. It is preferable, however, to perform the operations in parallel where multiple requests are submitted simultaneously. To achieve parallelism, the buffer must be cleared before the operations are performed, since otherwise unknown initial contents of the memory would be XOR'd with incoming data, resulting in unknown data.
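The need for a cleared buffer can be shown with a small sketch. The names xor_into and accumulate are hypothetical, and arrival order is simulated with permutations rather than real concurrent requests; this is an illustration of the arithmetic only, not of the controller hardware.

```python
from itertools import permutations


def xor_into(buffer, block):
    """Accumulate `block` into `buffer` in place, one byte at a time."""
    for i, b in enumerate(block):
        buffer[i] ^= b


def accumulate(initial, blocks):
    """XOR every block into a buffer that starts with `initial` contents."""
    buffer = bytearray(initial)
    for block in blocks:
        xor_into(buffer, block)
    return bytes(buffer)


blocks = [bytes([0x0F, 0xF0]), bytes([0x33, 0x33]), bytes([0x55, 0xAA])]
expected = bytes([0x0F ^ 0x33 ^ 0x55, 0xF0 ^ 0x33 ^ 0xAA])

# Zeroed buffer: every arrival order yields the same, correct parity.
zeroed_results = {accumulate(bytes(2), order) for order in permutations(blocks)}

# Stale buffer: the leftover bytes are folded into the result.
stale_result = accumulate(bytes([0xDE, 0xAD]), blocks)
```

Because XOR is commutative and zero is its identity, a zeroed buffer lets the blocks arrive in any order without a designated first writer. A buffer holding stale bytes would instead fold those bytes into the parity.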
Portions of the transfer buffer must be allocated for subsequent disk operations. Also, to ensure that the allocated portions of the transfer buffer are cleared prior to one or more XOR operations, a ZERO MEMORY CDB command had to be issued and completed before subsequent disk I/O commands were performed. Such allocation and/or zeroing, if necessary, are part of the real-time "garbage collection" tasks that are performed in the transfer buffer. The ZERO MEMORY CDB command, however, added a significant amount of overhead in the CDB-based command traffic stream which slowed disk transfer operations. It is desired to provide a method and system to reduce the number of queued commands that must be serviced by the array controller during disk drive operations.
A controller according to the present invention cleans buffer memory as a background task. The controller includes a transfer buffer, a memory that stores an index or table indicating free and non-zero data sectors within the transfer buffer, and processing logic that uses the transfer buffer for data transfer operations, and when otherwise idle, that scans the index table for contiguous sections of free and non-zero data sectors of the transfer buffer and that zeroes at least one of the contiguous sections. In this manner, the controller is more likely to find an appropriate size buffer of free and zeroed data sectors in the transfer buffer to perform parallel logic operations to generate new parity information. The present invention significantly reduces or relieves the controller from having to issue CDB-based memory commands to zero or clean an allocated buffer for performing disk transfer operations. Thus, the controller performs disk I/O operations faster and more efficiently.
The processing logic may include a processor and the controller memory may store software for execution by the processor, where the software includes buffer allocation routines for allocating buffers within the transfer buffer. The software may be in the form of firmware stored in a read only memory (ROM) or the like. The firmware may further include an idle task that scans the index table for the contiguous sections of free and non-zero data sectors and that zeroes at least one of the contiguous sections. The buffer allocation routines may further include a get routine that allocates a block of memory space from free and zeroed sectors within the transfer buffer. The get routine may include at least one input parameter to indicate buffer allocation requirements, and provide an output status to indicate success of the buffer allocation according to the requirements. The processing logic may further include a memory controller coupled to the transfer buffer via a multithreaded interface that performs simultaneous exclusive-OR logic operations into a single allocated buffer within the transfer buffer.
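A get routine of the kind described above might be sketched as follows. The flag values FREE and NONZERO, the per-sector list representation of the index, and the names get_buffer and free_buffer are all assumptions made for illustration; the text does not specify the index layout or the routine signatures.

```python
FREE = 0x1      # hypothetical flag: sector is not allocated
NONZERO = 0x2   # hypothetical flag: sector may hold stale (non-zero) data


def get_buffer(index, count, need_zeroed=True):
    """Allocate `count` contiguous sectors; return (ok, start_sector).

    `need_zeroed` is the input parameter stating the allocation
    requirement; `ok` is the output status reporting success.
    """
    if count <= 0:
        return False, -1
    run = 0
    for i, flags in enumerate(index):
        usable = bool(flags & FREE) and not (need_zeroed and flags & NONZERO)
        run = run + 1 if usable else 0
        if run == count:
            start = i - count + 1
            for s in range(start, i + 1):
                index[s] &= ~FREE          # mark the sectors as allocated
            return True, start
    return False, -1


def free_buffer(index, start, count):
    """Release sectors; assume they now hold stale data and need zeroing."""
    for s in range(start, start + count):
        index[s] |= FREE | NONZERO
```

When get_buffer fails with need_zeroed set, a caller could retry with need_zeroed cleared and zero the buffer explicitly, which corresponds to the fallback that background zeroing is intended to make rare.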
A computer system according to the present invention includes a drive array that stores data and corresponding parity data, a main memory, a processor that generates and stores data in the main memory and that sends a logical request to transfer the stored data to the drive array, and an array controller that receives the logical request and that transfers the stored data to the drive array. The array controller further includes a transfer buffer, a local memory that stores an index indicating free data sectors and non-zero data sectors within the transfer buffer, and processing circuitry that receives the logical request, that transfers the stored data to the transfer buffer, that combines the stored data with corresponding data from the drive array in a parallel operation to generate new parity data, and that stores the data and new parity data to the drive array. When the array controller is otherwise idle, the processor scans the index for free and non-zero sections in the transfer buffer and then zeroes data sectors of at least one of the free and non-zero sections.
A method of cleaning a transfer buffer memory of a disk controller according to the present invention includes detecting an idle mode of the controller, searching an index for free and non-zero sections within the transfer buffer, and zeroing the contents of at least one contiguous free and non-zero section within the transfer buffer. The detecting may further comprise detecting when a processor of the array controller is executing an idle task. The searching may comprise searching from a beginning of the transfer buffer and the zeroing may comprise zeroing a first contiguous free and non-zero section within the transfer buffer from the beginning. Alternatively, the method may further comprise periodically repeating the detecting, searching and zeroing, and after each zeroing, setting a pointer to indicate a location within the transfer buffer after the contiguous free and non-zero section that was zeroed. Then, the searching comprises searching from the pointer previously set. The method may further comprise updating the index after each zeroing or cleaning of a section. The method may further comprise selecting one of a plurality of free and non-zero sections within the transfer buffer, such as selecting a free and non-zero section that would result in the largest contiguous free and zero section within the transfer buffer after zeroing.
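The pointer-based variant of the method can be sketched as a single idle-time step. The flag values, the list-based index, and the names find_dirty_run and idle_clean_step are hypothetical, the wrap-around search is an assumption not stated in the text, and the sketch omits both idle detection and the alternative of selecting the section that yields the largest clean region.

```python
FREE = 0x1          # hypothetical flag: sector is not allocated
NONZERO = 0x2       # hypothetical flag: sector may hold stale data
SECTOR_BYTES = 512
DIRTY = FREE | NONZERO


def find_dirty_run(index, start):
    """Find the first run of free, non-zero sectors at or after `start`.

    The search wraps to the beginning (an assumption of this sketch).
    Returns (first, end) with `end` exclusive, or None if nothing is dirty.
    """
    n = len(index)
    for step in range(n):
        first = (start + step) % n
        if (index[first] & DIRTY) == DIRTY:
            end = first
            while end < n and (index[end] & DIRTY) == DIRTY:
                end += 1
            return first, end
    return None


def idle_clean_step(buffer, index, pointer=0):
    """Zero one contiguous dirty section and return the updated pointer."""
    run = find_dirty_run(index, pointer)
    if run is None:
        return pointer                      # nothing to clean this time
    first, end = run
    size = (end - first) * SECTOR_BYTES
    buffer[first * SECTOR_BYTES:end * SECTOR_BYTES] = bytes(size)
    for s in range(first, end):
        index[s] &= ~NONZERO                # update the index after zeroing
    return end % len(index)                 # resume after the cleaned section
```

Calling idle_clean_step repeatedly while passing back the returned pointer sweeps the buffer one section at a time. Always passing 0 instead corresponds to the variant that searches from the beginning of the transfer buffer.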
It is now appreciated that a method and apparatus for zeroing a transfer buffer memory as a background task according to the present invention reduces and possibly eliminates the need to execute CDB-based commands or any other similar commands to clean buffer memory in response to a logical request by a computer to transfer data to a drive array. In this manner, the array controller is more likely to find an appropriate size buffer of free and zero data sectors in the transfer buffer for performing parallel XOR operations to generate new parity information. Thus, the controller operates faster and more efficiently.