1. Field of the Invention
This invention relates to the replacement of a storage device in a serial array of storage devices and, more particularly, to the replacement of such a device while the computer to which the array is attached continues to make requests to the array for the storage or retrieval of data.
2. Description of Related Art
It is well known to construct a data storage device using a plurality of disk drives. Such a configuration is typically referred to as a RAID, meaning a redundant array of inexpensive disks.
The advantages of such a configuration are also well known. The size of the storage device can be incrementally and inexpensively increased by simply adding another disk drive. The speed of data storage and retrieval is also markedly enhanced by distributing the data over several drives in the array and operating those drives simultaneously or nearly simultaneously.
On the other hand, a storage array containing a plurality of drives has far more moving parts than a single large drive, potentially increasing the likelihood of a failure.
In many embodiments, each drive in the array is connected in parallel to a common bus. When one drive fails, the remaining drives are often still able to fully satisfy data requests from the host computer to which the array is attached. This is attributable to the creation and recordation of parity data. If the failed drive was merely storing parity data, that data is redundant and is not needed, if the other drives are still functioning. Conversely, if the failed drive was storing data, the missing data can be recreated from the data and parity information on the remaining drives. As a result, a malfunction in one of the drives in an array of drives connected in parallel does not usually interrupt the data operations of the host computer to which the array is attached.
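The recovery of missing data from the surviving drives can be illustrated with a minimal sketch. It assumes byte-wise XOR parity, a common scheme that the text above does not itself specify; the helper name `xor_blocks` and the sample block contents are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks (assumed parity scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per drive; a fourth drive holds the parity.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails. XOR-ing the surviving data
# blocks with the parity block yields the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Under this assumption, losing the parity drive costs nothing but redundancy, while losing a data drive is recoverable only so long as every other drive remains readable.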
Of course, it is highly desirable to replace the malfunctioning drive as soon as possible. Otherwise, a later error in one of the other drives cannot be corrected.
A process known as "hot swapping" is often used to replace a malfunctioning drive in a parallel array. While the host computer continues to send data requests to the array, the malfunctioning drive is removed and replaced by a new functioning drive. The array then returns to its normal function of storing and retrieving data, with parity protection. Significantly, the operations of the host computer are not disrupted throughout the entire process.
The consequences are usually far different if the drives in the array are serially connected. In some configurations, the data must flow into and out of the connection to one drive before it can be received by some or all of the remaining drives. This is known as a serial connection. It arises in several drive topologies, such as the Serial Storage Architecture (SSA), Fibre Channel, and drive configurations in which the data signal must pass through connectors mounted in each drive.
When the drives in an array are serially connected, a malfunction in a single drive still does not usually disrupt the operation of the host computer. If the failed drive was storing parity data, that data is still redundant and not needed. If it was storing data, the missing data can still be recreated from the data and parity information on the other drives.
Unless the malfunctioning drive is the last drive in the serial array, however, its replacement will usually cause significant disruption in the operation of the host computer. As soon as the drive is removed, its absence will prevent at least one other drive in the array from receiving and processing data requests. Since most arrays are not able to continue functioning without two or more of their drives, an error message will be immediately sent to the host computer. In turn, this error message will often initiate error recovery routines in the host computer and, ultimately, the cessation of data storage and retrieval operations in the host computer. In some systems, such an error will ultimately stop the host computer from processing, requiring the computer to be completely rebooted. Not only is valuable time lost, but damage to data can occur.
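The effect of removing a drive from a serial connection can be sketched as follows. This is a deliberately simplified model that assumes a single-ended daisy chain (host to first drive, first drive to second, and so on); real topologies may provide additional paths. The function name `reachable_drives` and the drive labels are hypothetical.

```python
def reachable_drives(chain, removed):
    """Return the drives the host can still reach after one drive is pulled.

    Assumes a single-ended chain: host -> chain[0] -> chain[1] -> ...
    """
    reachable = []
    for drive in chain:
        if drive == removed:
            break  # the signal path ends at the gap left by the removed drive
        reachable.append(drive)
    return reachable


chain = ["d0", "d1", "d2", "d3"]

# Pulling a middle drive also cuts off every drive behind it.
after_middle = reachable_drives(chain, "d1")

# Pulling the last drive leaves all the others reachable.
after_last = reachable_drives(chain, "d3")
```

In this model, removing any drive other than the last makes two or more drives unavailable at once, which is more than single-drive parity protection can absorb.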
In short, the replacement of a malfunctioning drive in an array of drives that are serially connected often interferes materially with the operation of the host computer to which the array is attached, which is often highly undesirable.
The invention allows a storage device in an array of storage devices connected in series to be replaced without interfering with the operation of the host computer to which the array is attached.
One embodiment of the invention provides a hot replacement process for replacing a to-be-replaced data storage device in an array of data storage devices serially connected to one another while the array continues to receive requests for the storage or retrieval of data from a host computer to which the array is connected. The process includes: buffering the requests while the to-be-replaced storage device is being replaced; deferring the processing of the requests while they are being buffered; replacing the to-be-replaced storage device with another storage device while the requests are being buffered; and processing the buffered requests after the to-be-replaced storage device has been replaced by the other storage device.
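The buffering, deferring, and later processing of requests can be sketched as a small queue in front of the array. This is only an illustration under stated assumptions: the class `BufferingArray`, its method names, and the `handler` callback that stands in for the array's normal request processing are all hypothetical and are not taken from the embodiments described here.

```python
from collections import deque


class BufferingArray:
    """Holds host requests in a FIFO buffer while a device is being replaced."""

    def __init__(self, handler):
        self._handler = handler   # hypothetical stand-in for normal processing
        self._pending = deque()   # requests deferred during replacement
        self._replacing = False

    def submit(self, request):
        """Process the request now, or buffer it if a replacement is under way."""
        if self._replacing:
            self._pending.append(request)
            return None           # processing is deferred, not refused
        return self._handler(request)

    def begin_replacement(self):
        self._replacing = True

    def end_replacement(self):
        """Resume normal operation and process buffered requests in arrival order."""
        self._replacing = False
        results = []
        while self._pending:
            results.append(self._handler(self._pending.popleft()))
        return results


array = BufferingArray(handler=lambda request: f"done:{request}")
first = array.submit("read A")        # processed immediately
array.begin_replacement()
held = array.submit("write B")        # buffered, no error returned to the host
drained = array.end_replacement()     # processed after the new device is in place
```

From the host's point of view in this sketch, a request issued during the replacement is simply slow rather than failed, which is what keeps error recovery routines from being triggered.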
Another embodiment of the invention includes: signaling that the to-be-replaced storage device is about to be replaced; completing any request that is in the process of being fulfilled at the time of the signaling; after any request that was in the process of being fulfilled at the time of the signaling has been completed, initiating the buffering of the requests and the deferring of the processing of the requests; and signaling that it is safe to remove the to-be-replaced storage device from the array for replacement.
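The ordering of these steps can be sketched as a small state machine. The names `State`, `ReplacementCoordinator`, and the `notify` callback are hypothetical. One design choice here is an assumption rather than something the text requires: the sketch begins holding new arrivals as soon as the replacement signal is received, so that the set of in-flight requests is guaranteed to drain.

```python
from collections import deque
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    DRAINING = "draining"            # signal received, in-flight requests finishing
    SAFE_TO_REMOVE = "safe_to_remove"


class ReplacementCoordinator:
    def __init__(self, notify):
        self.state = State.NORMAL
        self.in_flight = 0           # requests currently being fulfilled
        self.buffered = deque()      # requests held for later processing
        self._notify = notify        # hypothetical "safe to remove" indicator

    def request_arrived(self, request):
        """Return True if the request may be dispatched now, False if buffered."""
        if self.state is State.NORMAL:
            self.in_flight += 1
            return True
        self.buffered.append(request)
        return False

    def request_completed(self):
        self.in_flight -= 1
        self._maybe_signal_safe()

    def signal_replacement(self):
        """Called when the operator indicates a device is about to be replaced."""
        self.state = State.DRAINING
        self._maybe_signal_safe()

    def _maybe_signal_safe(self):
        # Removal is declared safe only once nothing is still in flight.
        if self.state is State.DRAINING and self.in_flight == 0:
            self.state = State.SAFE_TO_REMOVE
            self._notify("safe to remove")

    def replacement_done(self):
        """Return to normal operation and hand back the buffered requests in order."""
        self.state = State.NORMAL
        released = list(self.buffered)
        self.buffered.clear()
        return released


notes = []
coordinator = ReplacementCoordinator(notify=notes.append)
coordinator.request_arrived("r1")      # dispatched normally
coordinator.signal_replacement()       # r1 still in flight, so not yet safe
coordinator.request_arrived("r2")      # held in the buffer
coordinator.request_completed()        # r1 finishes; the notifier fires
released = coordinator.replacement_done()
```

The caller would then resubmit each released request through `request_arrived` so that it is counted as in flight like any other request.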
The invention contemplates that the foregoing processes will be used in connection with some or all of the requests that are generated by the host computer to which the array is attached.
In a still further embodiment of the invention, a software driver is used to effectuate a portion of the process. In one embodiment, the software driver is loaded in the host computer. In another embodiment, the software driver is loaded in the storage array. The invention contemplates that the process be used in connection with storage arrays containing hard disk drives, as well as other types of storage devices.
Another embodiment of the invention includes a data storage and retrieval system. The system includes: an array of data storage devices serially connected to one another for receiving requests for the storage or retrieval of data from a computer; a buffer for buffering the requests from the computer; and a processor associated with the buffer and the array for causing the buffer to buffer the requests from the computer while one of the storage devices is being replaced, causing the processing of the requests to be deferred while they are being buffered, and causing the buffered requests to be processed after the storage device has been replaced.
The invention contemplates that the processor will buffer and defer some or all of the requests that are being received from the computer while the storage device is being replaced.
Another embodiment of the invention includes a replace device notifier for notifying when it is safe to remove the storage device from the array. In this embodiment, the processor further receives a signal that the storage device is about to be replaced, causes any request that is in the process of being fulfilled at the time of the signaling to be completed and, after any such request is completed, initiates the buffering of the requests by the buffer and the deferring of the processing of the requests, and causes the notifier to notify that it is safe to remove the storage device from the array.
In a still further embodiment of the invention, the processor includes a software driver, configured to be loaded in the computer and/or in the array.
The invention is also applicable to the replacement of any component in a data storage device while the device continues to receive requests for the storage or retrieval of data from a host computer to which the device is connected. While the to-be-replaced component is being replaced, the requests are buffered and their processing is deferred. The component is replaced while the requests are being buffered, and the buffered requests are processed after the to-be-replaced component has been replaced.
A still further embodiment of the invention includes a data storage and retrieval system that includes a data storage device for receiving requests for the storage or retrieval of data from a computer, and the storage device includes a component necessary for the operation of the device. It further includes a buffer for buffering the requests from the computer while the component is being replaced, and a processor associated with the buffer in the storage device for causing the buffered requests to be processed after the component has been replaced.
These as well as still further features, objects and benefits of the present invention will now become clear upon an examination of the attached drawings and the following description of the preferred embodiments.