Storage driver sub-systems are designed to store (write) user data to, and retrieve (read) user data from, permanent storage media. Driver sub-systems, however, cannot always complete input/output (I/O) operations. In some instances, the driver sub-system fails a requested I/O operation after a given number of unsuccessful attempts to read or write the data. Other systems, though, either cannot recognize a failure or choose to ignore failures during I/O operations. In these instances, the driver sub-system may retry the read or write operation indefinitely.
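The bounded-retry behavior described above can be illustrated with a minimal sketch. The names `IOFailure` and `io_with_bounded_retries` and the default of three attempts are hypothetical and chosen only for illustration; they do not describe any particular driver sub-system.

```python
class IOFailure(Exception):
    """Raised when an I/O operation still fails after all permitted attempts."""


def io_with_bounded_retries(operation, max_attempts=3):
    """Call `operation` until it succeeds or `max_attempts` is exhausted.

    `operation` is any zero-argument callable that performs one read or
    write attempt and raises OSError on failure (an assumption of this sketch).
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return operation()  # success: hand the result back immediately
        except OSError as err:
            last_error = err  # remember the most recent failure and retry
    # Every attempt failed: report the failure instead of retrying forever.
    raise IOFailure(f"gave up after {max_attempts} attempts") from last_error
```

A sub-system that ignores failures corresponds to running the same loop with no upper bound on the number of attempts, which is why the caller may never receive a result.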
To mitigate the impact of failed storage devices and failed I/O operations, layers have been added to the storage I/O stack to provide redundancy. One way to mitigate the effect of failed storage devices is to write copies of the data in multiple locations. In this way, if an I/O failure occurs on any given copy of the data, the data is still available in another location. This layer of the I/O stack is generally known as a Logical Storage Manager (LSM) or Logical Volume Manager (LVM). Hardware has also been developed to provide this redundancy. These storage devices are known as Redundant Array of Independent (or Inexpensive) Disks (RAID) controllers.
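The redundancy described above can be sketched as a small mirrored volume that writes every block to each copy and, on a read failure, falls back to the next copy. The classes `MemoryReplica` and `MirroredVolume` are hypothetical in-memory stand-ins used only for illustration; they are not the interface of any actual LSM/LVM or RAID controller.

```python
class MemoryReplica:
    """Hypothetical in-memory stand-in for one storage location."""

    def __init__(self):
        self.blocks = {}
        self.failed = False  # set to True to simulate a failed device

    def write(self, block, data):
        if self.failed:
            raise OSError("replica unavailable")
        self.blocks[block] = data

    def read(self, block):
        if self.failed:
            raise OSError("replica unavailable")
        return self.blocks[block]


class MirroredVolume:
    """Keeps a copy of each block on every replica."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def write(self, block, data):
        # Write the same data to every location.
        for replica in self.replicas:
            replica.write(block, data)

    def read(self, block):
        # Try each copy in turn; the first successful read wins.
        for replica in self.replicas:
            try:
                return replica.read(block)
            except OSError:
                continue  # this copy failed, try the next one
        raise OSError(f"all {len(self.replicas)} replicas failed for block {block}")


primary, secondary = MemoryReplica(), MemoryReplica()
volume = MirroredVolume([primary, secondary])
volume.write(7, b"payload")
primary.failed = True               # simulate a device failure
recovered = volume.read(7)          # served from the surviving copy
```

For simplicity, the sketch treats any write failure as fatal; a real redundancy layer would also have to track which copies are stale after a failed write.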
Control of an I/O operation, and of any associated error recovery, passes from layer to layer with the request to perform the operation. Currently, when an I/O operation is issued from the LSM/LVM layer to the underlying device driver, the device driver is in control of the I/O operation and of any error recovery or retries. The same is true when the device driver issues the I/O to the device itself (including a RAID controller); the device is then in control of the I/O and of any error recovery or retries.
In the case of Small Computer System Interface (SCSI) devices, a method exists to establish the error recovery/retry methodology of a SCSI storage device. This method is performed through the operating system via the SCSI Error Recovery Mode page. Many storage devices and operating systems, however, do not fully implement this method. In addition, if used on a per-I/O basis, this method would substantially degrade performance.
Applications desire that their I/O requests be completed as quickly as possible. When error recovery is invoked by the underlying hardware or software device driver, that recovery may unnecessarily delay the completion of the application's I/O operation. For example, in a mirrored configuration, when one unit begins to fail, switching to the alternate data set may be preferable to initiating error recovery on the failing unit. However, since layered protocol stacks generally prohibit boundary crossing, it is not desirable for the underlying device driver, or even possible for the hardware, to know when it should perform error recovery and when it should not.
In one solution, the underlying device driver offers timers to the higher layers. The LSM/LVM layer decides the amount of time each I/O operation is allowed and specifies this time to the lower layer (where the actual timing occurs). The device driver must then complete the I/O within that time: either the device driver returns the data successfully, or the I/O is terminated and a failure is returned. This solution, however, adds significant complexity to the device driver layers. The device driver must determine how long it takes to terminate an I/O operation (determining an exact value may not even be possible with some I/O protocols) and subtract that time from the time limit requested by the higher layer. For example, if the LSM/LVM layer requests a 20-second time limit but the device driver takes 5 seconds to terminate an I/O, then only 15 seconds can be used to perform the I/O. After 15 seconds, the I/O is terminated, so that the failing status can be returned within the specified 20-second limit.
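The time-budget arithmetic in this example can be written out as a short sketch. The function names `usable_io_time` and `complete_within_limit` are hypothetical, and the sketch assumes that the termination time is a known worst-case value, which, as noted above, may not be obtainable with some I/O protocols.

```python
def usable_io_time(requested_limit_s, termination_time_s):
    """Time left for the I/O itself once termination time is reserved."""
    if termination_time_s >= requested_limit_s:
        raise ValueError("termination time leaves no time to perform the I/O")
    return requested_limit_s - termination_time_s


def complete_within_limit(io_duration_s, requested_limit_s, termination_time_s):
    """Return (status, latest time in seconds at which the result is returned).

    `io_duration_s` is how long the I/O would take if left alone
    (a simulated value in this sketch).
    """
    budget = usable_io_time(requested_limit_s, termination_time_s)
    if io_duration_s <= budget:
        return ("success", io_duration_s)
    # The I/O is cut off when the budget expires; terminating it then takes
    # up to termination_time_s more, so the failure arrives within the limit.
    return ("failed", budget + termination_time_s)


budget = usable_io_time(20, 5)               # 20 - 5 = 15 seconds for the I/O
outcome = complete_within_limit(30, 20, 5)   # slow I/O: ("failed", 20)
```

The sketch makes the added driver burden visible: every request needs its own subtraction, and the result is only as trustworthy as the termination-time estimate.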