1. Field of the Invention
This invention relates in general to computer storage systems, and more particularly to a quiesce system storage device and method in a dual active controller with cache coherency using stripe locks for implied storage volume reservations.
2. Description of Related Art
Increasingly, there is a need to provide access to stored information or data on hard disk drives (or other storage devices) from a plurality of host servers and to also permit the data stored on any particular storage device to be accessed through alternative device controllers. Providing access to the data from multiple hosts eliminates the need to store the data at more than one location (though the data may still be redundantly stored using known mirroring or Redundant Array of Independent Disks (RAID) techniques) and in theory assures that the identical data can be accessed by interested parties. Providing access to a storage device through a plurality of controllers provides redundant access to the device from an alternate (or second) controller so that the data remains accessible in the event that the first controller fails.
Although providing access to storage devices through multiple controllers is desirable, such a configuration may present data consistency problems. Data consistency refers to all controllers providing visibility to one identical copy of the data. Data consistency can be provided through data synchronization or data coherency or both. Data coherency refers to maintaining a consistent copy of the data in each of the controllers' caches. Data synchronization refers to keeping the data in the storage controller's cache the same as that in the storage device.
Storage controllers direct data traffic from the host system to one or more non-volatile storage devices. A storage controller may or may not have an intermediary cache to stage data between the non-volatile storage device and the host system. A caching controller (or caching storage controller) is a device which is capable of directing the data traffic from a host system to one or more non-volatile storage devices and which uses an intermediary data storage device (the cache memory) to stage data between the non-volatile storage device and the host system. In general, the intermediary storage device is built out of RAM to allow a quicker access time to the data. Multiple active controllers are defined as a collection of storage controllers or caching storage controllers which work in a cooperative manner with each other.
Multiple active controllers provide the ability to recover from a controller failure by allowing multiple paths to a storage volume. The storage volume is a contiguous range of randomly accessible sectors of data. For practical purposes, the sector numbering starts at 0 and goes to N, where N is the total number of sectors available to the host system. A data extent is a range of data within a storage volume delineated by a starting sector and an ending sector. The storage volume is broken up into a number of data extents which are not required to be of equivalent sizes, but may not overlap. These concepts are used in the discussion of the background and the detailed description of embodiments of the invention, and apply to both.
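As an illustration only, the data extent concept described above can be sketched as follows. The names Extent, overlaps, and extents_are_disjoint are hypothetical and do not appear in this specification, and the sketch assumes that both the starting sector and the ending sector belong to the extent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Hypothetical data extent: a sector range within a storage volume."""
    start: int  # starting sector (assumed inclusive)
    end: int    # ending sector (assumed inclusive)

    def overlaps(self, other: "Extent") -> bool:
        # Two inclusive ranges overlap when each one starts
        # no later than the other one ends.
        return self.start <= other.end and other.start <= self.end


def extents_are_disjoint(extents):
    """Return True when no two extents share a sector."""
    ordered = sorted(extents, key=lambda e: e.start)
    # After sorting by start, it is enough to compare neighbours.
    return all(a.end < b.start for a, b in zip(ordered, ordered[1:]))
```

Extents of different sizes are accepted; only shared sectors are rejected, which matches the requirement that extents need not be equal in size but may not overlap.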
Caching storage controllers that work independently of one another to store information or data to a secondary storage unit, such as a hard disk drive, or tape unit, are conventionally available. There are also caching storage controllers that work with one or more other controller(s) to provide multiple controller access to a secondary storage unit and provide a fault tolerant environment. If two controllers are simultaneously providing access to a common set of storage devices and each is able to take over the other's functionality in the event of a failure, then those controllers are referred to as active-active or dual-active controllers.
Computer system configurations involving one or more host computers and having two or more controllers that use cache technologies, with access to a storage device through any of the controllers, should desirably provide some mechanism of ensuring that the cache data in each controller is always correct. Unfortunately in conventional systems they may not. Controllers using the SCSI command set could use two commands that are provided in that command set, the “Reserve LUN” and “Release LUN” commands, where LUN is an abbreviation for Logical Unit Number. (SCSI commands, including the Reserve LUN and Release LUN commands, are described in standard references including SCSI-2 Small Computer System Interface-2, ANSI X3.131:1994, which is incorporated herein by reference.) The host computer, especially one operating in a cluster environment, could use these two commands to reserve all accesses to the entire LUN.
Unfortunately, not all host computers use these SCSI commands. Furthermore, the Reserve LUN and Release LUN commands do not provide for reservation of a portion of a storage volume because they apply to the entire storage volume.
In addition, there are problems with data synchronization between controllers which maintain local cached copies of data. For example, one set of host transactions could cause data integrity problems (data consistency and data synchronization problems). The data consistency problem is brought about by the fact that each controller's cache operates independently.
One technique for overcoming the data consistency problems involves a storage volume reservation system and method as described in co-pending U.S. patent application Ser. No. 09/325,033, which is hereby incorporated by reference. The storage volume (or storage LUN) reservation system for active controllers allows data access through two or more separate caching controllers. The inventive structure and method maintain a “reservation table” (such as a LUN reservation table) that is always consistent on each of the plurality of controllers. This structure and method also provide the capability of explicitly reserving storage volumes using any current storage volume (or LUN) reserve commands, or implicitly using a write operation. The inventive structure and method also provide the capability of invalidating a controller's cache based on acquiring a new reservation.
The storage volume reservation system and method provide that each controller is not required to reserve a storage volume in order to perform an update to that storage volume. An explicit reservation may be made through the use of Storage Volume Reserve commands, while an implicit reservation is made whenever a write operation requires that the particular controller obtain a reservation. Implicit reservations may occur for example when an alternate controller already owns the reservation. The reservation may also be obtained implicitly when the controller is required to perform a read operation, and the alternate controller already owns the reservation. This reservation requirement is imposed in order to ensure that the alternate controller's cache contains no data (dirty data) that has not been synchronized with the storage volume drive.
The reservation process is synchronized between all of the controllers in the system in order to maintain reservation table coherency. All updates to the reservation table are propagated to the alternate controllers to maintain reservation table coherency. This procedure allows most reads and writes to be performed with minimal overhead. An I/O operation to a storage volume that is reserved by that controller only needs to check for that ownership before processing the I/O operation request.
The reservation process also benefits from cache flushing and cache invalidating in some circumstances to maintain data integrity. Obtaining a reservation to a storage volume or portion thereof that is not owned by any controller is straightforward and only requires an update to the reservation table, and its propagation to all alternate controllers. Obtaining a reservation to a storage volume or portion thereof currently owned by an alternate active controller further requires that the alternate controller flush and invalidate all cache data associated with that storage volume. Releasing a reservation is not required but may optionally be performed using a storage volume release command.
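The reservation handling described above can be sketched, purely for illustration, as follows. The Controller class and its methods are hypothetical and are not taken from the referenced application; the sketch reserves whole volumes only, models the shared storage device as a dictionary, and propagates table updates by direct assignment rather than by inter-controller messages.

```python
class Controller:
    """Hypothetical caching controller with a local reservation table copy."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk    # shared backing store: {(volume, sector): data}
        self.peers = []     # alternate controllers
        self.table = {}     # reservation table: volume -> owning controller
        self.cache = {}     # (volume, sector) -> [data, dirty flag]

    def flush_and_invalidate(self, volume):
        # Write dirty lines for the volume to disk, then drop every line.
        for key in [k for k in self.cache if k[0] == volume]:
            data, dirty = self.cache.pop(key)
            if dirty:
                self.disk[key] = data

    def reserve(self, volume):
        owner = self.table.get(volume)
        if owner == self.name:
            return  # already owned: only the ownership check is needed
        if owner is not None:
            # Owned by an alternate controller: it must flush and invalidate.
            for peer in self.peers:
                if peer.name == owner:
                    peer.flush_and_invalidate(volume)
        # Update the table and propagate the update to all alternates.
        self.table[volume] = self.name
        for peer in self.peers:
            peer.table[volume] = self.name

    def write(self, volume, sector, data):
        self.reserve(volume)  # implicit reservation on write
        self.cache[(volume, sector)] = [data, True]

    def read(self, volume, sector):
        self.reserve(volume)  # implicit reservation if an alternate owns it
        key = (volume, sector)
        if key not in self.cache:
            self.cache[key] = [self.disk.get(key), False]
        return self.cache[key][0]
```

In this sketch a read on the second controller forces the first controller to flush its dirty data before the read is served from disk, so the reader never sees stale data.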
Sometimes it is necessary to stop all processes associated with a system storage device. A system storage device quiesce is the process of stopping all activity for a given system storage device and releasing all resources used by that system storage device. For example, a storage device may need to be quiesced when “timeout” or “disk not responding” errors are returned due to overheating or failure of the disk.
To perform a quiesce on both controllers of a dual active pair while maintaining data integrity, and without requiring the operator to start the process, all IO from the host to the system storage device must be stopped. The controller will return busy to all IO requests for the system storage device. Next, a check is performed to determine that no rebuilds, parity checks, or initializations are active for the system storage device. If any are active, the system storage device cannot be quiesced. The firmware must then wait for all active IO to complete for this system storage device. After all active IO has completed, all dirty cache lines for this system storage device must be flushed to the disk. Next, all data in the cache for this system storage device is invalidated. Data in the mirror cache for this system storage device is also invalidated. Finally, the firmware must wait for the other controller of the dual active pair to finish these activities. All other system storage devices are allowed to continue to process IO.
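The ordering of the quiesce steps described above can be sketched as follows. This is a simplified, single-threaded illustration; DeviceState, quiesce_on_controller, and quiesce_pair are hypothetical names, outstanding IO is modeled as a list of callables, and the sketch makes no assumption about what happens to host IO when the quiesce is refused.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceState:
    """Hypothetical per-controller view of one system storage device."""
    accepting_io: bool = True
    background_ops: set = field(default_factory=set)  # e.g. {"rebuild"}
    active_io: list = field(default_factory=list)     # callables finishing an IO
    cache: dict = field(default_factory=dict)         # sector -> (data, dirty)
    mirror_cache: dict = field(default_factory=dict)


def quiesce_on_controller(state, disk):
    """Run the quiesce steps on one controller; return False if refused."""
    state.accepting_io = False      # 1. host IO now receives "busy"
    if state.background_ops:        # 2. rebuild, parity check, or init active
        return False                #    the device cannot be quiesced
    while state.active_io:          # 3. wait for active IO to complete
        state.active_io.pop()()
    for sector, (data, dirty) in state.cache.items():
        if dirty:                   # 4. flush dirty cache lines to disk
            disk[sector] = data
    state.cache.clear()             # 5. invalidate cached data
    state.mirror_cache.clear()      # 6. invalidate mirror cache data
    return True


def quiesce_pair(state_a, state_b, disk):
    """7. The quiesce is complete only when both controllers have finished."""
    results = [quiesce_on_controller(s, disk) for s in (state_a, state_b)]
    return all(results)
```

Each DeviceState covers a single system storage device, so other devices keep their own state and continue to process IO.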
However, this process presents a problem in determining how to find and flush dirty cache data belonging to a specific system storage device. Up to now, all of the cache is scanned to look for any line that is associated with the given system storage device, and the identified lines are scheduled to be flushed. Then, the quiesce mechanism must scan again to see if the flush has completed. This process is continued until all dirty cache data belonging to the storage device has been flushed to disk.
This approach requires a large amount of time to scan a large cache because there may be thousands of cache lines. A quiesce of a system storage device must be able to be initiated and finished during vendor unique command processing.
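To make the cost of the scan-based approach concrete, the following sketch counts how many cache lines are examined. The function scan_flush and the dictionary layout of the cache are hypothetical, and flushes are assumed to complete immediately; in a real controller flushes are asynchronous, which may require further rescans.

```python
def scan_flush(cache, device_id, disk):
    """Repeat full-cache scans until no dirty line for device_id remains.

    cache maps (device_id, sector) -> {"data": ..., "dirty": bool}.
    Returns the total number of cache lines examined.
    """
    examined = 0
    while True:
        scheduled = []
        for key, line in cache.items():
            examined += 1  # every line is visited, whichever device owns it
            if key[0] == device_id and line["dirty"]:
                scheduled.append(key)
        if not scheduled:
            return examined  # a final scan confirms nothing is left
        for key in scheduled:
            disk[key] = cache[key]["data"]
            cache[key]["dirty"] = False
```

Even in this best case, a cache holding 1003 lines of which only 3 belong to the device being quiesced is scanned twice in full, so 2006 lines are examined to flush 3.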
It can be seen that there is a need for a quiesce system storage device and method in a dual active controller with cache coherency using stripe locks for implied storage volume reservations.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a quiesce system storage device and method in a dual active controller with cache coherency using stripe locks for implied storage volume reservations.
The present invention solves the above-described problems by using a stripe lock mechanism to find and flush cache data associated with the given system storage device. Dirty cache lines are associated with a stripe lock, and that stripe lock will be in the active state or the clearing state.
A method in accordance with the principles of the present invention includes associating dirty cache lines with a stripe lock, the stripe lock representing cache lines within data extents that are part of a system storage device associated with the dirty cache lines, maintaining the stripe locks on a linked list for the system storage device, setting stripe locks for the system storage device to be quiesced to a clearing state and flushing cache lines associated with the system storage device to be quiesced that are set for clearing.
Other embodiments of a method in accordance with the principles of the invention may include alternative or optional additional aspects. One such aspect of the present invention is that the method further includes determining whether cache lines associated with the system storage device have been flushed by analyzing whether the stripe lock linked list is empty, halting the system storage device when cache lines associated with the system storage device have been flushed and examining the stripe lock linked list and continuing to flush cache lines set for clearing until the stripe lock linked list is empty.
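A minimal sketch of the stripe lock based flush summarized above follows. The names StripeLock, SystemDrive, and quiesce_flush are hypothetical, a deque stands in for the per-device linked list, and the sketch assumes that a stripe lock is removed from the list once its cache lines have been flushed.

```python
from collections import deque

ACTIVE, CLEARING = "active", "clearing"


class StripeLock:
    """Hypothetical stripe lock covering the dirty lines of one data extent."""

    def __init__(self, extent):
        self.extent = extent    # (starting sector, ending sector)
        self.state = ACTIVE
        self.dirty_lines = {}   # sector -> data


class SystemDrive:
    """Hypothetical system storage device with its own stripe lock list."""

    def __init__(self):
        self.stripe_locks = deque()  # stands in for the linked list
        self.disk = {}
        self.halted = False


def quiesce_flush(drive):
    # Set every stripe lock of the device being quiesced to the clearing state.
    for lock in drive.stripe_locks:
        lock.state = CLEARING
    # Flush lines set for clearing until the list is empty.
    while drive.stripe_locks:
        lock = drive.stripe_locks[0]
        if lock.state == CLEARING:
            drive.disk.update(lock.dirty_lines)
            lock.dirty_lines.clear()
        drive.stripe_locks.popleft()  # assumed: lock released after its flush
    # An empty list means all associated cache lines have been flushed.
    drive.halted = True
```

Only the stripe locks on the device's own list are visited, so the work grows with the number of locks held for that device rather than with the total number of cache lines.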
In another embodiment of the present invention, a method for quiescing a system storage device is disclosed. The method for quiescing a system storage device includes stopping IO from a host to a system storage device that is to be quiesced, checking the system storage device to determine that no rebuilds, parity checks, or initializations are active, waiting for active IO to complete for the system storage device to be quiesced, after active IO has completed for the system storage device to be quiesced, flushing dirty cache lines to the system storage device, invalidating data in the cache for the system storage device to be quiesced, invalidating data in the mirror cache for the system storage device to be quiesced and repeating the above steps for controllers, wherein the flushing of cache lines associated with a storage device to be quiesced further includes associating dirty cache lines with a stripe lock, the stripe lock representing cache lines within data extents that are part of a system storage device associated with the dirty cache lines, maintaining the stripe locks on a linked list for the system storage device, setting stripe locks for the system storage device to be quiesced to a clearing state and flushing cache lines associated with the system storage device to be quiesced that are set for clearing.
Another aspect of the present invention is that the method further includes continuing to process IO for other system storage devices.
Another aspect of the present invention is that the method further includes determining whether cache lines associated with the system storage device have been flushed by analyzing whether the stripe lock linked list is empty, halting the system storage device when cache lines associated with the system storage device have been flushed and examining the stripe lock linked list and continuing to flush cache lines set for clearing until the stripe lock linked list is empty.
In another embodiment of the present invention, an article of manufacture comprising a program storage medium readable by a computer tangibly embodying one or more programs of instructions executable by the computer is disclosed. The programs perform a method for flushing cache lines associated with a storage device to be quiesced, wherein the method includes associating dirty cache lines with a stripe lock, the stripe lock representing cache lines within data extents that are part of a system storage device associated with the dirty cache lines, maintaining the stripe locks on a linked list for the system storage device, setting stripe locks for the system storage device to be quiesced to a clearing state and flushing cache lines associated with the system storage device to be quiesced that are set for clearing.
Another aspect of the present invention is that the article of manufacture further includes determining whether cache lines associated with the system storage device have been flushed by analyzing whether the stripe lock linked list is empty, halting the system storage device when cache lines associated with the system storage device have been flushed and examining the stripe lock linked list and continuing to flush cache lines set for clearing until the stripe lock linked list is empty.
In another embodiment of the present invention, a storage system is disclosed. The storage system includes a plurality of caching storage controllers and a plurality of storage devices coupled to the plurality of storage controllers, wherein the caching storage controllers are configured to flush cache lines within the storage controllers that are associated with a storage device to be quiesced, the caching storage controllers associating dirty cache lines with a stripe lock within the memory of the caching storage controllers, the stripe lock representing cache lines within data extents of a system storage device associated with the dirty cache lines, maintaining the stripe locks on a linked list for the system storage device, setting stripe locks for the system storage device to be quiesced to a clearing state and flushing cache lines set for clearing within the caching storage controllers that are associated with the system storage device to be quiesced.
Another aspect of the storage system of the present invention is that the storage controllers determine whether cache lines within the caching storage controllers associated with the system storage device have been flushed by analyzing whether the stripe lock linked list is empty, halt the system storage device when cache lines associated with the system storage device have been flushed and examine the stripe lock linked list and continue to flush cache lines set for clearing until the stripe lock linked list is empty.
Another aspect of the storage system of the present invention is that the memory of the storage controllers further comprises a plurality of stripe lock records.
Another aspect of the storage system of the present invention is that the stripe lock records comprise an ownership field, a lock type field, an extent field, a use count field, a linked list of pending locks, a linked list of input/output processes (iop)s, a state field, and a lock wait count field.
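For illustration, the stripe lock record fields listed above could be laid out as in the following sketch. Only the field list comes from this description; the class name StripeLockRecord, the field types, and the default values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StripeLockRecord:
    """Hypothetical layout of one stripe lock record (types are assumed)."""
    owner: Optional[str] = None    # ownership field: controller holding the lock
    lock_type: str = "none"        # lock type field
    extent: tuple = (0, 0)         # extent field: (starting, ending sector)
    use_count: int = 0             # use count field
    pending_locks: list = field(default_factory=list)  # list of pending locks
    iops: list = field(default_factory=list)  # list of input/output processes
    state: str = "active"          # state field, e.g. "active" or "clearing"
    lock_wait_count: int = 0       # lock wait count field
```

Python lists stand in for the two linked lists here; default_factory gives each record its own pending lock and iop lists.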
Another aspect of the storage system of the present invention is that stripe lock records are maintained in storage separate from the controllers.
Another aspect of the storage system of the present invention is that the memory of each controller further comprises a storage volume ownership transaction data structure for tracking transactions between the controllers.
Another aspect of the storage system of the present invention is that the memory of each controller further comprises a lock request data structure for establishing a lock on a storage volume.
In another embodiment of the present invention, a storage controller is disclosed. The storage controller includes an input/output interface for permitting communication with a host computer and a mass storage system, a cache having a number of cache lines, some of which cache lines may include dirty data, a memory for maintaining a stripe lock and a processor, coupled to the memory and cache, the processor associating dirty cache lines with the stripe lock maintained within the memory, the stripe lock representing cache lines within data extents of a system storage device associated with the dirty cache lines, the processor maintaining the stripe locks on a linked list for the system storage device in the memory, setting stripe locks for the system storage device to be quiesced to a clearing state and flushing cache lines set for clearing that are associated with the system storage device to be quiesced.
Another aspect of the storage controller of the present invention is that the processor determines whether cache lines associated with the system storage device have been flushed by analyzing whether the stripe lock linked list is empty, halts the system storage device when cache lines associated with the system storage device have been flushed and examines the stripe lock linked list and continues to flush cache lines set for clearing until the stripe lock linked list is empty.
Another aspect of the storage controller of the present invention is that the memory further comprises a plurality of stripe lock records.
Another aspect of the storage controller of the present invention is that the stripe lock records comprise an ownership field, a lock type field, an extent field, a use count field, a linked list of pending locks, a linked list of input/output processes (iop)s, a state field, and a lock wait count field.
Another aspect of the storage controller of the present invention is that stripe lock records are maintained in storage separate from the storage controller.
Another aspect of the storage controller of the present invention is that the memory further comprises a storage volume ownership transaction data structure for tracking transactions with other storage controllers.
Another aspect of the storage controller of the present invention is that the memory further comprises a lock request data structure for establishing a lock on a storage volume.
Another embodiment of a storage controller includes means for communicating with a host computer and a mass storage system, first memory means having a number of cache lines, some of which cache lines may include dirty data, second memory means for maintaining a stripe lock and means, coupled to the memory and cache, for associating dirty cache lines with the stripe lock maintained within the memory, the stripe lock representing cache lines within data extents of a system storage device associated with the dirty cache lines, for maintaining the stripe locks on a linked list for the system storage device in the memory, for setting stripe locks for the system storage device to be quiesced to a clearing state and for flushing cache lines set for clearing that are associated with the system storage device to be quiesced.
These and various other advantages and features of novelty which characterize the invention are pointed out with particularity in the claims annexed hereto and form a part hereof. However, for a better understanding of the invention, its advantages, and the objects obtained by its use, reference should be made to the drawings which form a further part hereof, and to accompanying descriptive matter, in which there are illustrated and described specific examples of an apparatus in accordance with the invention.