1. Field of the Invention
The invention relates generally to clustered storage systems and more specifically relates to methods and structure for task management among a plurality of storage controllers in a clustered storage system.
2. Related Patents
This patent application is related to the following commonly owned United States patent applications, all filed on the same date herewith and all of which are herein incorporated by reference:
U.S. patent application Ser. No. 13/432,213, entitled METHODS AND STRUCTURE FOR IMPROVED PROCESSING OF I/O REQUESTS IN FAST PATH CIRCUITS OF A STORAGE CONTROLLER IN A CLUSTERED STORAGE SYSTEM;
U.S. patent application Ser. No. 13/432,223, entitled METHODS AND STRUCTURE FOR LOAD BALANCING OF BACKGROUND TASKS BETWEEN STORAGE CONTROLLERS IN A CLUSTERED STORAGE ENVIRONMENT;
U.S. patent application Ser. No. 13/432,225, entitled METHODS AND STRUCTURE FOR TRANSFERRING OWNERSHIP OF A LOGICAL VOLUME BY TRANSFER OF NATIVE-FORMAT METADATA IN A CLUSTERED STORAGE ENVIRONMENT;
U.S. patent application Ser. No. 13/432,232, entitled METHODS AND STRUCTURE FOR IMPLEMENTING LOGICAL DEVICE CONSISTENCY IN A CLUSTERED STORAGE SYSTEM;
U.S. patent application Ser. No. 13/432,238, entitled METHODS AND STRUCTURE FOR IMPROVED I/O SHIPPING IN A CLUSTERED STORAGE SYSTEM;
U.S. patent application Ser. No. 13/432,220, entitled METHODS AND STRUCTURE FOR MANAGING VISIBILITY OF DEVICES IN A CLUSTERED STORAGE SYSTEM;
U.S. patent application Ser. No. 13/432,150, entitled METHODS AND STRUCTURE FOR IMPROVED BUFFER ALLOCATION IN A STORAGE CONTROLLER; and
U.S. patent application Ser. No. 13/432,138, entitled METHODS AND STRUCTURE FOR RESUMING BACKGROUND TASKS IN A CLUSTERED STORAGE ENVIRONMENT.
3. Discussion of Related Art
In the field of data storage, customers demand highly resilient data storage systems that also exhibit fast recovery times for stored data. One type of storage system used to provide both of these characteristics is known as a clustered storage system.
A clustered storage system typically comprises a number of storage controllers, wherein each storage controller processes host Input/Output (I/O) requests directed to one or more logical volumes. The logical volumes reside on portions of one or more storage devices (e.g., hard disks) coupled with the storage controllers. Often, the logical volumes are configured as Redundant Array of Independent Disks (RAID) volumes in order to ensure an enhanced level of data integrity and/or performance.
A notable feature of clustered storage environments is that the storage controllers are capable of coordinating processing of host requests (e.g., by shipping I/O processing between each other) in order to enhance the performance of the storage environment. This includes intentionally transferring ownership of a logical volume from one storage controller to another. For example, a first storage controller may detect that it is currently undergoing a heavy processing load, and may assign ownership of a given logical volume to a second storage controller that has a smaller processing burden in order to increase overall speed of the clustered storage system. Other storage controllers may then update information identifying which storage controller presently owns each logical volume. Thus, when an I/O request is received at a storage controller that does not own the logical volume identified in the request, the storage controller may “ship” the request to the storage controller that presently owns the identified logical volume.
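The ownership lookup and shipping behavior described above can be illustrated with a minimal sketch. It is not taken from any actual controller firmware; the names `Controller`, `IORequest`, `build_cluster`, `transfer_ownership`, and `handle` are hypothetical, and the sketch assumes each controller keeps its own copy of a volume-to-owner map that is updated whenever ownership is transferred.

```python
from dataclasses import dataclass, field


@dataclass
class IORequest:
    volume_id: str  # logical volume identified in the request
    op: str         # e.g., "read" or "write"


@dataclass
class Controller:
    """Hypothetical storage controller with a local ownership map."""
    name: str
    ownership: dict = field(default_factory=dict)  # volume_id -> owner name
    peers: dict = field(default_factory=dict)      # name -> Controller

    def transfer_ownership(self, volume_id, new_owner):
        # Only the present owner may hand a volume to another controller.
        if self.ownership.get(volume_id) != self.name:
            raise ValueError(f"{self.name} does not own {volume_id}")
        self.ownership[volume_id] = new_owner
        # Other controllers update their record of the present owner.
        for peer in self.peers.values():
            peer.ownership[volume_id] = new_owner

    def handle(self, req):
        owner = self.ownership[req.volume_id]
        if owner == self.name:
            return f"{self.name} processed {req.op} on {req.volume_id}"
        # Not the owner: "ship" the request to the owning controller.
        return self.peers[owner].handle(req)


def build_cluster(names, initial_owner, volumes):
    """Create controllers that all agree on the initial owner (assumption)."""
    ctrls = {n: Controller(n) for n in names}
    for c in ctrls.values():
        c.peers = {n: p for n, p in ctrls.items() if n != c.name}
        c.ownership = {v: initial_owner for v in volumes}
    return ctrls
```

In this sketch, a request for a volume received at a non-owning controller is forwarded to the owner, and after `transfer_ownership` the same request is serviced by the new owner regardless of which controller receives it.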
FIG. 1 is a block diagram illustrating an example of a prior art clustered storage system 150. Clustered storage system 150 is indicated by the dashed box, and includes storage controllers 120, switched fabric 130, and logical volumes 140. Note that a “clustered storage system” (as used herein) does not necessarily include host systems and associated functionality (e.g., hosts, application-layer services, operating systems, clustered computing nodes, etc.). However, storage controllers 120 and hosts 110 may be tightly integrated physically. For example, storage controllers 120 may comprise Host Bus Adapters (HBAs) coupled with a corresponding host 110 through a peripheral bus structure of host 110. According to FIG. 1, hosts 110 provide I/O requests to storage controllers 120 of clustered storage system 150. Storage controllers 120 are coupled via switched fabric 130 (e.g., a Serial Attached SCSI (SAS) fabric or any other suitable communication medium and protocol) for communication with each other and with a number of storage devices 142 on which logical volumes 140 are stored.
FIG. 2 is a block diagram illustrating another example of a prior art clustered storage system 250. In this example, clustered storage system 250 processes I/O requests from hosts 210 received via switched fabric 230. Storage controllers 220 are coupled for communication with storage devices 242 via switched fabric 235, which may be integral with or distinct from switched fabric 230. Storage devices 242 implement logical volumes 240. Many other configurations of hosts, storage controllers, switched fabric, and logical volumes are possible for clustered storage systems as a matter of design choice. Further, in many high-reliability storage systems, all the depicted couplings may be duplicated for redundancy. Additionally, the interconnect fabrics may be duplicated for redundancy.
While clustered storage systems provide a number of performance benefits over more traditional storage systems, the speed of a storage system typically remains a bottleneck to the overall speed of a processing system utilizing the storage system.
In such a clustered storage system, shipping of I/O requests from one controller to another presents numerous coordination and synchronization issues. For example, the controller that initially receives an I/O request (that will be shipped) acts in the role of a target device (e.g., in a SCSI protocol transfer) in receiving the request from an attached host system but acts in the role of an initiator device when shipping the request to another storage controller (the target device of the shipped request). Both the initiator and target storage controllers may utilize various portions (e.g., “layers”) of their respective control logic in processing such a shipped I/O request. For example, a logical or physical device management layer may be utilized within the target storage controller to process the received, shipped I/O request. A “lower” layer for protocol management may be utilized within the target storage controller in communicating with the storage devices to be accessed. This protocol layer (e.g., another instance of the protocol layer) may also be utilized in communicating with the initiator storage controller to exchange data associated with a shipped request between the target controller and the host system. In like manner, similar layers of the initiator storage controller will be utilized in conjunction with the target controller and the host system. For example, protocol layers of the initiator controller may be involved in the exchange of data associated with the shipped request between the target storage controller and the host system. Or, for example, other layers of the initiator controller may await completion information from the target controller to report the completion status back to the requesting host system.
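The dual roles described above can be summarized in a short sketch. The function `trace_shipped_request` and the `Role` enumeration are hypothetical names introduced only for illustration; the sketch simply lists, hop by hop, which role each controller plays for a single shipped request, following the terminology of the preceding paragraph.

```python
from enum import Enum


class Role(Enum):
    TARGET = "target"
    INITIATOR = "initiator"


def trace_shipped_request(receiving, owning):
    """Return (controller, role, counterpart) hops for one shipped request.

    `receiving` is the controller that first receives the host request (the
    "initiator storage controller" of the shipped request); `owning` is the
    controller that owns the logical volume (the "target storage controller").
    """
    return [
        # The receiving controller is a target with respect to the host.
        (receiving, Role.TARGET, "host"),
        # The same controller is an initiator when it ships the request.
        (receiving, Role.INITIATOR, owning),
        # The owning controller is the target of the shipped request.
        (owning, Role.TARGET, receiving),
    ]
```

For example, `trace_shipped_request("ctrl0", "ctrl1")` shows that `ctrl0` appears twice with different roles, which is one source of the coordination issues discussed here.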
The initiator and target storage controllers generally operate independently of one another. Further, the various processing layers within each storage controller may also operate largely independently of one another, as various requests may be in process within a controller at any given time. Thus, a number of coordination and synchronization issues arise in such a context.
One particularly vexing synchronization problem arises in the context of aborting one or more such shipped I/O requests. If a shipped request is to be aborted (for any of a variety of reasons), it is difficult to synchronize potentially several layers of processing in both the initiator and target storage controllers to properly abort the I/O request and to release resources utilized by both storage controllers in processing the aborted, shipped I/O request.
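The difficulty can be made concrete with a small sketch of the bookkeeping involved. This illustrates the problem only and is not a description of any particular abort mechanism; the `Layer` class, the `leaked` helper, and the layer names are hypothetical, and the sketch assumes each layer tracks the resources it holds by a request tag.

```python
from collections import defaultdict


class Layer:
    """One processing layer holding per-request resources (hypothetical)."""

    def __init__(self, controller, name):
        self.controller = controller
        self.name = name
        self.held = defaultdict(list)  # request tag -> resources held

    def acquire(self, tag, resource):
        self.held[tag].append(resource)

    def abort(self, tag):
        # Release and return everything this layer holds for the request.
        return self.held.pop(tag, [])


def leaked(layers, tag):
    """List the (controller, layer) pairs still holding resources for tag."""
    return [(l.controller, l.name) for l in layers if l.held.get(tag)]


# A shipped request touches layers in both controllers.
layers = [
    Layer("initiator", "protocol"),
    Layer("initiator", "completion"),
    Layer("target", "device management"),
    Layer("target", "protocol"),
]
for layer in layers:
    layer.acquire(7, "buffer")

# Aborting only within the initiator controller leaves target resources held.
for layer in layers:
    if layer.controller == "initiator":
        layer.abort(7)
```

After the partial abort, `leaked(layers, 7)` reports the two target-side layers, which still hold buffers for a request that no longer exists from the initiator's point of view.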
Thus, it is an ongoing challenge to manage the aborting of I/O requests in the context of a clustered storage system where I/O requests may be shipped among the various storage controllers of the clustered storage system.