1. Field of the Invention
The invention relates generally to storage and computer system interconnections and more specifically relates to methods and structures to support SCSI-3 Persistent Reservations in a multiple-path storage/computing cluster.
2. Related Patents
This patent is related to commonly owned patent application Ser. No. 10/635,887 filed on Aug. 6, 2003 and entitled METHODS AND STRUCTURE FOR SCSI2 TO SCSI3 RESERVATION PROTOCOL MAPPING which is hereby incorporated by reference. This patent is also related to commonly owned patent application Ser. No. 09/960,827 filed on Sep. 21, 2001 and entitled METHOD AND APPARATUS FOR PROVIDING HIGHLY-TRANSPARENT, HOST-BASED MULTI-PATHING SUPPORT which is hereby incorporated by reference.
3. Discussion of Related Art
It is generally known in the art of high availability and mission critical computing that data loss or even temporarily unavailable data is unacceptable. In such critical environments, a computer host system may utilize multiple host bus adapters to access data from storage devices that, in turn, provide high availability performance through use of redundant controllers each receiving data through a corresponding interface adapter. In such high availability and high performance environments, the goal is to eliminate any single point of failure, so that data previously stored in the storage array remains accessible in case of hardware component, connection, or cable failures.
A host system in FIG. 1 can access the same data volume of the storage array via multiple data paths. SCSI (small computer system interface) is a standard that has been widely applied to interconnections between host systems and storage devices. SCSI-3 is the common version of the SCSI standards presently utilized in such connections. The SCSI standards define a command and status exchange protocol as well as some of the many transport media over which such protocol exchanges may be conducted. Those of ordinary skill in the art are well versed in the SCSI standards including the SCSI-3 standards. In particular, the SCSI-3 SBC and SBC-2 standards define standard block commands typically supported by any storage device adhering to the SCSI-3 standards. Further, published copies of the various SCSI standards are readily available to the public including draft versions freely available at, for example, the “www.t10.org” web site.
Each data path in FIG. 1 corresponds to an I_T nexus, in which I represents the initiator port of the host system and T represents the target port of the storage array. An initiator port, in general, initiates an exchange in the SCSI protocols. The exchange is addressed to a particular target device that responds to the initiator. Some devices can serve both as a target and as an initiator in different exchanges.
To the host system, a volume in the storage array appears as multiple storage devices (disks), each device corresponding to an I_T nexus usable to reach the storage device. This complicates access to storage devices by a host application. Since a volume (also frequently referred to as a logical unit or LUN) of a storage array appears as multiple storage devices (or device nodes), a host must somehow determine which of the possible multiple paths to use. For example, in FIG. 2, a host system may include multiple host bus adapters (HBAs) operating as initiators such as IA, IB, IC and ID. Each of the initiator HBAs of the host is coupled through a SAN fabric to a storage device. An exemplary storage device may itself have multiple HBAs operating as target devices (TA, TB, TC and TD). Therefore, in the exemplary configuration of FIG. 2, a system may be confronted with sixteen possible paths between the host system and the storage device, each represented by an I_T nexus as follows: IA_TA, IA_TB, IA_TC, IA_TD, IB_TA, IB_TB, IB_TC, IB_TD, IC_TA, IC_TB, IC_TC, IC_TD, ID_TA, ID_TB, ID_TC, and ID_TD. The host system must therefore select among these multiple logical paths and coordinate exchanges with the storage device via the multiple possible paths.
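The count of sixteen paths follows directly from pairing each of the four initiator ports with each of the four target ports. A minimal illustrative sketch in Python, using only the port labels given for FIG. 2 (the variable names are chosen here for illustration and do not come from any driver):

```python
from itertools import product

initiators = ["IA", "IB", "IC", "ID"]  # host-side HBA ports named for FIG. 2
targets = ["TA", "TB", "TC", "TD"]     # storage-side ports named for FIG. 2

# Each initiator/target pairing is one I_T nexus, i.e. one physical path.
nexuses = [f"{i}_{t}" for i, t in product(initiators, targets)]

print(len(nexuses))
print(", ".join(nexuses))
```

The list is produced in the same order as the enumeration in the text, from IA_TA through ID_TD.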
A multiple-path driver element is generally known to address this problem of selecting from among multiple paths to a storage device. A multiple-path driver, such as LSI Logic's Multiple-path-Proxy (MPP) driver architecture, creates a virtual storage device (or virtual data path) for each storage volume and hides the physical storage devices (or data paths) from the operating system and application programs. The physical data access paths are only visible to the MPP driver and the rest of the operating system and applications see only the virtual data paths. Details of the MPP driver architecture are discussed in LSI Logic's above identified, commonly owned, co-pending patent applications.
When an I/O request is directed to a virtual device defined by the MPP, the MPP driver selects a physical path to the storage device and routes the I/O request to the selected physical path to complete the I/O request. If a selected physical data path fails or has previously failed, the MPP driver may re-route I/O requests through any of the other available physical data paths to the volume.
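The routing and re-routing behavior described above can be pictured with a small sketch. This is not the MPP driver's actual implementation; `VirtualDevice`, `PathFailed`, and the `send` callback are hypothetical names introduced only to illustrate the idea of skipping failed paths and retrying on the next available one.

```python
class PathFailed(Exception):
    """Raised by the hypothetical transport when a physical path is down."""


class VirtualDevice:
    """Illustrative stand-in for a virtual device exposed by a multipath driver."""

    def __init__(self, paths, send):
        self.paths = list(paths)   # physical I_T nexuses behind this device
        self.failed = set()        # paths already known to have failed
        self.send = send           # hypothetical callable(path, request)

    def submit(self, request):
        for path in self.paths:
            if path in self.failed:
                continue           # skip paths that previously failed
            try:
                return self.send(path, request)
            except PathFailed:
                self.failed.add(path)  # remember the failure, try the next path
        raise OSError("no usable physical path to the volume")


def demo_send(path, request):
    # Assumption for the demo: the IA_TA path is broken, all others work.
    if path == "IA_TA":
        raise PathFailed(path)
    return (path, request)


dev = VirtualDevice(["IA_TA", "IB_TB"], demo_send)
result = dev.submit("READ block 0")
print(result)
```

The caller sees only `dev.submit`; which physical path carried the request, and the fact that one path failed along the way, stays inside the virtual device.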
The LSI Logic MPP driver's logical operation and relation to other elements of an application program and operating system (OS) is generally illustrated in FIG. 3. The MPP driver may be statically or dynamically linked with the operating system. Once so configured, all physical paths to storage array devices are hidden from the software system elements that are above the MPP driver layer as shown in FIG. 3. In particular, the OS and host application layers manipulate only virtual devices that are defined by the MPP driver as illustrated in FIG. 3.
In a multiple-path environment, the MPP proxy driver addresses the problem of single-point-of-failure for accessing storage devices. If one physical path from a host system to a storage device fails, another physical path between the system and the storage device may be selected. Further, the MPP proxy driver may redefine or re-map the virtual paths to other physical paths to thereby hide the failure from the higher layers of the OS and application programs.
Use of multiple paths between a host system and a storage device therefore improves both performance and reliability. Performance is improved in that the host may perform I/O load balancing by distributing the I/O requests through multiple physical paths. The operations may therefore be completed using parallel operation of multiple paths. Reliability is improved in that if a physical path fails, the host system (i.e., the MPP driver) may use other physical paths to access data.
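As one illustration of the load balancing mentioned above, requests can be spread over the physical paths in round-robin order. The text does not say which policy the MPP driver uses; round-robin is assumed here only because it is the simplest policy to show, and the request names are invented for the example.

```python
from collections import Counter
from itertools import cycle

paths = ["IA_TA", "IB_TB", "IC_TC", "ID_TD"]  # four of the FIG. 2 nexuses
chooser = cycle(paths)                         # assumed round-robin policy

# Assign eight hypothetical requests to paths in turn.
assignments = [(f"req{n}", next(chooser)) for n in range(8)]

# Count how many requests each physical path received.
load = Counter(path for _, path in assignments)
print(load)
```

With eight requests and four paths, each path carries two requests, so the work can proceed on the paths in parallel.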
In such “clustered” computing environments wherein one or more host systems each utilize multiple physical paths coupled to a shared storage device, it is important to coordinate the access to the storage device to assure mutual exclusivity of access to the storage device when needed. Without such coordination, data may be corrupted and/or lost by I/O operations forwarded to the storage device on different physical paths. One common approach to protect the shared data is to use the SCSI-3 Persistent Reservation commands specified in the SCSI-3 specifications (SPC-3). This technique is sometimes referred to as “I/O fencing using SCSI-3 Persistent Reservation”. SCSI-3 Persistent Reservation commands provide for managing/controlling data access to shared storage devices. SCSI-3 Persistent Reservation commands allow a storage device to execute commands from a selected set of I_T nexuses (i.e., a defined set of associated initiator ports and target ports) and reject commands from I_T nexuses outside the selected set. The host systems and cooperating storage devices uniquely identify I_T nexuses using protocol specific mechanisms and parameters.
Application clients (i.e., application programs or the operating system) may add or remove I_T nexuses from the selected set using Persistent Reservation commands. The application clients must all cooperate in use of the reservation protocols and commands. If the application clients do not cooperate in the reservation protocol, data may be unexpectedly modified or lost and/or deadlock conditions may occur. The multiple clients use key values to identify a reservation to be managed. The storage device receives the key value in various Persistent Reservation commands and internally associates the supplied key value with the I_T nexus from which it is received.
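The bookkeeping described above, in which the storage device associates each supplied key with the I_T nexus it arrived on and serves only nexuses in the selected set, might be modeled as follows. This is a deliberately simplified sketch, not the SPC-3 state machine: the `LogicalUnit` class and its method names are hypothetical, and the model assumes a reservation is already in force that restricts access to registered nexuses.

```python
class LogicalUnit:
    """Toy model of a shared volume that fences by registered I_T nexus."""

    def __init__(self):
        self.registrations = {}  # I_T nexus -> key supplied on that nexus

    def register(self, nexus, key):
        # The device ties the key to the nexus the command arrived on.
        self.registrations[nexus] = key

    def unregister(self, nexus):
        self.registrations.pop(nexus, None)

    def execute(self, nexus, command):
        # Simplifying assumption: only registered nexuses may issue commands.
        if nexus not in self.registrations:
            return "RESERVATION CONFLICT"
        return "GOOD"


lun = LogicalUnit()
lun.register("IA_TA", 0x1234)           # key value invented for the example
ok = lun.execute("IA_TA", "WRITE")      # nexus is in the selected set
fenced = lun.execute("IB_TB", "WRITE")  # nexus was never registered
lun.unregister("IA_TA")
after = lun.execute("IA_TA", "WRITE")   # removed from the set, now rejected
print(ok, fenced, after)
```

The key point for the discussion that follows is that the table is indexed by physical path: the same host reaching the volume over a different nexus is, to the device, a different and unregistered party.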
As noted above, the essential function of an MPP driver is to make multiple physical paths to a storage device transparent to host applications and the operating system. The host applications and operating system are not aware of the existence of multiple physical paths and treat a storage device with multiple data access paths in essentially the same way as a classic single path storage device. However, SCSI-3 Persistent Reservation commands manage reservations in accordance with identified initiator and target ports (I_T nexus). A reservation is owned by a reservation holder or holders identified by an associated physical path I_T nexus. Access permission to a logical unit or its extents (i.e., a portion of a storage device) is managed based on an initiator's registration key. Both a reservation holder and a registrant are associated with a particular I_T nexus. Registration and reservation keys are thus used and associated with an I_T nexus, i.e., with a particular physical path.
When an MPP driver is used in a host system, the MPP driver virtualizes multiple physical paths and presents a single virtual path per disk array volume. When an application program or operating system manages reservations using SCSI-3 Persistent Reservation commands, it is not aware of the physical paths since they are virtualized by a multiple-path driver. Referring for example to FIG. 2, when an application client sends a SCSI-3 “Persistent Reservation Out” command with a REGISTER service action code parameter to a virtual device for establishing a registration, the command will be sent via an arbitrarily selected physical access path, for example the IA_TA nexus or the IB_TD nexus. The application client will then send a “Persistent Reservation Out” command with a RESERVE service action code parameter to create a Persistent Reservation. The multiple-path driver may select another physical path to complete the “reserve” command (such as the IB_TB nexus). Because the next selected path (i.e., IB_TB) is not registered as an I_T nexus permitted access, the storage device will reject the request with SCSI-3 Reservation Conflict status. In other words, the MPP driver element hides the physical paths from the application and operating system processes and therefore complicates the use of SCSI-3 Persistent Reservation protocols.
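The failure sequence just described can be reproduced in a few lines. The sketch below is illustrative only: `Target` and `MultipathDriver` are hypothetical classes, the key value is invented, and the driver's path choice is scripted as IA_TA followed by IB_TB to match the example in the text rather than modeling any real selection policy.

```python
class Target:
    """Toy storage device that tracks registrations per I_T nexus."""

    def __init__(self):
        self.keys = {}      # I_T nexus -> registered key
        self.holder = None  # nexus holding the reservation, if any

    def pr_out(self, nexus, action, key):
        if action == "REGISTER":
            self.keys[nexus] = key
            return "GOOD"
        if action == "RESERVE":
            # Only a nexus registered with this key may reserve.
            if self.keys.get(nexus) != key:
                return "RESERVATION CONFLICT"
            self.holder = nexus
            return "GOOD"
        raise ValueError(f"unsupported service action: {action}")


class MultipathDriver:
    """Toy virtual device: the caller never sees which path is used."""

    def __init__(self, target, path_sequence):
        self.target = target
        self.path_sequence = iter(path_sequence)  # scripted path choices

    def pr_out(self, action, key):
        nexus = next(self.path_sequence)
        return nexus, self.target.pr_out(nexus, action, key)


target = Target()
driver = MultipathDriver(target, ["IA_TA", "IB_TB"])
first = driver.pr_out("REGISTER", 0xABC)  # routed over IA_TA
second = driver.pr_out("RESERVE", 0xABC)  # routed over IB_TB, unregistered
print(first, second)
```

The application issued two correct commands with the same key against the same virtual device, yet the second is rejected because the device only knows the key in association with IA_TA.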
It is evident from the above discussion that an ongoing problem persists in utilizing SCSI-3 Persistent Reservation protocols and commands in conjunction with a multiple-path driver element in one or more host systems providing virtualized access to multiple physical paths between the host systems and a shared storage device.