Distributed computing systems are an increasingly important part of research, governmental, and enterprise computing systems. Among the advantages of such computing systems is their ability to handle a variety of different computing scenarios, including large computational problems, high volume data processing situations, and high availability situations. Such distributed computing systems typically utilize one or more storage devices in support of the computing system's operations. These storage devices can be quite numerous and/or heterogeneous. In an effort to aggregate such storage devices and to make such storage devices more manageable and flexible, storage virtualization techniques are often used. Storage virtualization techniques establish relationships between physical storage devices, e.g., disk drives, tape drives, optical drives, etc., and virtual or logical storage devices such as volumes, virtual disks, and virtual logical units (sometimes referred to as virtual LUNs). In so doing, virtualization techniques provide system-wide features, e.g., naming, sizing, and management, better suited to the entire computing system than those features dictated by the physical characteristics of storage devices. Additionally, virtualization techniques enable and/or enhance certain computing system operations such as clustering and data backup and restore.
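By way of illustration only, the relationship between a logical volume and the physical storage beneath it can be sketched as a simple address translation. The following Python sketch assumes a concatenated layout in which extents from several devices are laid end to end; the names PhysicalExtent, ConcatenatedVolume, and resolve, as well as the device names, are hypothetical and are not drawn from any particular volume manager product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PhysicalExtent:
    """A contiguous run of blocks on one physical device (hypothetical model)."""
    device: str       # illustrative device name, e.g. "disk0"
    start_block: int  # first block of the extent on that device
    num_blocks: int   # length of the extent in blocks


class ConcatenatedVolume:
    """A logical volume formed by laying physical extents end to end."""

    def __init__(self, extents: List[PhysicalExtent]) -> None:
        self.extents = list(extents)

    @property
    def size(self) -> int:
        """Total number of logical blocks presented by the volume."""
        return sum(extent.num_blocks for extent in self.extents)

    def resolve(self, logical_block: int) -> Tuple[str, int]:
        """Translate a logical block number into (device, physical block)."""
        if not 0 <= logical_block < self.size:
            raise ValueError(f"logical block {logical_block} is out of range")
        remaining = logical_block
        for extent in self.extents:
            if remaining < extent.num_blocks:
                return extent.device, extent.start_block + remaining
            # The block lies beyond this extent; skip past it.
            remaining -= extent.num_blocks
        raise AssertionError("unreachable: range was checked above")


if __name__ == "__main__":
    volume = ConcatenatedVolume([
        PhysicalExtent("disk0", start_block=100, num_blocks=50),
        PhysicalExtent("disk1", start_block=0, num_blocks=200),
    ])
    print(volume.size)
    print(volume.resolve(50))
```

In this sketch the volume's size and naming are independent of any single device, which is the kind of system-wide feature described above. Other layouts, such as striping or mirroring, would use a different translation function.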
FIG. 1 illustrates a simplified example of a computing system 100. The members of the computing system 100 include host 130 and host 140. As members of computing system 100, hosts 130 and 140, typically some type of application, data, or file server, are often referred to as “nodes.” Hosts 130 and 140 can be designed to operate completely independently of each other, or may interoperate to form some manner of cluster. Thus, hosts 130 and 140 are typically individual computer systems having some or all of the software and hardware components well known to those having skill in the art. FIG. 6 (described below) illustrates some of the features common to such computer systems. In support of various applications and operations, hosts 130 and 140 can exchange data over, for example, network 120, typically a local area network (LAN), e.g., an enterprise-wide intranet, or a wide area network (WAN) such as the Internet. Additionally, network 120 provides a communication path for various client computer systems 110 to communicate with hosts 130 and 140. In addition to network 120, hosts 130 and 140 can communicate with each other over a private network (not shown).
Other elements of computing system 100 include storage area network (SAN) 150 and storage devices such as tape library 160 (typically including one or more tape drives), a group of disk drives 170 (i.e., “just a bunch of disks” or “JBOD”), and intelligent storage array 180. As shown in FIG. 1, both hosts 130 and 140 are coupled to SAN 150. SAN 150 is conventionally a high-speed network that allows the establishment of direct connections between storage devices 160, 170, and 180 and hosts 130 and 140. SAN 150 can also include one or more SAN-specific devices such as SAN switches, SAN routers, SAN hubs, or some type of storage appliance. Thus, SAN 150 is shared between the hosts and allows for the sharing of storage devices between the hosts to provide greater availability and reliability of storage. Although hosts 130 and 140 are shown connected to storage devices 160, 170, and 180 through SAN 150, this need not be the case. Shared resources can be directly connected to some or all of the hosts in the computing system, and computing system 100 need not include a SAN. Alternatively, hosts 130 and 140 can be connected to multiple SANs.
FIG. 2 illustrates in greater detail several components of computing system 100. For example, disk array 180 is shown to include two input/output (I/O) ports 181 and 186. Associated with each I/O port is a respective storage controller (182 and 187), and each storage controller generally manages I/O operations to and from the storage array through the associated I/O port. In this example, each storage controller includes a processor (183 and 188), a cache memory (184 and 189), and a regular memory (185 and 190). Although one or more of each of these components is typical in disk arrays, other variations and combinations are well known in the art. The disk array also includes some number of disk drives (logical units (LUNs) 191-195) accessible by both storage controllers. As illustrated, each disk drive is shown as an LUN, which is generally an indivisible unit presented by a storage device to its host(s). Logical unit numbers, also sometimes referred to as LUNs, are typically assigned to each disk drive in an array so the host can address and access the data on those devices. In some implementations, an LUN can include multiple devices, e.g., several disk drives, that are logically presented as a single device.
FIG. 2 also illustrates some of the software and hardware components present in hosts 130 and 140. Hosts 130 and 140 execute one or more application programs (131 and 141, respectively). Such applications can include, but are not limited to, database management systems (DBMS), file servers, application servers, web servers, backup and restore software, customer relationship management software, and the like. The applications and other software not shown, e.g., operating systems, file systems, and applications executing on client computer systems 110, can initiate or request I/O operations against storage devices such as disk array 180. Hosts 130 and 140 also each execute a volume manager (133 and 143), which enables physical resources configured in the computing system to be managed as logical devices. An example of software that performs some or all of the functions of volume managers 133 and 143 is the VERITAS™ Volume Manager product provided by Symantec Corporation. Hosts 130 and 140 take advantage of the fact that disk array 180 has more than one I/O port by using dynamic multipathing (DMP) drivers (135 and 145) as well as multiple host bus adaptors (HBAs) 137, 139, 147, and 149. The HBAs provide a hardware interface between the host bus and the storage network, typically implemented as a Fibre Channel network. Hosts 130 and 140 each have multiple HBAs to provide redundancy and/or to take better advantage of storage devices having multiple ports.
The DMP functionality enables greater reliability and performance by using path failover and load balancing. In general, the multipathing policy used by DMP drivers 135 and 145 depends on the characteristics of the disk array in use. Active/active disk arrays (A/A arrays) permit several paths to be used concurrently for I/O operations. Such arrays enable DMP to provide greater I/O throughput by balancing the I/O load uniformly across the multiple paths to the disk devices. In the event of a loss of one connection to an array, the DMP driver automatically routes I/O operations over the other available connections to the array. Active/passive arrays in so-called auto-trespass mode (A/P arrays) allow I/O operations on a primary (active) path, while a secondary (passive) path is used if the primary path fails. Failover occurs when I/O is received or sent on the secondary path. Active/passive arrays in explicit failover mode (A/PF arrays) typically require a special command to be issued to the array for failover to occur. Active/passive arrays with LUN group failover (A/PG arrays) treat a group of LUNs that are connected through a controller as a single failover entity. Failover occurs at the controller level, and not at the LUN level (as would typically be the case for an A/P array in auto-trespass mode). The primary and secondary controllers are each connected to a separate group of LUNs. If a single LUN in the primary controller's LUN group fails, all LUNs in that group fail over to the secondary controller's passive LUN group.
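The difference between the A/A and A/P policies described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any DMP driver: round-robin selection is assumed as one simple way to balance load uniformly, the A/PF and A/PG modes are omitted, and the names ArrayMode, Path, and PathSelector are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ArrayMode(Enum):
    ACTIVE_ACTIVE = "A/A"   # all paths usable concurrently
    ACTIVE_PASSIVE = "A/P"  # primary path used until it fails


@dataclass
class Path:
    name: str             # illustrative label, e.g. "hba137->port181"
    primary: bool = True  # only meaningful for active/passive arrays
    healthy: bool = True


class PathSelector:
    """Chooses a path for each I/O request according to the array mode."""

    def __init__(self, mode: ArrayMode, paths: List[Path]) -> None:
        self.mode = mode
        self.paths = list(paths)
        self._next = 0  # round-robin counter (assumed balancing policy)

    def select(self) -> Path:
        healthy = [p for p in self.paths if p.healthy]
        if not healthy:
            raise RuntimeError("no healthy path to the array")
        if self.mode is ArrayMode.ACTIVE_ACTIVE:
            # Spread I/O across every healthy path in turn.
            path = healthy[self._next % len(healthy)]
            self._next += 1
            return path
        # Active/passive: prefer a healthy primary path and fall back
        # to a secondary path only when no primary path remains.
        for path in healthy:
            if path.primary:
                return path
        return healthy[0]


if __name__ == "__main__":
    selector = PathSelector(ArrayMode.ACTIVE_PASSIVE, [
        Path("primary", primary=True),
        Path("secondary", primary=False),
    ])
    print(selector.select().name)
    selector.paths[0].healthy = False  # simulate loss of the primary path
    print(selector.select().name)
```

In this sketch, failover in the A/P case is implicit: once the primary path is marked unhealthy, the next I/O is simply sent on the secondary path. An A/PF array would additionally require an explicit command to the array before the secondary path could be used.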
When DMP functionality is extended to support SAN-attached disks and storage arrays, certain deficiencies can arise. The proliferation of storage arrays has increased the demand for DMP to support a wide range of arrays. The maturity of multipathing support in operating systems and third-party driver software has increased both the need for, and the complexity of, DMP coexistence with these products. Moreover, use of DMP in a SAN environment significantly changes the complexity of path management. The number of devices that can be connected to a host generally increases by one or two orders of magnitude. Similarly, the number of paths to a particular device is often greater than two, the number found in basic DMP implementations. Both of these factors have contributed to a significantly longer recovery time when some error condition occurs.
With the larger number of path segments and devices in a given path between an application executing on a host computer system and target storage, the overall chance of failure somewhere in the path increases. Because DMP functionality is typically one of the lowest elements in the software stack (i.e., closest to the hardware), its responsiveness is important to maintaining system-wide high availability characteristics. Accordingly, improved systems, methods, software, and devices are needed to improve the error detection, recovery, and monitoring functions of DMP functionality.
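The observation that longer paths are more likely to contain a failure can be made concrete with a short calculation. The following Python sketch assumes, purely for illustration, that each segment or device along a path fails independently with a known probability, so that the path as a whole fails with probability one minus the product of the individual survival probabilities. The function name path_failure_probability and the sample probabilities are hypothetical.

```python
from math import prod
from typing import Iterable


def path_failure_probability(component_failure_probs: Iterable[float]) -> float:
    """Probability that at least one component along a path fails.

    Assumes the components fail independently of one another, which is
    a simplification made here for illustration only.
    """
    probs = list(component_failure_probs)
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    # The path survives only if every component survives.
    return 1.0 - prod(1.0 - p for p in probs)


if __name__ == "__main__":
    short_path = [0.01] * 2  # e.g. an HBA and an array port
    long_path = [0.01] * 6   # e.g. added switches and links in a SAN
    print(path_failure_probability(short_path))
    print(path_failure_probability(long_path))
```

Under these assumptions, adding components to a path can only raise the probability that something in the path fails, which is why prompt error detection and recovery at the multipathing layer matter more as path length grows.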