An enterprise-level virtualized computing environment may include multiple host computers, each running a hypervisor that abstracts the host computer's processor, memory, storage, and networking resources into logical resources for multiple virtual machines running on the host computer. The computing environment also may include networked storage in the form of one or more disk arrays and controllers, in order to consolidate data storage for the virtual machines. The hypervisor handles I/O for the multiple virtual machines (VMs) by sending data to (and retrieving data from) logical disks created on the disk arrays. As further detailed below, a hypervisor in a host computer that supports “multipath I/O” functionality in this type of virtualized computing environment is able to direct I/O requests through one of a number of storage I/O paths from a particular virtual machine to a particular logical disk. This multipath I/O functionality provides, for example, failover capability (e.g., an alternate path can be selected in the case of failure of a controller, port, or switch on the current path between a host computer and a disk array), as well as load balancing capability (i.e., distribution of storage I/O traffic between available paths). Multipath I/O may provide various other benefits and advantages, such as fault tolerance, decreased latency, increased bandwidth, and/or improved security.
FIG. 1A depicts an example enterprise-level virtualized computing environment providing conventional multipath I/O. In the example environment 100, each host computer 110 includes one or more host bus adapters (HBAs) 116, wherein each HBA 116 has one or more ports 117. In the embodiment of FIG. 1A, each port 117 enables a host computer 110 to connect to one of two switches 120A or 120B (e.g., Fibre Channel switches, etc.), which, in turn, are connected to ports 138A or 138B of one of controllers 136A or 136B for a storage area network (SAN) 130. As depicted in FIG. 1A, each host computer 110 has a number of different paths (e.g., combinations of host computer ports, switches, and storage controller ports) through which I/O communication can reach SAN 130.
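The notion of a path as a combination of host computer port, switch, and storage controller port can be illustrated with a minimal Python sketch. The wiring shown is an assumption made for illustration only and is not taken from FIG. 1A: the port labels `117-1` and `117-2`, the rule that each host port is cabled to exactly one switch, and the rule that each switch reaches one port on each controller are all hypothetical.

```python
# Hypothetical wiring (assumed for illustration, not taken from FIG. 1A):
# each HBA port is cabled to exactly one switch, and each switch can reach
# one port on each of the two storage controllers.
HOST_PORT_TO_SWITCH = {"117-1": "120A", "117-2": "120B"}
SWITCH_TO_CONTROLLER_PORTS = {
    "120A": ["136A/138A", "136B/138B"],
    "120B": ["136A/138A", "136B/138B"],
}


def enumerate_paths(host_port_to_switch, switch_to_controller_ports):
    """Return every (host port, switch, controller port) combination."""
    paths = []
    for host_port, switch in host_port_to_switch.items():
        for controller_port in switch_to_controller_ports[switch]:
            paths.append((host_port, switch, controller_port))
    return paths


PATHS = enumerate_paths(HOST_PORT_TO_SWITCH, SWITCH_TO_CONTROLLER_PORTS)
for path in PATHS:
    print(" -> ".join(path))
```

Under these assumed connections a single host computer has four distinct paths to the SAN; a different cabling would yield a different count.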
Each host computer 110 may run a hypervisor 112, such as, for example, the vSphere Hypervisor from VMware, Inc. (“VMware”), that enables such host computer 110 to run a number of VMs. A VM running on host computer 110 may access and perform I/O on a “virtual disk” that is stored as a file on one of LUNs 134 exposed by SAN 130. Each hypervisor 112 includes a multipathing module (MPM) 114, which enables the hypervisor to direct outgoing storage I/O traffic in accordance with appropriate path selection policies and to choose alternative paths for I/O if a current path fails. For example, VMware's Pluggable Storage Architecture (PSA) in vSphere Hypervisor includes a Native Multipathing (NMP) module that enables an administrator to specify a path selection policy to use when host computer 110 performs I/O on behalf of its VMs. For example, the policy may specify a fixed preferred path, which is used to perform I/O so long as the preferred path remains available. If the preferred path fails, the NMP module may select an alternative path but return to the preferred path once it is restored. Alternatively, the policy may specify a “most recently used” (MRU) path. With an MRU policy, even if a prior path is restored after a failure, the chosen alternative path remains the path that continues to be used, since it is then the most recently used path. Another alternative policy may be a “round robin” path selection policy, in which hypervisor 112 continually rotates through all available storage I/O paths for load balancing purposes.
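The behavioral differences among the fixed, MRU, and round robin policies described above can be sketched as follows. This is a simplified Python model written for illustration, not the NMP module's implementation; the `PathSelector` class, its method names, and the path labels `"A"`, `"B"`, and `"C"` are all hypothetical.

```python
class PathSelector:
    """Toy model of the fixed, MRU, and round robin policies (hypothetical)."""

    def __init__(self, paths, policy, preferred=None):
        if policy not in ("fixed", "mru", "round_robin"):
            raise ValueError(f"unknown policy: {policy}")
        self.paths = list(paths)
        self.policy = policy
        # Default the preferred (or initial) path to the first one listed.
        self.preferred = preferred if preferred is not None else self.paths[0]
        self.up = set(self.paths)      # paths currently believed healthy
        self.current = self.preferred  # path used for the most recent I/O
        self._next = 0                 # round robin cursor

    def mark_down(self, path):
        self.up.discard(path)

    def mark_up(self, path):
        self.up.add(path)

    def _first_available(self):
        for path in self.paths:
            if path in self.up:
                return path
        raise RuntimeError("no storage I/O path available")

    def select(self):
        """Return the path to use for the next I/O request."""
        if self.policy == "round_robin":
            # Rotate through all paths, skipping any that are down.
            for _ in range(len(self.paths)):
                candidate = self.paths[self._next]
                self._next = (self._next + 1) % len(self.paths)
                if candidate in self.up:
                    self.current = candidate
                    return candidate
            raise RuntimeError("no storage I/O path available")
        if self.policy == "fixed" and self.preferred in self.up:
            # Fixed: always return to the preferred path once it is healthy.
            self.current = self.preferred
        elif self.current not in self.up:
            # Fixed (preferred down) or MRU: fail over only when required.
            self.current = self._first_available()
        return self.current


def fail_and_restore(policy):
    """Select before a failure of path A, during it, and after restoration."""
    selector = PathSelector(["A", "B"], policy, preferred="A")
    before = selector.select()
    selector.mark_down("A")
    during = selector.select()
    selector.mark_up("A")
    after = selector.select()
    return [before, during, after]


FIXED_TRACE = fail_and_restore("fixed")
MRU_TRACE = fail_and_restore("mru")
```

In this model the two traces differ only in the last step: after path A is restored, the fixed policy returns to A while the MRU policy stays on B. None of the three policies consults any measure of load when choosing a path.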
To date, multipathing modules implement path selection policies, such as the fixed, MRU, and round robin policies described above, that are “static” in nature and do not take into account changes in dynamic conditions within environment 100. For example, as shown in FIG. 1A, if both of hypervisors 112 are individually performing load balancing between switches 120 using the round robin path selection policy without any consideration of load or I/O throughput at the switches, it is entirely possible that, despite the fact that storage I/O traffic from the hypervisor on host computer 110A has already heavily loaded switch 120A, the MPM 114 for hypervisor 112B may still select a path to SAN 130 through switch 120A, rather than selecting an alternate path to SAN 130 through lightly-loaded switch 120B.
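The limitation described above can be made concrete with a small Python sketch. The load figures are invented for illustration, and the function and variable names are hypothetical; the point is only that a static round robin rotation never reads the load information, so half of the requests still go through the busy switch.

```python
from itertools import cycle


def round_robin_choices(switches, count):
    """Return the switch a static round robin policy picks for each I/O."""
    rotation = cycle(switches)
    return [next(rotation) for _ in range(count)]


# Hypothetical outstanding-I/O counts at each switch; in this scenario the
# traffic from host computer 110A has already heavily loaded switch 120A.
switch_load = {"120A": 900, "120B": 50}

# The rotation for hypervisor 112B is computed without consulting switch_load.
choices_112b = round_robin_choices(["120A", "120B"], 10)
sent_to_busy_switch = choices_112b.count("120A")

# For comparison only: the switch a load-aware choice would favor.
lightly_loaded = min(switch_load, key=switch_load.get)

print(f"{sent_to_busy_switch} of {len(choices_112b)} I/Os used busy switch 120A")
print(f"lightly loaded alternative: {lightly_loaded}")
```

Changing the values in `switch_load` has no effect on `choices_112b`, which is the sense in which the policy is static.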