A storage area network (SAN), an example of a storage network environment, is a network dedicated to enabling multiple applications on multiple hosts to access, i.e., read and write, data stored in consolidated shared storage infrastructures. SANs commonly use Fiber Channel for communications between components in the SAN. Some storage network environments, including network-attached storage (NAS) and hierarchical storage management (HSM), may use a network file system (NFS) for network communications. Traditional storage environments use Direct Attached Storage (DAS), which does not involve the use of a network. However, DAS environments may be combined with HSMs and/or NASs and/or SANs to create a storage network environment. In general, storage network environments may include one or more SANs, one or more HSMs, one or more NASs, and one or more DASs, or any combination thereof.
A SAN consists of interlinked SAN devices, for example, different types of switches, storage components, and physical servers or hosts, and is based on a number of possible transfer protocols such as Fiber Channel and iSCSI. Each server is connected to a SAN with one or more network cards, for example, a Host Bus Adapter (HBA). Applications are stored as data objects on storage devices in storage units, e.g., Logical Unit Numbers (LUNs). A data object generally comprises at least one of a volume, a datastore, and a file system. The storage device may be used to store data related to the applications on the host.
Enterprise SANs increasingly support most of the business-critical applications in enterprises. As a result, these SANs are becoming increasingly large and complex. A typical SAN in a Fortune 500 company may contain hundreds or thousands of servers and tens or hundreds of switches and storage devices of different types. The number of components and links that may be associated with the data transfer between each given application and one or more of its data units may increase exponentially with the size of the SAN. This complexity, which is compounded by the heterogeneity of the different SAN devices, leads to high risk and inefficiency. Changes to the SAN (which need to happen often due to the natural growth of the SAN) take groups of SAN managers a long time to complete, and are error-prone. For example, in many existing enterprises a routine change (such as adding a new server to a SAN) may take 1-2 weeks to complete, and a high percentage of these change processes (sometimes as high as 30-40%) include at least one error along the way. It is estimated that around 80% of enterprise SAN outage events are a result of some infrastructure change-related event.
The complexity of storage network environments has recently been further increased by the growing adoption of virtual servers or virtual machines (VMs) as hosts within storage network environments. As disclosed in commonly-assigned U.S. patent application Ser. No. 12/283,163 filed on Sep. 9, 2008, the contents of which are incorporated herein in their entirety, a virtual server is a server with a virtualization layer which enables multiple separate virtual machines, each being a separate encapsulation of an operating system and application, to share the resources on that server while each VM executes in exactly the same way as on a fully-dedicated conventional server. VMs can rapidly and seamlessly be shifted from one physical server to any other one in the server resource pool, and in that way optimally utilize the resources without affecting the applications. Such a virtualization of the physical servers, or virtualization of the storage network environment, allows for efficiency and performance gains to be realized. These gains may be realized in terms of service-level metrics or performance metrics, e.g., storage capacity utilization, server utilization, CPU utilization, data traffic flow, load balancing, etc. It is well known that the higher the number of VMs consolidated onto a physical server, the greater the savings. A major benefit of VMs is their ability to stop, shift and restart on different physical servers or hosts. For each physical server or host that is retired in favor of a virtual server, there is a corresponding reduction in power, space and cooling requirements. The numbers of network interface cards, network cables, switch ports, HBAs, Fiber Channel cables and Fiber Channel ports are all reduced. These cost reductions are significant, and when combined with the performance and/or efficiency gains, allow for a much better-managed storage network environment.
In general, the goal of SAN administrators is to maximize resource utilization while meeting application performance goals. Maximizing resource utilization means placing as many VMs per physical server as possible to increase CPU, network, memory, SAN and storage array utilization.
In the recent past, companies have been adopting virtualization applications such as VMware™, Microsoft™ Virtual Server, NetApp SnapShot™, NetApp SnapMirror™, and XEN™. These applications reduce underutilization by enabling data center teams to logically divide the physical servers, e.g., x86 servers or hosts, into single, dual, quad, or even eight-way and above independent, securely-operating virtual server or virtual machine (VM) systems. As explained above, consolidating five, ten, twenty, or even forty server images onto one physical server has tremendous benefit.
In particular, virtualization of the physical servers or hosts in the storage network environment allows for the possibility of running multiple operating systems and applications on the same physical server at the same time, e.g., a single VMware ESX server may be “virtualized” into 1, 2, 4, 8, or more virtual servers, each running its own operating system, and each able to support one or more applications. This virtualization of the servers may be enabled using software such as VMware, e.g., VMware ESX, which allows the virtualization of hardware resources on a computer (including the processor, memory, hard disk and network controller) to create a virtual server with an independent operating system.
In many storage network environments with VMs, however, there may be congestion and bottlenecks because the VMs or virtual servers share the same file system in the SAN. As the number of VMs and other components in the network increases, congestion and bottlenecks are more likely to occur. In particular, performance problems occur when two virtual servers write a large amount of data to a physical storage device and then try to access that data or other data from the physical storage device at a later time. If the storage I/O (input-output) load is high for this storage device, the VMs compete for storage device resources. As a result, the VMs can be slowed down dramatically. This, in turn, results in decreased network efficiency, as it takes a long time for the storage devices to handle I/O requests from the VMs.
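The contention described above can be illustrated with a toy model. The following Python sketch is hypothetical throughout: the names `VM` and `fair_share_iops`, the max-min fair-share policy, and the IOPS figures are assumptions chosen for illustration, and real storage devices schedule I/O in more complex ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    demand_iops: int  # I/O operations per second the VM would like to issue


def fair_share_iops(vms, device_capacity_iops):
    """Divide one storage device's IOPS capacity among competing VMs.

    Assumed policy (max-min fair share): VMs that need less than an equal
    share receive their full demand; leftover capacity is split among the rest.
    """
    remaining = device_capacity_iops
    granted = {}
    pending = sorted(vms, key=lambda vm: vm.demand_iops)
    while pending:
        share = remaining / len(pending)
        vm = pending.pop(0)
        granted[vm.name] = min(vm.demand_iops, share)
        remaining -= granted[vm.name]
    return granted


vms = [VM("vm-a", 400), VM("vm-b", 400)]
alone = fair_share_iops(vms[:1], 500)   # vm-a has the device to itself
shared = fair_share_iops(vms, 500)      # both VMs hit the same device
```

In this model, a VM that is fully served when it is alone on a device receives only part of its demand once a second heavy VM shares that device, which corresponds to the slowdown described above.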
Currently, there are no adequate technological solutions to assist SAN administrators in managing storage I/O load for storage devices associated with virtual machines in a virtual server environment. Current storage I/O load balancing methods focus on balancing loads by moving VMs to different physical servers. In addition, current storage I/O solutions rely on host agents in hosts or physical servers that contain virtual servers (hosts that have been “virtualized”) within the SAN to collect a partial set of information from these virtual servers. A host agent is generally a piece of software executed on a computer processor. A host agent resides on the physical server or host and provides information about the host to devices external to the host. Using this partial set of information, SAN administrators then rely on manual methods, e.g., manual spreadsheet-based information entry, trial and error, etc., to manage change events in the virtualized storage network environment. However, host agents on a physical server are very difficult to manage and/or maintain, and are widely considered undesirable for large SANs in which scalability may be important. Furthermore, current methods do not resolve storage device bottlenecks because moving VMs to different physical servers may not necessarily change how these VMs access data on a physical storage device. Moreover, there are no solutions which consider the end-to-end service levels of applications, the end-to-end access paths for data flow, and the tier levels of resources and combination of storage and other network resources. Accordingly, there is a need in the field for automatic storage resource load balancing among virtual machines (VMs) in virtual server environments.
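The point that moving VMs between physical servers may leave a storage bottleneck untouched can be sketched in a few lines of Python. This is a simplified illustration, not a description of any existing tool: the helper `load_by`, the host and LUN names, and the I/O rates are all hypothetical, and the model assumes each VM's data stays on the same storage device when the VM is moved.

```python
from collections import defaultdict


def load_by(vm_iops, placement):
    """Sum each VM's I/O rate per target (a host or a storage device)."""
    totals = defaultdict(int)
    for vm, iops in vm_iops.items():
        totals[placement[vm]] += iops
    return dict(totals)


# Hypothetical environment: two busy VMs share host-1 and the device lun-7.
vm_iops = {"vm-a": 300, "vm-b": 300, "vm-c": 50}
vm_to_host = {"vm-a": "host-1", "vm-b": "host-1", "vm-c": "host-2"}
vm_to_device = {"vm-a": "lun-7", "vm-b": "lun-7", "vm-c": "lun-9"}

before = load_by(vm_iops, vm_to_device)

# Host-level rebalancing: move vm-b to host-2; its data stays on lun-7.
vm_to_host_after = {**vm_to_host, "vm-b": "host-2"}
after = load_by(vm_iops, vm_to_device)

host_before = load_by(vm_iops, vm_to_host)
host_after = load_by(vm_iops, vm_to_host_after)
```

Under these assumptions the per-host load evens out after the move, while the per-device load is identical before and after, so the heavily loaded device remains the bottleneck.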