Conventionally, management of networked computer systems in organizations is divided among a number of groups such as networking, storage, systems, and possibly groups in charge of maintaining regulatory compliance. Enterprise applications require resources from each such functional area; a failure in any of these areas can have a significant impact on the business. The strategy of splitting the management responsibilities by functional areas has worked so far because the functional areas have traditionally been loosely coupled and the data center environments have been relatively static.
The trend towards convergence of computing, storage, and networking in order to create a more dynamic and efficient infrastructure makes these functions dependent on each other. For example, server virtualization means that a small change made by the systems group may have a major effect on the network bandwidth. Networked storage accounts for a growing and significant proportion of the overall network bandwidth, thereby making the network vulnerable to changes made by the storage group. In order to maintain the services in a converged environment, the complex relationships between various network elements need to be managed properly.
FIG. 1 shows a network communication system 100 that includes a multitude of switches configured to connect a multitude of hosts to each other and to the Internet. Four exemplary hosts 101, 102, 103, 104 (alternatively and collectively referred to as host 10) are shown as being in communication with the Internet via switches 221, 222, 223, 224 (alternatively and collectively referred to as switch 22), switches 241, 242 (alternatively and collectively referred to as switch 24), and switches 261, 262 (alternatively and collectively referred to as switch 26). Network communication system 100 is controlled, in part, by network equipment group 30, storage group 35, server group 40, and regulatory compliance group 45. Each such group monitors its own resources using its own management tools, and thus has very limited visibility into the other components of the data center.
FIGS. 2A and 2B show the challenge faced in managing a networked system using a conventional technique. FIG. 2A shows a network communication system that includes a multitude of servers 1101, 1102, 1103 as well as a multitude of switches collectively identified using reference number 120. Each server 110i is shown as having one or more associated virtual machines (VM) 115i. For example, server 1101 is shown as having associated VMs 11511 and 11512; server 1102 is shown as having associated VMs 11521, 11522, and 11523; and server 1103 is shown as having associated VM 11531. Assume that a system manager decides to move virtual machine 11523 from server 1102 to server 1101 (shown as VM 11513 in FIG. 2B following the move). The system management tools show that there is enough capacity on the destination server 1101, thus suggesting that the move would be safe. However, the move can cause the storage traffic, which had previously been confined to a single switch, to congest links across the data center, causing system-wide performance problems. The conventional siloed approach, in which different teams manage the network, storage, and servers, therefore has a number of shortcomings.
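The scenario of FIGS. 2A and 2B can be illustrated with a minimal sketch. Everything beyond the server and VM reference numerals is an assumption made for illustration only: the switch names sw_a and sw_b, the slot-based capacity figure, the location of the storage, and the two-tier hop count are hypothetical and are not taken from the figures. The sketch contrasts what a server-only management tool would report with the change in the storage traffic path.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    switch: str      # access switch the server attaches to (assumed)
    vm_slots: int    # capacity expressed as VM slots (assumed unit)
    vms: list = field(default_factory=list)


def server_view_ok(dest: Server) -> bool:
    """What a server-only tool checks: is there a free slot on the destination?"""
    return len(dest.vms) < dest.vm_slots


def storage_path_hops(server: Server, storage_switch: str) -> int:
    """Inter-switch links crossed by storage traffic.

    Assumes a two-tier topology: 0 links if the server and the storage share
    a switch, otherwise 2 (up to an aggregation switch and back down).
    """
    return 0 if server.switch == storage_switch else 2


def evaluate_move(vm: str, src: Server, dest: Server, storage_switch: str) -> dict:
    """Compare the server-only view with the storage path before and after."""
    if vm not in src.vms:
        raise ValueError(f"VM {vm} is not hosted on server {src.name}")
    return {
        "server_view_ok": server_view_ok(dest),
        "hops_before": storage_path_hops(src, storage_switch),
        "hops_after": storage_path_hops(dest, storage_switch),
    }


# Hypothetical placement: the storage used by VM 11523 sits behind the same
# switch as server 1102, and server 1101 attaches to a different switch.
server_1101 = Server("1101", switch="sw_a", vm_slots=4, vms=["11511", "11512"])
server_1102 = Server("1102", switch="sw_b", vm_slots=4,
                     vms=["11521", "11522", "11523"])
STORAGE_SWITCH = "sw_b"

result = evaluate_move("11523", server_1102, server_1101, STORAGE_SWITCH)
print(result)
```

Under these assumptions the server-only check passes, while the storage traffic goes from crossing no inter-switch links to crossing two, which is the kind of cross-domain effect that a siloed tool does not surface.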