1. Technical Field
The present invention relates to computer networks and more particularly to systems and methods for resource allocation in cloud environments.
2. Description of the Related Art
Virtualization has rapidly gained popularity and now affects multiple levels of the computing stack. Because virtualization decouples resources from their users, it provides greater flexibility in resource allocation but also introduces new challenges for the optimal design, provisioning, and runtime management of systems. Cloud computing is a computing paradigm that offers virtualized resources “as a service” over the Internet. Cloud managers are responsible for lifecycle management of virtual resources, for efficient utilization of physical resources, and for exposing basic application programming interfaces (APIs) for operations to users. Software solutions can then be deployed on these virtual resources.
Virtualization, which decouples physical resources from their users, has emerged as one of the key drivers of data center innovation and optimization. Operating system virtualization was first proposed by IBM in the 1960s. Recently, with the increased computing capacity of low-end machines, similar capabilities have become available for many platforms. A virtualized server runs a thin software or firmware layer called a hypervisor, which presents an abstraction of the underlying hardware to host multiple virtual machines (VMs). VMs execute either an unmodified (in the case of full virtualization) or a slightly modified (in the case of para-virtualization) version of the operating system. Virtualization increases server utilization, and therefore data center efficiency, by combining several workloads on a single server.
Referring to FIG. 1A, a typical solution built on virtualized servers is shown. A solution manager 12 deploys VMs 14 and manages them based on problem-space-specific knowledge and runtime information including current workload, calendar, wall clock time, and historical workload models. To perform these management functions, the solution manager 12 interacts with a hypervisor manager 16 on individual servers 18 or with a central virtualization infrastructure manager such as VMware's vCenter™. The virtualization manager 16 allows the solution manager 12 to monitor resource usage 20 on each of the servers 18 (and by each of the VMs 14) as well as to make configuration changes such as adding, removing, starting, and stopping VMs 14. The virtualization manager 16 can also control the placement of VMs 14 on physical servers 18 and the relative shares of resources that are allocated to each of the VMs 14 during periods of contention. The solution manager 12 manipulates these controls to optimize performance and resource utilization.
The latest development in the virtualization trend is known as cloud computing, in which the management of applications and groups of applications is separated from the management of the underlying physical resources (such as servers, networks, and storage). The promise of cloud computing is to aggregate very large numbers of applications and services and to achieve unprecedented levels of efficiency in server utilization and administration.
Referring to FIG. 1B, a basic cloud model is depicted. The cloud 32 provides virtual machines 34 and an application programming interface (API) 36. The API 36 supports the creation and destruction of VMs 34 plus a few basic controls of these VMs such as “Power On”, “Power Off”, and “Reset”. The cloud API 36 provides a very opaque interface to the virtualized environment. The cloud 32 is managed by a cloud manager 40, which assumes all responsibility for allocation of resources, placement of the VMs, and workload management of physical servers 38. For example, an enterprise can deploy a web server on a cloud without ever having to consider how much and what type of hardware is used to support its computing needs. The enterprise, as the cloud customer, is concerned only with specifying to the cloud provider how many virtual machines it needs and their resource requirements. The cloud provider is solely responsible for deploying the VMs and managing the physical hardware, and the enterprise (or solution deployer) is responsible only for the solution software that runs within the VM 34. This separation of responsibility is further enforced by the common cloud APIs, which hide from the enterprise most details of the underlying platform and hide from the cloud provider all (or most) details of the solutions running in the deployed VMs 34.
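By way of illustration only, the opaque interface described above may be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the interface of any particular cloud provider: the class name OpaqueCloudAPI, its method names, and the placeholder placement policy are hypothetical. The sketch shows that a caller can create, destroy, and power-cycle VMs, while placement on physical servers remains internal to the cloud manager.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class VM:
    vm_id: int
    cpus: int
    memory_gb: int
    powered_on: bool = False


class OpaqueCloudAPI:
    """Hypothetical stand-in for a cloud API: lifecycle and power controls only."""

    def __init__(self):
        self._vms = {}
        self._ids = count(1)
        # Placement is decided by the cloud manager and is never exposed to callers.
        self._placement = {}

    def create_vm(self, cpus, memory_gb):
        """The customer states only resource requirements and receives a VM id."""
        vm = VM(next(self._ids), cpus, memory_gb)
        self._vms[vm.vm_id] = vm
        self._placement[vm.vm_id] = self._choose_server(vm)
        return vm.vm_id

    def destroy_vm(self, vm_id):
        del self._vms[vm_id]
        del self._placement[vm_id]

    def power_on(self, vm_id):
        self._vms[vm_id].powered_on = True

    def power_off(self, vm_id):
        self._vms[vm_id].powered_on = False

    def reset(self, vm_id):
        self.power_off(vm_id)
        self.power_on(vm_id)

    def is_powered_on(self, vm_id):
        return self._vms[vm_id].powered_on

    def _choose_server(self, vm):
        # Placeholder policy (an assumption): a real cloud manager would weigh
        # load, contention, and other provider-side information here.
        return f"server-{vm.vm_id % 2}"
```

In this sketch there is deliberately no public method for querying or changing placement, which mirrors the separation of responsibility described above: the solution cannot influence where its VMs run, and the cloud manager sees only the stated resource requirements.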
This model works well for solutions that do not require or benefit from explicit control over resource allocation decisions. However, it presents a very challenging environment for optimizing the overall computing environment, both from the perspective of the cloud provider and from that of the solution manager. Solutions often use application-specific intelligence and information to make optimization decisions for best resource utilization and service responsiveness. For example, a virtual desktop solution will make use of the calendar, usage histories, and user-specific information such as “Joe is traveling to India this week” to optimize the user experience and resource usage. In the example of Joe traveling to Asia, the virtual desktop solution may decide to move Joe's virtual machine to a hosting center in Asia for the duration of his travel. It is tempting to merge the solution space optimization with the cloud, as this can provide globally optimal workload management. However, incorporating solution space intelligence for every possible solution in the cloud management layer is a monumental task, and it is not obvious that all solution providers will be willing to divulge this knowledge to the cloud provider.
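The kind of solution-space decision described in the travel example may be sketched as follows. This is an illustrative sketch only: the TravelEntry record, the preferred_region function, the region names, and the dates are hypothetical and are not drawn from any actual virtual desktop product. The sketch shows how calendar information held by the solution, and not by the cloud provider, could determine where a user's VM would best be hosted.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TravelEntry:
    """Hypothetical calendar record known only to the solution."""
    user: str
    region: str
    start: date
    end: date


def preferred_region(user, home_region, calendar, today):
    """Return the hosting region nearest to where the user is expected to be today."""
    for entry in calendar:
        if entry.user == user and entry.start <= today <= entry.end:
            return entry.region
    # No travel scheduled for today: keep the VM in the user's home region.
    return home_region


# Example data (assumed for illustration).
calendar = [TravelEntry("joe", "asia", date(2024, 3, 4), date(2024, 3, 8))]
during_trip = preferred_region("joe", "us-east", calendar, date(2024, 3, 5))   # "asia"
after_trip = preferred_region("joe", "us-east", calendar, date(2024, 3, 11))   # "us-east"
```

Under the basic cloud model, the solution has no control through which to act on the result of such a computation, and the cloud manager has no access to the calendar data needed to compute it.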
In essence, layering a virtualization-aware solution such as that of FIG. 1A on top of a cloud infrastructure such as that of FIG. 1B requires compromises on the part of the solution, the cloud, or both. Either the solution must give up its problem-space-specific workload management and rely on the cloud-based physical resource management, or the cloud must incorporate solution space knowledge and information in its management decisions. For the reasons stated above, both of these options are problematic.