In recent years, the development, deployment, and life-cycle management of Internet services have been greatly simplified by middle-ware platforms that provide functions such as monitoring, load balancing, access control, and interoperability across business units. Some of these platforms also provide clustering functionality that enables the creation of collections of servers that provide a common set of replicated services. To use the available resources most efficiently, it is best to allocate them to services dynamically, according to the services' fluctuating requirements.
To address this challenge, application-server middle-ware may be extended with the ability to allocate resources to services dynamically through dynamic application placement. This invention concerns an algorithmic technique that efficiently computes an application placement that is optimal according to several criteria and that meets server capacity constraints. The proposed technique incorporates restrictions on the mapping of applications to servers, allows multiple copies of the same application to be started on a server, and produces a placement that allows the applications' load to be evenly distributed across servers.
The proposed approach differs from prior art in the following areas. Prior techniques assume that an application that has been placed on a server can utilize the entire server capacity in the presence of sufficient load. See M. Steinder, A. Tantawi, B. K. Martin, M. Spreitzer, G. Cuomo, A. Black-Ziegelbei, “On Demand Application Resource Allocation Through Dynamic Reconfiguration of Application Cluster Size and Placement”, a patent application filed Oct. 6, 2004, in the USPTO assigned Ser. No. 10/978,944, which is incorporated herein by reference in its entirety for all purposes. In practice, applications have internal bottlenecks that prevent them from utilizing the entire capacity of a server.
For such applications, it may be beneficial to start multiple instances on a single node. The proposed technique allows the amount of CPU power that is allocated to a single instance to be limited. Prior techniques also focus on maximizing the amount of resource demand satisfied by the resulting placement. While this objective remains the primary concern, focusing on it alone results in application placements that allow only an unbalanced load distribution, in which some servers are 100% utilized while others carry little or no load. An unbalanced load distribution degrades application performance and is not acceptable in a resource management feature. The proposed technique maximizes the amount of satisfied demand while also producing an application placement that allows a balanced load distribution. Placements that allow load to be balanced across servers provide better performance, are more resilient to server failures, and better tolerate workload spikes.
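The effect of a per-instance CPU limit and the notion of balance can be illustrated with a minimal Python sketch. The function names, the use of a single CPU number per server, and the choice of "highest utilization minus lowest utilization" as a balance measure are assumptions made for illustration only; they are not taken from the described technique.

```python
import math


def satisfied_on_server(demand, server_capacity, per_instance_cap, instances):
    """CPU demand that `instances` copies of one application can absorb on one server.

    Assumption: each instance can use at most `per_instance_cap` CPU units,
    modeling an internal bottleneck of the application.
    """
    return min(demand, server_capacity, per_instance_cap * instances)


def instances_needed(cpu_share, per_instance_cap):
    """Number of instances required on a server to consume `cpu_share` CPU units."""
    if cpu_share <= 0:
        return 0
    return math.ceil(cpu_share / per_instance_cap)


def utilization_spread(loads, capacities):
    """Illustrative balance measure: highest minus lowest server utilization."""
    utils = [load / cap for load, cap in zip(loads, capacities)]
    return max(utils) - min(utils)
```

Under these assumptions, an application with 900 units of demand and a 400-unit instance limit can use only 400 units of a 1000-unit server with one instance, but all 900 units with three instances. Likewise, placing 1000 units of load entirely on one of two equal servers gives a spread of 1.0, while splitting it evenly gives a spread of 0.0.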
An example method provides placement of applications on a cluster of servers to facilitate load balancing. The method includes the steps of: obtaining a current placement of applications, computing a suggested new placement of applications, and modifying the suggested placement by computing and replacing a set of (application, server of origin, destination server) triples such that moving the application in each triple from its server of origin to the destination server will maximize the utility of the final placement.
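The final step of the method can be sketched in Python as follows. This is a hedged illustration, not the claimed algorithm: it assumes that each instance of an application consumes a fixed amount of CPU, that utility is the negated gap between the most and least utilized servers, and that triples are chosen greedily one at a time. The names `rebalance`, `candidate_triples`, `inst_cpu`, and `allowed` are hypothetical. The sketch starts from an already computed suggested placement and does not model how the current or suggested placement is obtained.

```python
def utilization_spread(placement, inst_cpu, capacity):
    """Gap between the most and least utilized servers (0.0 means perfectly even)."""
    utils = [sum(inst_cpu[a] for a in apps) / capacity[s]
             for s, apps in placement.items()]
    return max(utils) - min(utils)


def utility(placement, inst_cpu, capacity):
    # Assumed utility function: higher is better, so negate the spread.
    return -utilization_spread(placement, inst_cpu, capacity)


def candidate_triples(placement, inst_cpu, capacity, allowed):
    """Yield feasible (application, server of origin, destination server) triples."""
    for origin, apps in placement.items():
        for app in sorted(set(apps)):
            for dest in placement:
                if dest == origin or dest not in allowed[app]:
                    continue  # respect application-to-server restrictions
                used = sum(inst_cpu[a] for a in placement[dest])
                if used + inst_cpu[app] <= capacity[dest]:  # capacity constraint
                    yield (app, origin, dest)


def apply_move(placement, triple):
    """Return a copy of the placement with one instance moved."""
    app, origin, dest = triple
    moved = {s: list(apps) for s, apps in placement.items()}
    moved[origin].remove(app)
    moved[dest].append(app)
    return moved


def rebalance(suggested, inst_cpu, capacity, allowed):
    """Greedily apply the triple that most improves utility until none helps."""
    placement = {s: list(apps) for s, apps in suggested.items()}
    moves = []
    while True:
        best, best_u = None, utility(placement, inst_cpu, capacity)
        for triple in candidate_triples(placement, inst_cpu, capacity, allowed):
            u = utility(apply_move(placement, triple), inst_cpu, capacity)
            if u > best_u + 1e-12:  # require a strict improvement
                best, best_u = triple, u
        if best is None:
            return placement, moves
        placement = apply_move(placement, best)
        moves.append(best)
```

Because every move in this sketch relocates an existing instance rather than adding or removing one, the total CPU assigned to each application is unchanged; only its distribution across servers changes. A greedy search of this kind can stop at a local optimum, so it should be read as an illustration of the triple-based modification step rather than a guarantee of maximal utility.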