With the proliferation of the World Wide Web (WWW or simply the “Web”) and outsourcing of data services, computing service centers have increased in both size and complexity. For example, a service center may include a collection of servers, referred to as a server farm, that run processes for a specific application, known as a cluster. Such centers provide a variety of services, such as Web content hosting, e-commerce, Web applications, and business applications. Managing such centers is challenging because a service provider must manage the quality of service provided to competing applications in the face of unpredictable load intensity and distribution among the various offered services and applications. Several management software packages that deal with these operational management issues have been introduced. These software systems provide functions including monitoring, demand estimation, load balancing, dynamic provisioning, service differentiation, optimized resource allocation, and dynamic application placement. The last function, namely dynamic application placement, is the subject of this invention.
Service requests are typically satisfied through the execution of one or more instances of each of a set of applications. Applications include access to static and dynamic Web content, enterprise applications, and access to database servers. Applications may be provided by HTTP (Hypertext Transfer Protocol) Web servers, servlets, Enterprise Java Beans (EJB), or database queries. When the number of service requests for a particular application increases, the management software in charge of placing applications deploys additional instances of the application in order to accommodate the increased load. It is often important to have an on-demand management environment allowing instances of applications to be dynamically deployed and removed. The problem is to dynamically change the number of application instances so as to satisfy the dynamic load while minimizing the overhead of starting and stopping application instances.
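By way of illustration only, the trade-off between satisfying the load and limiting start and stop overhead can be sketched as follows. The sketch is not taken from any particular management package. The function name `target_instances`, the notion of a fixed per-instance capacity, and the `headroom` tolerance band are assumptions introduced here for the example.

```python
import math


def target_instances(current: int, load: float,
                     per_instance_capacity: float,
                     headroom: float = 0.25) -> int:
    """Return the instance count to run for one application.

    Hypothetical sketch: the count changes only when the current number of
    instances cannot carry the load, or exceeds what the load plus a
    tolerance band (headroom) would justify.
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    if load < 0 or headroom < 0:
        raise ValueError("load and headroom must be non-negative")

    # Fewest instances that can carry the current load.
    needed = math.ceil(load / per_instance_capacity)
    # Most instances tolerated before idle capacity is reclaimed.
    tolerated = math.ceil(load * (1 + headroom) / per_instance_capacity)

    if current < needed:
        return needed       # load not satisfied: start instances
    if current > tolerated:
        return tolerated    # too much idle capacity: stop instances
    return current          # inside the band: avoid start/stop overhead
```

Under these assumptions, a small fluctuation in load that stays within the band leaves the instance count unchanged, so no application instance is started or stopped.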
One problem associated with automatically instantiating application processes in a server farm as the load for the applications fluctuates is that each server machine can run only a limited number of application processes. Request messages for a particular application are split among all instances of that application. Therefore, when application instances run on different servers, the size of a cluster directly affects the amount of load that the cluster can sustain without performance degradation.
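The relationship between cluster size and sustainable load can be sketched as follows. The sketch assumes, for simplicity, that requests are split evenly among instances and that every instance has the same capacity; neither assumption is stated above, and the function names are hypothetical.

```python
def per_instance_load(total_load: float, cluster_size: int) -> float:
    """Load seen by each instance, assuming an even split of requests."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return total_load / cluster_size


def sustainable_load(cluster_size: int, per_instance_capacity: float) -> float:
    """Largest total load the cluster can carry without degradation."""
    return cluster_size * per_instance_capacity


def is_degraded(total_load: float, cluster_size: int,
                per_instance_capacity: float) -> bool:
    """True when each instance receives more load than it can handle."""
    return per_instance_load(total_load, cluster_size) > per_instance_capacity
```

For example, with an assumed capacity of 100 requests per second per instance, a cluster of three instances sustains 300 requests per second, and any load above that value is reported as degraded.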
When the size of a cluster is insufficient, the application users experience performance degradation or failures, resulting in the violation of Service Level Agreements (SLAs). Currently, to avoid SLA violations, application providers generally overprovision the number of application instances to handle peak load. This results in poor resource utilization during normal operating conditions. Dynamic allocation alleviates the problem of wasted capacity by automatically reallocating servers among applications based on their current load and SLA objectives.
Most of the placement algorithms available today are centralized. A centralized approach generally does not have the capability to react immediately to changes that occur between two placement operations. In a centralized solution, a single controller often needs to handle constraints from several nodes. Moreover, each application typically requires a certain time to start or stop. During this time, the reconfiguration process can take most of the CPU power on the local machine and therefore can partially disrupt its service capability. A centralized solution typically needs an enhancement to schedule the changes in such a way that they do not happen at the same time, in order to avoid a drastic reduction in the overall processing power of the system.
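One possible form of the scheduling enhancement mentioned above can be sketched as follows. The sketch is illustrative only and does not describe any existing controller. The `Change` record, the `stagger` function, and the `max_concurrent` limit are hypothetical names, and the duration of each start or stop operation is assumed to be known in advance.

```python
import heapq
from typing import List, NamedTuple, Tuple


class Change(NamedTuple):
    node: str        # machine on which the change is applied
    app: str         # application whose instance is started or stopped
    action: str      # "start" or "stop"
    duration: float  # assumed time, in seconds, the change takes


def stagger(changes: List[Change],
            max_concurrent: int = 1) -> List[Tuple[float, Change]]:
    """Assign a start time to each change so that at most
    max_concurrent changes are in progress at any moment."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")

    finish_times: List[float] = []  # min-heap of finish times in progress
    schedule: List[Tuple[float, Change]] = []
    for change in changes:
        if len(finish_times) < max_concurrent:
            start = 0.0  # a slot is free immediately
        else:
            # Wait until the earliest in-progress change completes.
            start = heapq.heappop(finish_times)
        heapq.heappush(finish_times, start + change.duration)
        schedule.append((start, change))
    return schedule
```

With `max_concurrent` set to 1, three changes lasting 30, 10, and 20 seconds are assigned start times of 0, 30, and 40 seconds, so that only one machine at a time loses processing power to reconfiguration.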