The present invention relates to the management of pooled resources to support multiple clients and applications. When there are multiple applications, it is beneficial to have those applications share the same resources. This is because different applications may have different loads at different times, may have different memory configurations, or may otherwise consume resources in a variable way, which results in increased effective load capacity if the resources are shared. Sharing avoids the inefficiency of providing dedicated resources sized for each application's individual peak load, resources that will mostly be idle during non-peak periods (the so-called “assigned parking problem”). While assigning multiple applications to the same resources levels the peaks of normal variable loads, it does not handle the case where one application has an unexpected overload that cannot be handled by the resource pool.
This can occur, for example, when the consumers of the system self-provision resource-intensive requests, whose complexity may be unpredictable because the service may allow its users to create an arbitrary sequence of compound processing steps. The number of requests may also vary significantly due to a variety of factors, including daily, seasonal, or holiday cycles, or factors driven more directly by the user of the service, such as sales, advertising, or promotions. In this case we desire a configuration that retains the benefits of load balancing across shared resources, while at the same time limiting the exposure of one application to another application's overload.
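The trade-off described above can be illustrated with a small numerical sketch. The load figures, function names, and the choice of sizing the pool to the combined peak are assumptions made purely for illustration and are not taken from the invention itself.

```python
# Hypothetical loads (requests per second) for two applications over four periods.
load_a = [10, 80, 20, 10]
load_b = [70, 15, 10, 60]


def dedicated_capacity(loads):
    """Capacity needed if each application gets its own resources,
    each sized for that application's individual peak."""
    return sum(max(series) for series in loads)


def pooled_capacity(loads):
    """Capacity needed if the applications share one pool,
    sized for the peak of the combined load."""
    return max(sum(period) for period in zip(*loads))


def overloaded_periods(loads, capacity):
    """Indices of the periods in which the combined load exceeds the capacity."""
    return [i for i, period in enumerate(zip(*loads)) if sum(period) > capacity]


dedicated = dedicated_capacity([load_a, load_b])  # 80 + 70
pooled = pooled_capacity([load_a, load_b])        # combined peak

# An unexpected spike in application A during period 1 (assumed value).
load_a_spike = [10, 200, 20, 10]
exposed = overloaded_periods([load_a_spike, load_b], pooled)

print(dedicated, pooled, exposed)
```

With these assumed figures the shared pool needs less capacity than the dedicated configuration, but the spike in application A exceeds the pool in one period, and application B shares that overloaded period even though its own load is low.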
The present invention also relates to load balancing, whereby the requests to the service are managed through a single IP address but then distributed internally to one of a group of servers, or web farm. The web farm may be physical or virtual. Particularly as resources become virtualized, it is possible to dynamically create large virtual web farms. As these web farms become larger, it becomes both more difficult and more important to maintain cost-efficient configurations while still maintaining adequate resource headroom for peak loads. The present invention addresses the shortcomings of existing server management configurations, both to minimize the exposure of one application to another, through partial redundancy, and to limit the consumption of resources within an application. This allows the web service to be scaled to large numbers of client applications, with complex processing logic, and dynamic and unexpected loads.
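One possible reading of partial redundancy is that each application is served by only a subset of the web farm, with subsets that overlap but do not coincide. The following is a minimal sketch under that assumption; the sliding-window assignment rule, the function names, and the server labels are hypothetical and are not stated in the text above.

```python
def assign_subsets(apps, servers, k):
    """Assign each application k consecutive servers (wrapping around),
    shifting the window by one server per application.
    This assignment rule is an illustrative assumption."""
    n = len(servers)
    return {
        app: [servers[(i + j) % n] for j in range(k)]
        for i, app in enumerate(apps)
    }


def unaffected_servers(assignment, overloaded_app):
    """For every other application, list the servers it can still use
    that are not shared with the overloaded application."""
    hot = set(assignment[overloaded_app])
    return {
        app: [s for s in assigned if s not in hot]
        for app, assigned in assignment.items()
        if app != overloaded_app
    }


servers = ["s0", "s1", "s2", "s3"]
assignment = assign_subsets(["A", "B", "C", "D"], servers, k=2)
survivors = unaffected_servers(assignment, "A")

print(assignment)
print(survivors)
```

In this sketch an overload in application A can saturate only the two servers assigned to A, so every other application keeps at least one server outside A's subset, while each application still shares servers with its neighbors and so retains some of the benefit of pooling.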