1. Field of the Invention
The present invention relates generally to compute-resources (sets of servers that are logically and physically isolated from one another for the purpose of security and dedicated usage) and methods for allocating servers between compute-resources based on a new capacity threshold. More specifically, the present invention relates to a method for setting capacity thresholds, monitoring the computation load on each compute-resource, and reallocating servers when thresholds are exceeded.
2. Background Description
Compute-resources are commonly used for applications that support large numbers of users and for those that are central processing unit (CPU) intensive and highly parallelizable. Examples of such compute-resources include web applications hosted by Internet service providers (ISPs) and many scientific applications in areas such as computational fluid dynamics. Often in such computing environments, load can vary greatly over time, and the peak-to-average load ratios are large (e.g., 10:1 or 20:1). When the load on a customer site drops below a threshold level, one of its servers is quiesced (removed from service), “scrubbed” of any residual customer data, and assigned to a “free-pool” of servers that are ready to be assigned. Later, when the load on another customer exceeds some trigger level, a server from the free-pool is primed with the necessary operating system (OS), applications, and data to acquire the personality of that customer application. Currently, there are few systems that support dynamic allocation of servers. Those that do exist depend on manually derived thresholds and measures of normal behavior to drive changes in resource allocation. There are no automated, effective, and efficient methods for determining when a particular compute-resource is overloaded or underloaded that are relatively independent of application modifications.
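The free-pool cycle described above (quiesce, scrub, and release on low load; prime and assign on high load) can be illustrated with a minimal sketch. It is not taken from any existing system: the names `Server`, `ComputeResource`, and `rebalance`, the use of a single scalar load figure, the two per-resource thresholds, and the rule that a resource always keeps at least one server are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    name: str
    # Customer whose OS, applications, and data are loaded; None when scrubbed.
    personality: Optional[str] = None


@dataclass
class ComputeResource:
    customer: str
    low_threshold: float   # hypothetical: release a server below this load
    high_threshold: float  # hypothetical: acquire a server above this load
    servers: list = field(default_factory=list)


def rebalance(resource, load, free_pool):
    """Apply one threshold check to `resource` and return the action taken."""
    # Assumption: a resource never gives up its last server.
    if load < resource.low_threshold and len(resource.servers) > 1:
        server = resource.servers.pop()   # quiesce: remove from service
        server.personality = None         # scrub residual customer data
        free_pool.append(server)          # ready to be assigned elsewhere
        return "released"
    if load > resource.high_threshold and free_pool:
        server = free_pool.popleft()
        server.personality = resource.customer  # prime with OS, apps, data
        resource.servers.append(server)
        return "acquired"
    return "unchanged"
```

In this sketch the thresholds are supplied by hand, which mirrors the manually derived thresholds that existing systems are said to depend on.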
Parallel-computing and server-farm facilities would benefit greatly from an automatic method for monitoring the available capacity of each compute-resource and allocating servers accordingly. Such a system would use servers more efficiently, allowing groups of compute-resources to provide consistent performance with fewer total servers. Such a system would be particularly applicable to large ISPs, which typically have many compute-resources that each experience significant changes in computing load.