A number of service providers rely on fleets of virtual server instances that run applications on third-party provider hardware. An example of such virtual servers is the Elastic Compute Cloud (EC2) service provided by Amazon, Inc. The service provider may run a certain number of virtual server instances depending on the amount of traffic the applications are expected to service. A load balancer, such as an elastic load balancer (ELB), may distribute application load among the virtual server instances.
The third-party provider may allow the service provider to dynamically change the number of virtual server instances being run at any given time. This capability is sometimes used to respond to spikes in traffic by quickly adding a new virtual server instance to handle the increased application load. For example, EC2 provides a service referred to as auto scaling groups (ASG), which automatically sets up a new EC2 instance when the traffic load exceeds a predefined threshold.
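The threshold-based behavior described above may be illustrated with a minimal sketch. The function name, its parameters, and the one-instance-per-evaluation rule are assumptions made for illustration only; they do not represent the interface of any actual auto scaling service.

```python
def desired_instance_count(current_count: int,
                           traffic_load: float,
                           threshold: float) -> int:
    """Return the fleet size after one scaling evaluation.

    Hypothetical rule: add exactly one instance when the measured
    traffic load exceeds the predefined threshold; otherwise leave
    the fleet size unchanged.
    """
    if current_count < 0:
        raise ValueError("current_count must be non-negative")
    if traffic_load > threshold:
        return current_count + 1
    return current_count
```

For example, under these assumptions a fleet of four instances with a measured load of 120 units against a threshold of 100 units would grow to five instances, while a load of 80 units would leave the fleet at four.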
When a new virtual server instance is started, there is a delay before the instance can begin servicing requests. During this time, the instance may be starting a number of services to allow the instance to operate. Typical startup times may approach or exceed 50 minutes, and while in this startup phase the instance is not capable of responding to system health checks from the load balancer. Consequently, the load balancer is not able to assign a portion of the application load to the instance during startup. Because of these long startup times, the efficacy of dynamically expanding a virtual server pool in response to traffic spikes is reduced.
Moreover, during the instance's startup time, the load balancer may be precluded from adding further instances to the virtual server fleet. Thus, the amount of time required to provision sufficient virtual server instances to handle a traffic spike may be many multiples of the already-long instance startup time.
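The compounding delay described above may be illustrated with simple arithmetic. The sketch below assumes the worst case, in which only one instance can be starting at any time, so that each additional instance must wait for the previous one to finish starting. The function name and parameters are hypothetical and are used for illustration only.

```python
def sequential_provisioning_minutes(instances_needed: int,
                                    startup_minutes: float) -> float:
    """Total time to bring up instances one after another.

    Assumes (hypothetically) that no instance can begin starting
    until the previous instance has completed its startup phase.
    """
    if instances_needed < 0:
        raise ValueError("instances_needed must be non-negative")
    if startup_minutes < 0:
        raise ValueError("startup_minutes must be non-negative")
    return instances_needed * startup_minutes
```

Under this assumption, a traffic spike requiring four additional instances, each with a 50-minute startup time, would take 4 x 50 = 200 minutes to provision fully.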