The servicing of electronic requests can require varying amounts of resources. Request servicing, for instance, can range in scale from tiny, stateless computations to long-running, massively parallel applications. Servicing a request often requires only a limited amount of computing resources, frequently much less than the computer systems used to service the requests have available. As a result, computing resources often go underutilized and, generally, conventional techniques for processing requests have numerous inefficiencies. Virtualization, in many regards, has improved the way computing resources are utilized by, for instance, allowing a single physical computer system to implement multiple simultaneously operating virtual computer systems, thereby providing resizable capacity that makes it easy for a developer to elastically scale upwards.
Conventional virtualization techniques, however, are subject to fundamental limitations on the ability of a developer to scale compute downwards, due to the resources required to service a request and the amortization of costs for spinning up and tearing down a virtual computer system (instance). Practical implementations of service virtualization generally rely on an expectation that the workload will have a tenancy of minutes, hours, or even longer. With many applications, however, a virtual computer system may be used relatively infrequently. For the virtual computer system to be able to service requests, it must nevertheless be maintained in an operational state, which requires computing resources for the computer system's operating system as well as other resources (e.g., network resources). When such computer systems are underutilized, at least some of the resources allocated to those computer systems are generally unavailable for other uses.