In a typical Web service or remote procedure call (RPC) service, a pool of worker threads services a queue of incoming requests, performing the work that is required for the request and responding to the requester when the work is complete. When services of this kind are deployed in a service architecture, it is typical for such service threads to be blocked while waiting on calls to downstream services, requests for database connections, access to input/output mechanisms (e.g., I/O channels), and/or on other constrained resources on which they depend (e.g., computational resources or memory). When these constrained resources become unavailable or slow to respond, service requests that require them tend to cause all of the available service threads in the system to become blocked while waiting for these resources.
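The blocking behavior described above can be illustrated with a minimal sketch. It is not taken from any particular service implementation: the pool size, the function names, and the use of a `threading.Event` to stand in for an unavailable downstream resource are all assumptions made for illustration. The sketch shows that once every worker thread is waiting on the constrained resource, even a request that does not need that resource cannot be serviced.

```python
import threading
import concurrent.futures

POOL_SIZE = 4  # hypothetical number of service threads


def demo_pool_exhaustion():
    # Stands in for a constrained downstream resource (assumption for this sketch).
    downstream_ready = threading.Event()

    def slow_request():
        # The worker thread blocks here until the downstream resource responds.
        downstream_ready.wait()
        return "slow done"

    def fast_request():
        # This request needs no constrained resource at all.
        return "fast done"

    with concurrent.futures.ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        # Submit enough slow requests to occupy every worker thread.
        slow = [pool.submit(slow_request) for _ in range(POOL_SIZE)]
        fast = pool.submit(fast_request)
        try:
            fast.result(timeout=0.2)
            starved = False
        except concurrent.futures.TimeoutError:
            # No free worker was available to pick up the fast request.
            starved = True
        # The resource recovers, so the blocked workers can finish.
        downstream_ready.set()
        recovered = fast.result(timeout=5)
        slow_results = [f.result(timeout=5) for f in slow]
    return starved, recovered, slow_results
```

In this sketch the fast request completes only after the simulated resource becomes available again, which mirrors how one slow dependency can stall unrelated requests in a service that uses a fixed-size thread pool.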
Typical approaches to solving this problem include the use of asynchronous input/output mechanisms, timeouts for in-flight requests, and timeouts around calls that may become slow or that may block other services. While these approaches are typically effective, they are difficult to add to existing services, add complexity to the service implementation, make system performance harder to predict, and can significantly reduce throughput. Another common approach is admission control for requests, which prevents too many requests from being accepted, e.g., by throttling.
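Two of the approaches mentioned above, a timeout around a potentially slow call and admission control by throttling, might be sketched as follows. The class name `AdmissionController`, the helper `call_with_timeout`, and the limits used in `demo` are hypothetical and chosen only for illustration. Note that the timeout here only stops the caller from waiting; the worker thread that runs the call remains occupied until the call returns, which is one source of the added complexity.

```python
import threading
import concurrent.futures


class AdmissionController:
    """Rejects new requests once max_in_flight are already being served."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_admit(self):
        # Non-blocking: returns False immediately when no slot is free.
        return self._slots.acquire(blocking=False)

    def release(self):
        # Called when an admitted request has finished.
        self._slots.release()


def call_with_timeout(pool, fn, timeout_s, default=None):
    """Runs fn on pool and stops waiting after timeout_s seconds."""
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # cancel() only helps if fn has not started; a running call keeps its thread.
        future.cancel()
        return default


def demo():
    gate = AdmissionController(max_in_flight=2)
    decisions = [gate.try_admit() for _ in range(3)]  # the third request is throttled
    gate.release()  # one admitted request completes
    after_release = gate.try_admit()

    stuck = threading.Event()  # simulates a call that does not return in time
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        timed_out = call_with_timeout(pool, stuck.wait, 0.1, default="timed out")
        stuck.set()  # let the worker finish so later calls can run
        ok = call_with_timeout(pool, lambda: "ok", 5)
    return decisions, after_release, timed_out, ok
```

A caller would typically check `try_admit()` before queuing a request and call `release()` in a `finally` block once the request has been handled.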
While the technology described herein is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the disclosure to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present disclosure as defined by the appended claims.