A server is designed to provide services in response to requests from clients. A common approach that enables a server to process several requests concurrently is the use of a thread pool, which includes a quantity of threads created to perform tasks. The threads in a thread pool are a limited resource, and typically there are many more tasks than threads. If the quantity of threads in a thread pool is too low, there is less concurrency, potentially reducing overall request processing throughput. On the other hand, if the quantity of threads in a thread pool is too large, more time is wasted on context switches among threads and there is a greater chance of lock contention (e.g., threads competing for exclusive access to the same resources). The result, again, is a decrease in server throughput.
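By way of illustration only, the relationship between the quantity of threads and the quantity of tasks may be sketched as follows. The sketch assumes Python's standard `concurrent.futures.ThreadPoolExecutor`; the names `POOL_SIZE`, `TASK_COUNT`, `handle_request`, and `run_pool`, as well as the chosen values, are hypothetical and are not taken from any particular server.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4    # hypothetical quantity of threads in the pool
TASK_COUNT = 12  # hypothetical quantity of tasks (more tasks than threads)

_lock = threading.Lock()
_running = 0  # tasks executing right now
_peak = 0     # highest number of tasks observed executing at once


def handle_request(request_id: int) -> int:
    """Stand-in for request processing; records how many tasks overlap."""
    global _running, _peak
    with _lock:
        _running += 1
        _peak = max(_peak, _running)
    time.sleep(0.01)  # simulated work
    with _lock:
        _running -= 1
    return request_id


def run_pool(pool_size: int = POOL_SIZE, task_count: int = TASK_COUNT):
    """Run task_count tasks on pool_size threads.

    Returns the task results and the peak number of tasks that ran at once.
    """
    global _running, _peak
    _running = _peak = 0
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(handle_request, range(task_count)))
    return results, _peak


if __name__ == "__main__":
    results, peak = run_pool()
    print(f"completed {len(results)} tasks, peak concurrency {peak}")
```

In this sketch the peak concurrency can never exceed the pool size, so the remaining tasks wait in the pool's queue until a thread becomes free.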
When an application uses a single thread pool to serve several associated sub-applications, all of the sub-applications may experience a processing delay even when only a few of them are causing the delay. Further, when threads associated with one sub-application attempt to interact with a sub-application experiencing a delay, these threads may be blocked until the delay is resolved. As a result, performance of the application declines, causing the user experience to degrade.
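The shared-pool delay described above may be reproduced, as a minimal sketch under stated assumptions, with a small pool whose threads are all occupied by one delayed sub-application. The function names `slow_sub_application`, `fast_sub_application`, and `demonstrate_shared_pool_delay`, the pool size of two, and the 0.2 second wait are hypothetical choices made only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def slow_sub_application(started: threading.Event, release: threading.Event) -> str:
    """Hypothetical delayed sub-application: holds its thread until released."""
    started.set()
    release.wait()
    return "slow done"


def fast_sub_application() -> str:
    """Hypothetical healthy sub-application: would finish immediately."""
    return "fast done"


def demonstrate_shared_pool_delay(pool_size: int = 2):
    """Show a fast task stalling because slow tasks hold every shared thread."""
    release = threading.Event()
    with ThreadPoolExecutor(max_workers=pool_size) as shared_pool:
        try:
            # Occupy every thread of the shared pool with delayed work.
            started = [threading.Event() for _ in range(pool_size)]
            slow = [
                shared_pool.submit(slow_sub_application, s, release)
                for s in started
            ]
            for s in started:
                s.wait()

            # The fast task is queued behind the delayed ones.
            fast = shared_pool.submit(fast_sub_application)
            try:
                fast.result(timeout=0.2)
                delayed = False
            except FutureTimeout:
                delayed = True
        finally:
            # Resolve the delay so the pool can drain and shut down.
            release.set()
        return delayed, fast.result(), [f.result() for f in slow]


if __name__ == "__main__":
    delayed, fast_result, slow_results = demonstrate_shared_pool_delay()
    print(f"fast task delayed: {delayed}; results: {fast_result}, {slow_results}")
```

The fast task performs no blocking work of its own, yet it cannot start until the delayed sub-application releases a thread, which mirrors the application-wide slowdown described above.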