This invention relates generally to computer systems, and more particularly to a mechanism for evaluating requests prior to disposition in a multi-threaded environment.
Many of today's high capacity computers, such as web servers, are required to process a large number of requests concurrently. To enable them to do so efficiently, many of these computer systems implement a technique known as multi-threading. In a multi-threaded system, there is allocated within a single process space a plurality of “threads” of execution. Each thread of execution has its own private stack, and each thread is used to execute a specific set of computer code at a time. During execution, each thread uses its stack to maintain state and other information specific to that thread of execution. This thread-specific information cannot be accessed or altered by other threads. As a result, each thread can execute code independent of other threads. It is this ability of threads to execute code independently that makes it possible for multiple threads to service multiple requests concurrently. While each thread can maintain its own set of private information, each thread can also share information with other threads within the same process space. This information sharing is carried out much more easily than in multi-processing systems (where inter-process communication is needed). These two properties, among others, make multi-threading an advantageous mechanism for concurrently servicing multiple requests.
In a typical multi-threaded system, multi-threading is implemented by first giving rise to a process space (e.g. by running an instance of a particular program, such as a web server program). Then, a plurality of threads (referred to herein as a thread pool) are allocated within that process space. In allocating the threads, each thread is given a unique thread ID, and each thread is allocated a stack having a particular size, where all of the threads within the thread pool are given the same stack size. Once the process space is created and the thread pool is allocated, the system is ready to service requests. When a request is received, the system determines whether the thread pool has an available thread. If so, then a thread is assigned to the request, and that thread is used to service the request. By servicing a request, it is meant that a set of code is executed to carry out the functions needed to satisfy the request. The execution of the set of code is carried out using the assigned thread and its associated stack. Multiple requests can be serviced concurrently; thus, if another request is received, then another thread is assigned to that other request, and that other thread is used to service the request. The two threads will execute independent of each other. As a result, the two requests can be serviced concurrently.
At some point, all of the execution that needs to be done to satisfy a request is completed. Once that point is reached, the thread assigned to the request is returned to the thread pool, and is thereafter free to be used to service another request. In the manner described, threads are assigned from the thread pool when needed, and threads are returned to the thread pool when servicing is completed.
The current methodology for implementing multi-threading described above is effective when all of the services demanded by the requests have substantially the same requirements. However, when the requirements of the services differ substantially, the current methodology can lead to inefficiencies. To illustrate this problem, suppose that a system is required to provide two different types of services in response to requests: (1) a lightweight service (such as an HTML static file retrieval service provided by a web server); and (2) a heavyweight service (such as a JAVA-type service). For the lightweight service, it would be optimal to have a large number of threads, with each thread having a small stack. The large number of threads allows a large number of requests to be serviced concurrently, while the small stacks conserve memory. In contrast, for the heavyweight service, it would be optimal to have a small number of threads, with each thread having a large stack. The small number of threads prevents the system from being overburdened by heavyweight requests, while the large stack size is needed for proper execution of the heavyweight service. Clearly, the requirements of these service types conflict.
To accommodate both in a single system, the current methodology has to reach a compromise. Typically, the compromise is a combination of the extremes, namely, a thread pool with a small number of threads, with each thread having a large stack size. On the positive side, this compromise ensures that even in the worst case scenario, where all of the threads are used for heavyweight services, the system will still function adequately. On the negative side, though, this compromise leads to inefficiencies. The small number of threads unnecessarily limits the number of lightweight services that can be provided at any one time, and the large stack size causes memory waste (the lightweight services do not need large stacks). As this discussion illustrates, the current methodology sacrifices efficiency in the general case to ensure proper operation in the worst case, which clearly is an undesirable result. To achieve greater system efficiency, an improved mechanism for implementing multi-threading is needed.