This invention relates generally to computer systems, and more particularly to a mechanism for obtaining a thread from, and returning a thread to, a thread pool without attaching and detaching.
Many of today's high capacity computers, such as web servers, are required to process a large number of requests concurrently. To enable them to do so efficiently, many of these computer systems implement a technique known as multi-threading. In a multi-threaded system, there is allocated within a single process space a plurality of "threads" of execution. Each thread of execution has its own private stack, and each thread is used to execute a specific set of computer code at a time. During execution, each thread uses its stack to maintain state and other information specific to that thread of execution. This thread-specific information cannot be accessed or altered by other threads. As a result, each thread can execute code independent of other threads. It is this ability of threads to execute code independently that makes it possible for multiple threads to service multiple requests concurrently. While each thread can maintain its own set of private information, each thread can also share information with other threads within the same process space. This information sharing is carried out much more easily than in multi-processing systems (where inter-process communication is needed). These two properties, among others, make multi-threading an advantageous mechanism for concurrently servicing multiple requests.
In a typical multi-threaded system, multi-threading is implemented by first giving rise to a process space (e.g. by running an instance of a particular program, such as a web server program). Then, a plurality of threads (referred to herein as a thread pool) are allocated within that process space. In allocating the threads, each thread is given a unique thread ID, and each thread is allocated a stack having a particular size, where all of the threads within the thread pool are given the same stack size. Once the process space is created and the thread pool is allocated, the system is ready to service requests. When a request is received, the system determines whether the thread pool has an available thread. If so, then a thread is assigned to the request, and that thread is used to service the request. By servicing a request, it is meant that a set of code is executed to carry out the functions needed to satisfy the request. The execution of the set of code is carried out using the assigned thread and its associated stack. Multiple requests can be serviced concurrently; thus, if another request is received, then another thread is assigned to that other request, and that other thread is used to service the request. The two threads will execute independent of each other. As a result, the two requests can be serviced concurrently.
At some point, all of the execution that needs to be done to satisfy a request is completed. Once that point is reached, the thread assigned to the request is returned to the thread pool, and is thereafter free to be used to service another request. In the manner described, threads are assigned from the thread pool when needed, and threads are returned to the thread pool when servicing is completed.
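By way of illustration only, the assign-and-return cycle described above may be sketched with a standard Java fixed-size executor. The class name SinglePoolSketch, the method serviceAll, and the request strings are hypothetical names chosen for this sketch; it is not the implementation of any particular server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: a single fixed-size pool shared by every request.
class SinglePoolSketch {

    // Services requestCount requests on a pool of poolSize threads and
    // returns the results in submission order.
    static List<String> serviceAll(int poolSize, int requestCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < requestCount; i++) {
                final int id = i;
                // A thread from the pool is assigned to the request; when the
                // task completes, the thread is free to service another request.
                pending.add(pool.submit(() -> "request-" + id + " done"));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get()); // wait for servicing to complete
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(serviceAll(4, 10));
    }
}
```

In this sketch, requests beyond the pool size wait in the executor's queue until a thread is returned to the pool.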
The current methodology for implementing multi-threading described above is effective when all of the services demanded by the requests have substantially the same requirements. However, when the requirements of the services differ substantially, the current methodology can lead to inefficiencies. To illustrate this problem, suppose that a system is required to provide two different types of services in response to requests: (1) a lightweight service (such as an HTML static file retrieval service provided by a web server); and (2) a heavyweight service (such as a JAVA-type service). For the lightweight service, it would be optimal to have a large number of threads, with each thread having a small stack. The large number of threads allows a large number of requests to be serviced concurrently, while the small stacks conserve memory. In contrast, for the heavyweight service, it would be optimal to have a small number of threads, with each thread having a large stack. The small number of threads prevents the system from being overburdened by heavyweight requests, while the large stack size is needed for proper execution of the heavyweight service. Clearly, the requirements of these service types conflict.
To accommodate both in a single system, the current methodology has to reach a compromise. Typically, the compromise is a combination of the extremes, namely, a thread pool with a small number of threads, with each thread having a large stack size. On the positive side, this compromise ensures that even in the worst case scenario, where all of the threads are used for heavyweight services, the system will still function adequately. On the negative side, though, this compromise leads to inefficiencies. The small number of threads unnecessarily limits the number of lightweight services that can be provided at any one time, and the large stack size causes memory waste (the lightweight services do not need large stacks). As this discussion illustrates, the current methodology sacrifices efficiency in the general case to ensure proper operation in the worst case, which clearly is an undesirable result. To achieve greater system efficiency, an improved mechanism for implementing multi-threading is needed.
The present invention provides a more efficient mechanism for implementing multi-threading in a computer system. The present invention is based, at least partially, upon the observation that the inefficiencies of the current methodology stem from the fact that only one thread pool is used. With just one thread pool, it is necessary to make the compromise discussed above. However, if a plurality of thread pools is implemented, with each thread pool customized for one or more particular types of service, then no compromise is needed. When one type of service is needed, a thread from the customized pool associated with that type of service is used. When another type of service is needed, a thread from the customized pool associated with that other type of service is used. There is no need to use one type of thread (e.g. a heavyweight thread) when another type of thread (e.g. a lightweight thread) is needed. By implementing multiple thread pools, the present invention eliminates many if not all of the inefficiencies of the current methodology.
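The trade-off may be made concrete with a small calculation. All thread counts and stack sizes below are hypothetical figures chosen only to illustrate the comparison between one compromise pool and two customized pools; they are not taken from any measured system.

```java
// Illustrative arithmetic only; every figure here is an assumption.
class StackBudgetSketch {

    // Total stack memory reserved by a pool, in KiB.
    static long reservedKiB(int threads, long stackKiB) {
        return (long) threads * stackKiB;
    }

    public static void main(String[] args) {
        // One compromise pool: few threads, each with a large stack.
        long compromise = reservedKiB(16, 1024);

        // Two customized pools: many small-stack threads for lightweight
        // services, few large-stack threads for heavyweight services.
        long lightweight = reservedKiB(128, 64);
        long heavyweight = reservedKiB(4, 1024);

        System.out.println("compromise pool:  " + compromise + " KiB for 16 threads");
        System.out.println("customized pools: " + (lightweight + heavyweight)
                + " KiB for " + (128 + 4) + " threads");
    }
}
```

Under these assumed figures, the two customized pools reserve 12288 KiB for 132 threads, whereas the single compromise pool reserves 16384 KiB for 16 threads.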
In light of this observation, there is provided an improved mechanism for servicing requests in a multi-threaded system. Initially, a plurality of thread pools is allocated within a process space, with each thread pool comprising one or more threads. Each thread pool has a set of characteristics associated therewith, and in one embodiment, the characteristics of each thread pool are customized for one or more particular types of service. The characteristics of a thread pool may include but are not limited to: (1) the maximum number of threads that can be allocated in that thread pool; (2) the stack size of each thread within that thread pool; and (3) optionally, whether each thread in that thread pool has additional private storage. These characteristics may be set such that they are optimal for particular types of services. For example, for a thread pool customized for lightweight services, the characteristics may be set such that the number of threads in the thread pool is large, and the stack size is small. In contrast, for a thread pool customized for heavyweight services, the characteristics may be set such that the number of threads is small, and the stack size is large. Each thread pool may have its characteristics customized for one or more types of service.
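One possible way to express per-pool characteristics is sketched below using the standard Java ThreadPoolExecutor together with the Thread constructor that accepts a requested stack size. The names CustomPools, PoolConfig, allocate, and workerName are hypothetical, as are the sizes used in main. The stack size passed to the Thread constructor is only a suggestion to the virtual machine, which may round or ignore it on some platforms. The optional private storage characteristic is omitted from this sketch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: pools whose thread count and stack size are set per pool.
class CustomPools {

    // Hypothetical holder for the characteristics of one thread pool.
    static final class PoolConfig {
        final String name;
        final int maxThreads;
        final long stackSizeBytes;

        PoolConfig(String name, int maxThreads, long stackSizeBytes) {
            this.name = name;
            this.maxThreads = maxThreads;
            this.stackSizeBytes = stackSizeBytes;
        }
    }

    // Allocates a pool whose threads are created with the requested stack size.
    static ThreadPoolExecutor allocate(PoolConfig cfg) {
        AtomicInteger nextId = new AtomicInteger();
        ThreadFactory factory = task -> new Thread(
                null, task, cfg.name + "-" + nextId.getAndIncrement(), cfg.stackSizeBytes);
        return new ThreadPoolExecutor(cfg.maxThreads, cfg.maxThreads,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), factory);
    }

    // Runs a trivial task on the pool and reports which thread executed it.
    static String workerName(ThreadPoolExecutor pool) {
        try {
            return pool.submit(() -> Thread.currentThread().getName()).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed sizes: many small-stack threads versus few large-stack threads.
        ThreadPoolExecutor light = allocate(new PoolConfig("light", 64, 256L * 1024));
        ThreadPoolExecutor heavy = allocate(new PoolConfig("heavy", 4, 2L * 1024 * 1024));
        System.out.println(workerName(light));
        System.out.println(workerName(heavy));
        light.shutdown();
        heavy.shutdown();
    }
}
```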
After the thread pools have been allocated, the system is ready to service requests. When a request is received, it is processed to determine with which thread pool the request is to be associated. In one embodiment, this processing is carried out by determining the type of service being requested by the request, and then determining which thread pool is associated with that type of service. In another embodiment, this processing is carried out by extracting a set of indication information (e.g. a universal resource identifier) from the request, and then determining which thread pool is associated with that set of indication information. Once the proper thread pool is determined, a thread from that thread pool is used to carry out servicing of the request. By servicing the request, it is meant that a set of code is executed to carry out the functions needed to satisfy the request. The execution of the set of code is carried out using the assigned thread and its associated stack. In this manner, the request is serviced. Because the request is serviced using a thread from the thread pool customized for the type of service being requested, the servicing of the request is optimized. This in turn optimizes system performance.
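The association between indication information and a thread pool may be sketched as a simple prefix lookup on the universal resource identifier. The class PoolRouter, its methods, the pool names, and the URI prefixes are all hypothetical; an actual embodiment may associate requests with pools in any suitable manner.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: choose a pool name from the URI of a request.
class PoolRouter {

    // URI prefix to pool name; earlier registrations take precedence.
    private final Map<String, String> prefixToPool = new LinkedHashMap<>();
    private final String defaultPool;

    PoolRouter(String defaultPool) {
        this.defaultPool = defaultPool;
    }

    // Associates every URI starting with uriPrefix with the named pool.
    PoolRouter route(String uriPrefix, String poolName) {
        prefixToPool.put(uriPrefix, poolName);
        return this;
    }

    // Returns the pool associated with the URI, or the default pool if none matches.
    String poolFor(String uri) {
        for (Map.Entry<String, String> entry : prefixToPool.entrySet()) {
            if (uri.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return defaultPool;
    }

    public static void main(String[] args) {
        PoolRouter router = new PoolRouter("lightweight")
                .route("/servlet/", "heavyweight");
        System.out.println(router.poolFor("/index.html"));
        System.out.println(router.poolFor("/servlet/cart"));
    }
}
```

Once the pool name is determined, the request would be handed to a thread from that pool for servicing.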
One of the thread pools that may be defined and allocated within a process space is a thread pool associated with JAVA type services. Because threads from this thread pool may be used to execute JAVA type applications, they are subject to more processing than most other threads. In particular, they are subject to regular attachment and detachment from a JAVA virtual machine (JVM). Typically, a thread is attached to the JVM prior to being used to execute any JAVA application, and that same thread is detached from the JVM once execution of the JAVA application 132 is completed. This process of constantly attaching and detaching threads from the JVM is inefficient. To increase system efficiency, the present invention further provides a "sticky attach" mechanism. With this mechanism, it is possible to return a thread to the JAVA associated thread pool without detaching the thread from the JVM. Because the thread is returned to the thread pool without detaching, it can be retrieved from the thread pool and used again to execute a JAVA application without reattaching. This allows a thread to be attached to the JVM just once and used to execute an unlimited number of JAVA applications. By eliminating the need to attach and detach a thread each time a JAVA application is executed, the present invention significantly reduces the amount of overhead incurred. This in turn significantly increases the efficiency of the system.
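The saving may be illustrated with a simulation that counts attach operations. This sketch does not attach native threads to a real JVM; the per-thread flag and the counter are hypothetical stand-ins for the attach and detach operations, and the names StickyAttachSketch and countAttaches are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative simulation only: counts how often a pooled thread must "attach".
class StickyAttachSketch {

    // Runs taskCount tasks on poolSize threads and returns the number of
    // simulated attach operations. When sticky is true, a thread is returned
    // to the pool still attached; otherwise it detaches after every task.
    static int countAttaches(int poolSize, int taskCount, boolean sticky) {
        AtomicInteger attaches = new AtomicInteger();
        ThreadLocal<Boolean> attached = ThreadLocal.withInitial(() -> Boolean.FALSE);
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                pending.add(pool.submit(() -> {
                    if (!attached.get()) {
                        attaches.incrementAndGet();  // simulated attach
                        attached.set(Boolean.TRUE);
                    }
                    // The application would execute here.
                    if (!sticky) {
                        attached.set(Boolean.FALSE); // simulated detach
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get();
            }
            return attaches.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("attach per task: " + countAttaches(1, 5, false));
        System.out.println("sticky attach:   " + countAttaches(1, 5, true));
    }
}
```

With a single pooled thread and five tasks, the simulation attaches five times when detaching after each task, and once when the thread is returned to the pool still attached.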