1. Field of the Invention
This invention relates to low-overhead threads in a high-concurrency system, such as for a networked cache or file server.
2. Related Art
In many computing systems, it is desirable in certain circumstances to be able to process, relatively simultaneously (such as in parallel), a relatively large number of similar tasks. For example, the same or similar tasks could be performed by a server device (such as a file server) in response to requests by a number of client devices. One such circumstance is in a networked cache or file server, which maintains and processes a relatively large number of sequences of requests (sometimes called “connections”), so as to couple an information requester (such as a web client) to one or more information providers, which are also coupled to the same internetworking system. One known method in which an individual processor or a multiprocessor system is able to maintain a high degree of concurrency is for the system to process each connection using a separate processing thread. A “thread” is a locus of control within a process, indicating a spot within that process that the processor is then currently executing. In general, a thread has a relatively small amount of state information associated therewith, generally consisting only of a calling stack and a relatively small number of local variables.
High concurrency systems, such as networked caches and file servers used in an internetworking system, must generally maintain a large number of threads. Each information requester has its own separate connection for which the networked cache or file server must maintain some amount of state information. Each such separate connection requires only a small amount of state information, such as approximately 100 to 200 bytes of information. Since there are in many cases a relatively large number of individual connections, it would be desirable to be able to maintain state information about each such connection using only a relatively minimal amount of memory and processor overhead, while simultaneously maintaining both relatively reliable programmability and relatively high processing speed.
One problem with known systems is that allocation of state information for individual threads does not generally scale well. One of the problems with relatively large numbers of individual threads is that of allocating memory space for a calling stack for each one of those threads. In a first set of known systems, stack space for individual threads is allocated statically; this has the drawback that relatively large numbers of threads require a relatively large amount of memory to maintain all such stack spaces. Although the amount of stack space statically allocated for each individual thread can be reduced significantly, this has the drawback that operations that can be performed by each individual thread are similarly significantly restricted. In a second set of known systems, stack space for individual threads is allocated dynamically; this has the drawback that the minimum size for dynamic allocation of memory is generally measured in kilobytes, resulting in substantial unnecessary memory overhead. Although virtual memory can be used to store and retrieve stack space for individual threads in smaller increments, this has the drawback that compression and decompression of stack space for individual threads imposes substantial unnecessary processor overhead. In a third set of known systems, such as those using the Java programming language, dynamic memory allocation is used to store and retrieve stack space for individual threads; this has the drawback that each procedure call within each thread imposes substantial unnecessary processor overhead.
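The scaling problem described above can be illustrated with simple arithmetic. The following C sketch is only an illustration under stated assumptions: the connection count (10,000) and the per-thread stack size (4096 bytes) are hypothetical figures chosen for the example, while the 200-byte figure is the upper end of the per-connection state range given above. The function names are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical figures for illustration only. */
#define CONNECTIONS 10000u /* assumed number of simultaneous connections */
#define STATE_BYTES 200u   /* upper end of the 100 to 200 byte range in the text */
#define STACK_BYTES 4096u  /* assumed kilobyte-scale stack allocation per thread */

/* Total memory needed when every connection gets bytes_per_connection bytes. */
size_t total_bytes(size_t connections, size_t bytes_per_connection)
{
    return connections * bytes_per_connection;
}

/* How many times larger the allocated stack is than the state actually needed. */
double overhead_ratio(size_t allocated_bytes, size_t needed_bytes)
{
    assert(needed_bytes != 0); /* avoid division by zero */
    return (double)allocated_bytes / (double)needed_bytes;
}
```

Under these assumed figures, total_bytes(CONNECTIONS, STATE_BYTES) is 2,000,000 bytes, while total_bytes(CONNECTIONS, STACK_BYTES) is 40,960,000 bytes, so roughly twenty times more memory is reserved than the connection state itself requires.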
An additional problem is introduced by the particular use made of multi-threading by the WAFL file system (as described in the Incorporated Disclosures). In the WAFL file system, the C language “setjmp” and “longjmp” routines are combined with message passing among threads so as to support high concurrency using threads. In particular, the requester of an initial file request to the WAFL file system packages the request in a message, which the WAFL file system processes using ordinary procedural program code, so long as data is available for processing the request and the thread need not have its execution suspended. If the thread is suspended for any reason (such as when a resource is not available), the WAFL file system: (1) requests the needed resource, (2) queues the message for signaling when the resource is available, and (3) calls the C routine “longjmp” to return to the origin of the routine for processing the message. Thus, the WAFL file system restarts processing the entire message from the very beginning until all needed resources are available and processing can complete without suspension. While this use of multithreading by the WAFL file system has the advantage that programmers do not need to encode program state when a routine is suspended, it has the disadvantage, when combined with multithreading, that all necessary data structures (to process any arbitrary message) must be collected before the entire message can be processed. In an internetworking environment, collecting all such structures can be difficult and subject to error.
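The restart-from-the-beginning pattern described above can be sketched in C roughly as follows. This is a minimal illustration and not the WAFL file system's actual code: the names restart_point, resource_ready, acquire, process_message and reset_resources are hypothetical, and the resource request is simulated as completing immediately, whereas a real system would queue the message and resume it only when signaled that the resource is available.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

#define NUM_RESOURCES 3 /* hypothetical: resources needed by one message */

static jmp_buf restart_point;              /* origin of the message routine */
static bool resource_ready[NUM_RESOURCES]; /* hypothetical resource table */
static int acquire_calls;                  /* counts work done, including redone work */

/* Mark every resource as unavailable and clear the work counter. */
static void reset_resources(void)
{
    for (int id = 0; id < NUM_RESOURCES; id++)
        resource_ready[id] = false;
    acquire_calls = 0;
}

/* Use a resource if it is ready; otherwise request it and abandon this attempt. */
static void acquire(int id)
{
    acquire_calls++;
    if (!resource_ready[id]) {
        /* Stand-in for steps (1) and (2): the request is simulated as
           completing at once, so the next attempt will find it ready. */
        resource_ready[id] = true;
        longjmp(restart_point, 1); /* step (3): return to the origin */
    }
}

/* Process one message; returns how many times processing was restarted. */
static int process_message(void)
{
    volatile int restarts = 0; /* volatile: modified between setjmp and longjmp */

    if (setjmp(restart_point) != 0)
        restarts++; /* reached again via longjmp: start over from the top */

    /* Ordinary procedural code; any acquire() may abandon the attempt. */
    for (int id = 0; id < NUM_RESOURCES; id++)
        acquire(id);

    assert(acquire_calls >= NUM_RESOURCES);
    return restarts;
}
```

In this sketch, when all three resources start out unavailable, process_message() restarts three times and calls acquire() nine times in total, although only three acquisitions are logically required; a second call, with every resource already available, completes with no restart. The repeated work on each restart illustrates the cost of the approach, while the absence of any saved per-step program state illustrates its convenience for the programmer.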
Accordingly, it would be advantageous to provide a technique for creating and using relatively low-overhead threads in a high-concurrency system, such as for a networked cache or file server, that is not subject to drawbacks of the known art.