In a typical computer network, a server computer 10 services requests from, and provides data to, client computers 12, 14, 16, 18 that are connected to the server computer 10 over a network (e.g., the Internet) 20. The server computer 10 may be used to store data, programs, etc. for use by the client computers 12, 14, 16, 18. Those skilled in the art will recognize that the server computer 10 may also be used to manage and control the client computers 12, 14, 16, 18.
In one example, an internet-based business may operate a server (or servers) to handle customer requests initiated at computer terminals located potentially thousands of miles away from the business's server(s). Each time a customer accesses the website of the business, an independent set of software instructions, i.e., a “thread,” is executed by a processor of the business's server(s) to provide requested data to the customer.
In order to better meet increased networking demands, the “server” side of the client-server model shown in FIG. 1 may be implemented using any one of a variety of designs. For example, in FIG. 2, a “server” side of a computer network is implemented using servers 30, 32 that each have a single processor 34, 36, respectively. Each single processor 34, 36 is capable of executing one thread at a time. Thus, the “server” side in FIG. 2 is capable of executing two threads at a time for the client computers 38, 40, 42, 44 connected to the servers 30, 32 over network 46. If a third thread is initiated while each of the processors 34, 36 is executing a thread, one of the threads being executed by the processors 34, 36 may be blocked in order to allow the third thread to be executed (dependent on, for example, priority of the third thread). Alternatively, the third thread may be forced to wait until one of the processors 34, 36 completes executing its respective thread.
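By way of illustration only, the handling of a third thread in the arrangement of FIG. 2 may be sketched as follows. The sketch assumes a simple rule in which a newly initiated thread displaces the lowest-priority running thread only if the new thread has a strictly higher priority. The names `Thread`, `Server`, and `submit`, and the convention that a larger number denotes a higher priority, are hypothetical and are not taken from any particular operating system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thread:
    name: str
    priority: int  # assumed convention: larger value = higher priority


@dataclass
class Server:
    # One slot per single-threaded processor; None means the processor is idle.
    running: List[Optional[Thread]]
    waiting: List[Thread] = field(default_factory=list)

    def submit(self, thread: Thread) -> str:
        # Use an idle processor if one exists.
        for i, current in enumerate(self.running):
            if current is None:
                self.running[i] = thread
                return f"{thread.name} runs on processor {i}"
        # All processors are busy: find the lowest-priority running thread.
        i = min(range(len(self.running)),
                key=lambda k: self.running[k].priority)
        victim = self.running[i]
        if thread.priority > victim.priority:
            # The running thread is blocked so the new thread can execute.
            self.running[i] = thread
            self.waiting.append(victim)
            return f"{thread.name} blocks {victim.name} on processor {i}"
        # Otherwise the new thread waits for a processor to become free.
        self.waiting.append(thread)
        return f"{thread.name} waits"


if __name__ == "__main__":
    server = Server(running=[None, None])  # two processors, as in FIG. 2
    for t in (Thread("A", 1), Thread("B", 2), Thread("C", 3), Thread("D", 0)):
        print(server.submit(t))
```

In this sketch, a high-priority third thread blocks one of the two running threads, whereas a low-priority thread is placed in the waiting list, corresponding to the two alternatives described above.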
In another type of design, for example, as shown in FIG. 3, a “server” side of a computer network is implemented using a server 50 that has a multithreaded processor 52. The multithreaded processor 52 maintains the execution context of a plurality of threads. Thus, if the multithreaded processor 52 supports the execution of x threads at a time, the multithreaded processor 52 may rapidly switch between x threads for the client computers 54, 56, 58, 60 connected to the server 50 over network 62. When a thread being executed by the multithreaded processor 52 stalls due to, for example, waiting for data from memory, the multithreaded processor 52 may rapidly switch to another thread and execute instructions from that thread.
In another type of design, for example, as shown in FIG. 4, a “server” side of a computer network is implemented using a multiprocessor server 70. The multiprocessor server 70 has a plurality of processors 72, 74, 76, 78 that are each capable of executing one thread at a time. Thus, in FIG. 4, the multiprocessor server 70 is capable of executing four threads in parallel for the client computers 80, 82, 84, 86 connected to the multiprocessor server 70 over network 88. Those skilled in the art will recognize that a symmetric multiprocessing (SMP) system is a type of multiprocessing system in which multiple threads may be executed in parallel. Although typical SMP processors only process one thread at a time, the greater number of processors in the SMP system relative to that of a non-multiprocessing system increases the number of threads that are executable in a given period of time.
In another type of design, for example, as shown in FIG. 5, a “server” side of a computer network is implemented using a multiprocessor server 90 that has a plurality of multithreaded processors 92, 94, 96, 98. Thus, if each of the four multithreaded processors 92, 94, 96, 98 is capable of executing x threads at a time, the multiprocessor server 90 is capable of executing 4x threads at a given time for the client computers 100, 102, 104, 106 connected to the multiprocessor server 90 over network 108.
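By way of illustration only, the thread capacities of the designs of FIGS. 2-5 may be summarized as the number of processors multiplied by the number of thread contexts each processor supports. The function name `hardware_threads` and its parameters are hypothetical, and the value chosen for x in the usage lines is an arbitrary example.

```python
def hardware_threads(processors: int, contexts_per_processor: int = 1) -> int:
    """Number of threads the "server" side can hold in execution at once.

    A single-threaded processor contributes one context; a multithreaded
    processor contributes x contexts.
    """
    if processors < 1 or contexts_per_processor < 1:
        raise ValueError("counts must be positive")
    return processors * contexts_per_processor


if __name__ == "__main__":
    x = 8  # arbitrary example value for the per-processor thread count
    print(hardware_threads(2))      # FIG. 2: two single-threaded processors
    print(hardware_threads(1, x))   # FIG. 3: one multithreaded processor
    print(hardware_threads(4))      # FIG. 4: four single-threaded processors
    print(hardware_threads(4, x))   # FIG. 5: four multithreaded processors
```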
The execution of a software thread in any one of the types of processors described above with reference to FIGS. 2-5 occurs in a part of the processor known as the “core” (referred to and known in the art as the “processing core”). The processing core is formed of a hardware execution pipeline and functional units (e.g., arithmetic units and load/store units) that actually perform the execution of a software thread.
In the case of a multithreaded processor as described above with reference to FIGS. 3 and 5, when there are more threads ready to run than there are execution contexts available, a scheduler, typically part of the operating system, selectively assigns some number of threads to a processing core of the multithreaded processor. Such a multithreaded processing core interleaves execution of instructions from multiple threads, potentially switching between contexts (i.e., switching between threads) on each cycle. A thread may become blocked when the thread encounters a long-latency operation, such as, for example, servicing a cache memory miss. When one or more threads are unavailable, the multithreaded processing core continues to switch among the remaining available threads. Those skilled in the art will recognize that for multithreaded workloads, such multithreading improves processor utilization and hides the latency of long operations.
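By way of illustration only, the interleaving behavior described above may be sketched as a cycle-by-cycle simulation. The sketch assumes that one operation issues per cycle, that threads are selected in round-robin order, and that a cache miss blocks its thread for a fixed number of cycles. The function `interleave`, the operation labels "op" and "miss", and the parameter `miss_latency` are hypothetical and do not model any particular processor.

```python
from typing import Dict, List


def interleave(programs: Dict[str, List[str]], miss_latency: int = 3) -> List[str]:
    """Return, for each cycle, the name of the thread that issued (or "idle").

    "op" completes in one cycle; "miss" issues and then blocks its thread
    for `miss_latency` further cycles (an assumed fixed latency).
    """
    names = list(programs)
    pc = {n: 0 for n in names}         # next operation index per thread
    ready_at = {n: 0 for n in names}   # first cycle the thread may issue again
    trace: List[str] = []
    cycle = 0
    last = -1                          # index of the thread that issued last
    while any(pc[n] < len(programs[n]) for n in names):
        issued = None
        # Round-robin: start searching just after the last issuing thread.
        for step in range(1, len(names) + 1):
            idx = (last + step) % len(names)
            n = names[idx]
            if pc[n] < len(programs[n]) and ready_at[n] <= cycle:
                issued = n
                last = idx
                break
        if issued is None:
            trace.append("idle")       # every unfinished thread is blocked
        else:
            op = programs[issued][pc[issued]]
            pc[issued] += 1
            if op == "miss":
                ready_at[issued] = cycle + 1 + miss_latency
            trace.append(issued)
        cycle += 1
    return trace


if __name__ == "__main__":
    work = {"T0": ["op", "miss", "op"], "T1": ["op", "op", "op"]}
    print(interleave(work))
```

In this sketch, while thread T0 is blocked on its miss, the core continues to issue operations from T1, so part of the miss latency is overlapped with useful work; the core idles only when no unfinished thread is available.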