This invention pertains generally to symmetric multi-processing, and more specifically to a method and system for serving in a thread-safe manner a request queue in a multi-processing environment.
Symmetric Multi-Processing (“SMP”) has become the de facto standard for multi-processor hardware architectures. Several highly popular Operating Systems (“OS”) incorporate support for SMP. A multitasking operating system divides the work that needs to be done among “processes,” giving each process memory, system resources, and at least one “thread” of execution, which is an executable unit within a process. While a “process” logically represents a job the operating system must do, a “thread” represents one of possibly many subtasks needed to accomplish the job. For example, if a user starts a database application program, the operating system will represent this invocation of the database as a single process. Now suppose the user requests the database application to generate and print a report. Rather than wait for the report to be generated, which is conceivably a lengthy operation, the user can enter another database query while this operation is in progress. The operating system represents each request (the report and the new database query) as a separate thread within the database process.
The use of threads to represent concurrent user requests extends to other areas of the operating system as well. For example, in a file server application that accepts requests from a number of different clients, there will typically be many incoming requests to the file server, such as read and write requests. At any given time during the operation of a computer system, there may be a large number of incoming requests to an application program, a server, or other processor of requests. An application program may process these requests by representing each incoming request as a thread of execution. The threads are provided by the operating system and can be scheduled for execution independently on the processor, which allows multiple operations to proceed concurrently.
Multitasking can cause contention for system resources that are shared by different programs and threads. Shared system resources comprise sets of data or physical devices. In order to resolve the contention for shared resources, the computer operating system must provide a mechanism for scheduling the execution of threads in an efficient and equitable manner, referred to as thread scheduling. In general, thread scheduling requires the operating system to keep track of the execution activity of the pool of threads that it provides to application programs for processing incoming user requests. The operating system also determines the order in which the threads are to execute, typically by assigning a priority level to each thread. The objective of the operating system is to schedule the threads in such a way that the processor is always as busy as possible and always executing the most appropriate thread. The efficiency with which threads are scheduled for execution on a processor distinguishes one operating system from another.
In multitasking operating systems, thread scheduling is more complex than simply selecting the order in which threads are to run. Periodically, a thread may stop executing while, for example, a slow I/O device completes a data transfer or while another thread is using a resource it needs. Because it would be inefficient to have the processor remain idle while the thread is waiting, a multitasking operating system will switch the processor's execution from one thread to another in order to take advantage of processor cycles that otherwise would be wasted. This procedure is referred to as “context switching.” When the I/O device completes its data transfer or when the resource that the thread needs becomes available, the OS will eventually perform another context switch back to the original thread. Because of the extraordinary speed of the processor, both of the threads appear to the user to execute at the same time.
Certain OSs, such as the “WINDOWS NT” OS, schedule threads on a processor by “preemptive multitasking,” i.e., the OS does not wait for a thread to voluntarily yield the processor to other threads. Instead, each thread is assigned a priority that can change depending on requests by the thread itself or because of interactions with peripherals or with the user. Thus, the highest priority thread that is ready to run will execute processor instructions first. The operating system may interrupt, or preempt, a thread when a higher-priority thread becomes ready to run, or after the thread has run for a preset amount of time. Preemption thus prevents one thread from monopolizing the processor and allows other threads their fair share of execution time. Two threads that have the same priority will share the processor, and the OS will perform context switches between the two threads in order to allow both of them access to the processor.
Because of the multiprocessing capabilities of current OSs, there is an elevated need for SMP-aware software. One such application for SMP-aware software is the control and service of a print queue. The basic abstraction of an SMP system is a Multi-Threaded Environment (“MTE”). The MTE abstraction is provided by the OS as described above, without regard to the actual number of processors running. Therefore, when software is written to make use of an MTE, one can achieve a performance improvement whether or not the SMP hardware platform contains multiple processors.
The basic MTE entity is the thread. Threads are independent units or paths of execution that operate in a Virtual Memory Address Space (“VMAS”). The contents of the VMAS are specific to processes. Different processes generally have different VMASs (with the exception of shared memory between processes, where memory is mapped to the same virtual address in more than one process), while different threads of the same process share the VMAS of that process.
In order for MTE software to run successfully, it must synchronize the access of individual threads to shared data. Generally, this synchronization is accomplished through Synchronization Objects (“SO”) maintained by the MTE. These SOs guarantee that only a predetermined number of threads can access a shared resource at a time, while all others are blocked. The number of threads that run simultaneously depends on the number of processors on the SMP platform. Blocking is a mechanism for temporarily suspending a thread from execution. During the scheduling operation, individual threads in potentially different processes have an opportunity to run either for a period of time or until they are blocked. If a thread is blocked, it will not be scheduled to run. Once the thread returns to an unblocked state, it will be scheduled to run. This type of synchronization is known as blocking synchronization, and it is achieved through software implementation.
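As a minimal sketch of blocking synchronization, the following Python fragment uses a counting semaphore as the synchronization object. The function name run_demo, the limit of two simultaneous holders, and the short sleep that stands in for work on the shared resource are assumptions made for illustration only; they do not come from any particular operating system or from the system described here.

```python
import threading
import time


def run_demo(max_concurrent=2, num_threads=6):
    """Return the peak number of threads inside the guarded region."""
    gate = threading.Semaphore(max_concurrent)  # the synchronization object
    stats_lock = threading.Lock()               # protects the counters below
    stats = {"active": 0, "peak": 0}

    def use_shared_resource():
        # acquire() blocks the calling thread while max_concurrent
        # other threads already hold the semaphore.
        with gate:
            with stats_lock:
                stats["active"] += 1
                stats["peak"] = max(stats["peak"], stats["active"])
            time.sleep(0.01)  # stand-in for work on the shared resource
            with stats_lock:
                stats["active"] -= 1

    threads = [threading.Thread(target=use_shared_resource)
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["peak"]


if __name__ == "__main__":
    print("peak concurrent holders:", run_demo())
```

However many threads compete, the peak count never exceeds the semaphore's initial value; the remaining threads sit in the blocked state until a holder releases the semaphore.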
An alternative form of synchronization, known as non-blocking synchronization, is controlled by what are known as atomic operations. These are operations that complete before any other processor or hardware resource is given a chance to interact with the system. Typically, these operations are implemented as individual processor instructions. Whenever an individual processor executes an atomic instruction, all other processors are blocked from accessing memory or other hardware resources where such access could interfere with the atomic operation in progress. In this manner, synchronization is achieved through hardware implementation. During blocking synchronization, the thread state is changed from “running” to “blocked” and vice versa. During non-blocking synchronization, however, no state change is required. Consequently, non-blocking synchronization is generally orders of magnitude faster than blocking synchronization.
Client-server architecture is frequently used in today's computer systems. Often, client-server architecture can be represented by a one-to-many relationship between servers and clients in a network, where one server is expected to respond to requests issued from the many clients. Most intelligent Digital Imaging Devices (“DID”) contain an internal device controller, which is a fully functional computer with the necessary hardware and software to ensure proper operation of the DID. Generally, the DID in this architecture acts as a server and the user desktops act as clients.
In order to process requests from clients efficiently, servers use Request Queues (“RQ”). RQs are data structures that hold requests sent from a client to the server. Such requests are suitably simple requests, such as requests to retrieve server status, or complex requests, such as requests to print a plurality of documents. In order to ensure maximum server availability to fulfill such requests, servers enqueue requests on the RQ and process them as server resources become available. This allows a server to acknowledge requests more quickly. Generally, a server maintains two pools of threads: one for enqueueing incoming requests and one for processing the requests from the queue. In between these pools is the RQ, which serves as intermediate storage for the incoming requests until a thread of the dequeueing pool becomes available. The threads from the dequeueing pool usually process the requests as well, although that is not necessarily the case. Again, when this MTE is deployed on SMP hardware, some of the threads will actually run in parallel and hence improve performance.
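The two-pool arrangement can be sketched in Python as follows. This is an illustrative outline only: the function serve, the pool sizes, the sentinel object STOP, and the "done:" result strings are hypothetical, and the standard library's queue.Queue (which performs its own internal locking) stands in for the RQ.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel telling a dequeueing thread to exit


def serve(requests, num_enqueuers=2, num_dequeuers=3):
    """Pass requests through an RQ between two thread pools."""
    incoming = queue.Queue()  # stands in for requests arriving from clients
    for request in requests:
        incoming.put(request)

    rq = queue.Queue()        # the request queue between the two pools
    results = []
    results_lock = threading.Lock()

    def enqueuer():
        # Enqueueing pool: accept a request and place it on the RQ.
        while True:
            try:
                request = incoming.get_nowait()
            except queue.Empty:
                return
            rq.put(request)

    def dequeuer():
        # Dequeueing pool: take a request off the RQ and process it.
        while True:
            request = rq.get()  # blocks while the RQ is empty
            if request is STOP:
                return
            with results_lock:
                results.append("done:" + request)

    consumers = [threading.Thread(target=dequeuer) for _ in range(num_dequeuers)]
    producers = [threading.Thread(target=enqueuer) for _ in range(num_enqueuers)]
    for t in consumers + producers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        rq.put(STOP)  # one sentinel per dequeueing thread
    for t in consumers:
        t.join()
    return results


if __name__ == "__main__":
    print(sorted(serve(["status", "print A", "print B"])))
```

Because the dequeueing threads run concurrently, the order of the results is not guaranteed; only the set of processed requests is.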
When the number of threads in both pools increases, the amount of contention for the RQ also increases. When the goal is to provide a highly available server and to lower request-processing times, the exact method of queue implementation and thread pool maintenance makes a significant difference in performance. Furthermore, when dealing with multi-processing techniques, it is important that data be “thread-safe,” or protected from simultaneous modification by different threads. One method of preventing such unwanted interactions is to use a semaphore technique, as is known in the art.
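One conventional form of the semaphore technique is the textbook producer-consumer arrangement, sketched below in Python. The class SemaphoreQueue and the helper stress are hypothetical names introduced for illustration; the sketch shows a generic blocking design, not a specific implementation from the art referred to above.

```python
import threading
from collections import deque


class SemaphoreQueue:
    """Request queue made thread-safe with two semaphores."""

    def __init__(self):
        self._items = deque()
        self._mutex = threading.Semaphore(1)      # one thread may modify at a time
        self._available = threading.Semaphore(0)  # counts queued requests

    def enqueue(self, request):
        with self._mutex:
            self._items.append(request)
        self._available.release()  # wake one waiting dequeuer, if any

    def dequeue(self):
        self._available.acquire()  # blocks while the queue is empty
        with self._mutex:
            return self._items.popleft()


def stress(producers=4, per_producer=50, consumers=4):
    """Move integers through the queue concurrently; return them sorted."""
    rq = SemaphoreQueue()
    taken = []
    taken_lock = threading.Lock()
    per_consumer = producers * per_producer // consumers

    def produce(base):
        for i in range(per_producer):
            rq.enqueue(base + i)

    def consume():
        for _ in range(per_consumer):
            item = rq.dequeue()
            with taken_lock:
                taken.append(item)

    threads = [threading.Thread(target=produce, args=(p * per_producer,))
               for p in range(producers)]
    threads += [threading.Thread(target=consume) for _ in range(consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(taken)


if __name__ == "__main__":
    print(len(stress()), "requests passed through the queue")
```

Every enqueue and dequeue passes through the same binary semaphore, which illustrates why contention for the RQ grows as threads are added to either pool.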