1. Field of the Invention
This invention relates to a method and apparatus for manipulating a queue and, more particularly, to a method and apparatus for serializing a message queue in a multiprocessing environment without the use of a conventional lock or latch to control access to the message queue data structures.
2. Description of the Related Art
Digital computers generally have one or more user applications executing under the general supervisory control of an operating system (OS) kernel. Each user application, which may be running concurrently with other user applications, constitutes a separate process having its own address space and its share of system resources. Interprocess communication (IPC) mechanisms are a set of programming mechanisms that allow different processes to intercommunicate and coordinate their use of shared resources. One such mechanism is the semaphore, described in the commonly owned, copending application of applicant D. F. Ault et al., Ser. No. 09/040,722, filed Mar. 18, 1998, entitled “Method and Apparatus for Performing a Semaphore Operation” and incorporated herein by reference. Another such mechanism is the message queue, described in such standard works as W. R. Stevens, UNIX Network Programming (1990), pages 126–137, incorporated herein by reference.
Although the present invention is not limited to UNIX implementations, the UNIX standards define functions for creating a message queue (msgget), sending a message (msgsnd) and receiving a message (msgrcv). The following is a brief summary of these message queue functions:
    msgget( ): Requests that a message queue be defined. There are permission controls which allow the application to permit or prevent users from accessing the message queue.
    msgsnd( ): Send a message to a queue. The message consists of a TYPE and a message. The TYPE field is an integer which can also be thought of as a priority. TYPE=1 would be the highest priority. All sent messages are added to the end of the message queue, so that the queue is ordered oldest to newest.
    msgrcv( ): Receive a message. The caller specifies a TYPE as follows:
        TYPE=0: Receive the oldest or first element on the message queue.
        TYPE=n: Receive the first element on the message queue which has TYPE=n.
        TYPE=−n: Receive a message which has TYPE≦n, which has the lowest TYPE value. In other words, receive the highest priority message on the message queue with TYPE≦n.
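By way of illustration only, the TYPE selection rules summarized above can be modeled in a few lines of Python. This is a sketch of the selection semantics, not of any kernel implementation; the function name select_message and the representation of the queue as a list of (TYPE, text) pairs are assumptions made for this example.

```python
def select_message(queue, msgtyp):
    """Return the index of the message a receiver would get, or None.

    queue  -- list of (mtype, text) pairs, ordered oldest to newest
    msgtyp -- the TYPE argument supplied by the receiver
    (Both names are hypothetical; they model the rules, not a real API.)
    """
    if msgtyp == 0:
        # TYPE=0: the oldest element, whatever its type.
        return 0 if queue else None
    if msgtyp > 0:
        # TYPE=n: the oldest element whose type equals n.
        for i, (mtype, _) in enumerate(queue):
            if mtype == msgtyp:
                return i
        return None
    # TYPE=-n: among elements with type <= n, the lowest type wins;
    # the strict '<' keeps the oldest element when types tie.
    limit = -msgtyp
    best = None
    for i, (mtype, _) in enumerate(queue):
        if mtype <= limit and (best is None or mtype < queue[best][0]):
            best = i
    return best


queue = [(3, "c"), (1, "a"), (2, "b"), (1, "a2")]
print(select_message(queue, 0))    # index of the oldest message
print(select_message(queue, 2))    # index of the oldest TYPE=2 message
print(select_message(queue, -2))   # index of the highest-priority message with TYPE<=2
```

Note that every receive other than TYPE=0 may have to scan the queue, which is one reason the search is performed under serialization in the conventional flow.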
Multiple user processes can be concurrently sending messages to the queue as well as receiving messages. The operating system kernel is responsible for controlling the access to the message queue and maintaining the integrity of the data. Most operating systems provide this control by defining a lock or latch which is obtained for all send and receive operations. U.S. Pat. No. 5,313,638 to Ogle et al., entitled “Method Using Semaphores for Synchronizing Communication Between Programs or Processes Resident in a Computer System”, is one such implementation where the lock used is a semaphore.
The following flow shows how typical message queue operations are performed:
MSGSND: Send a message
    1. Obtain a lock to serialize the message queue. If the lock is not available, suspend the caller until the lock is available.
    2. Check if another task is waiting for a message in the msgrcv function. If there is a waiter, assign the message to that waiter and wake up the waiting task.
    3. If there are no waiters for the message, then add the message to the end of the message queue.
    4. Release the lock. This will wake up the next task waiting for the lock.
MSGRCV: Receive a message
    1. Obtain a lock to serialize the message queue. If the lock is not available, suspend the caller until the lock is available.
    2. Search the queue to locate a message which will satisfy the request. If a message is found, remove the message from the message queue and return the message to the caller. Release the lock. This will wake up the next task waiting for the lock.
    3. If no message is found, create a queue element which identifies this task as waiting for a message. Release the lock and suspend the task. This process will be woken up by the processing defined in step 2 under msgsnd.
    4. When the task is woken up, repeat at step 1.
In a system with hundreds or thousands of processes or threads requesting msgsnd and msgrcv against the same message queue, the lock requests can cause serious contention in the operating system and result in long response times or reduced transaction rates.
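The lock-based flow above can be sketched, for illustration, with Python threads standing in for tasks. The class and method names are hypothetical, and the sketch makes two simplifying assumptions: only TYPE=0 receives are modeled, and the sender always appends the message and then wakes a waiter, which re-searches the queue, rather than assigning the message to the waiter directly.

```python
import threading
from collections import deque


class LockedMessageQueue:
    """Toy model of a lock-serialized message queue (not a kernel design)."""

    def __init__(self):
        self._lock = threading.Lock()   # the one lock every caller contends for
        self._messages = deque()        # pending (mtype, text) pairs, oldest first
        self._waiters = deque()         # one Event per suspended receiver

    def send(self, mtype, text):
        with self._lock:                            # obtain the lock
            self._messages.append((mtype, text))    # add to the end of the queue
            if self._waiters:                       # a receiver is suspended:
                self._waiters.popleft().set()       # wake it so it can retry
        # the lock is released on leaving the with-block

    def receive(self):
        while True:
            with self._lock:                        # obtain the lock
                if self._messages:                  # search (TYPE=0 only)
                    return self._messages.popleft()
                wake = threading.Event()            # no message: register
                self._waiters.append(wake)          # this task as a waiter
            wake.wait()                             # suspend outside the lock
            # woken up: loop back and obtain the lock again


q = LockedMessageQueue()
receiver = threading.Thread(target=lambda: print(q.receive()))
receiver.start()
q.send(1, "hello")
receiver.join()
```

Every send and receive passes through the same lock, so with many concurrent callers the time spent waiting for that lock grows, which is the contention described above.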
Another problem with the current art relates to error recovery. From a general recovery perspective, the current art tends to use one of two models. In one model, the system first sets a footprint indicating that a recoverable action is to be taken (step 1), then performs the recoverable action (step 2). In the other model, the system first performs the recoverable action (step 1), then sets a footprint indicating that a recoverable action has been taken (step 2).
To make this example more relevant to this discussion, assume the recoverable action is to add or remove an element from a message queue. This can involve updating multiple pointers in queue elements and queue anchor pointers. If an error (e.g., a program check) occurs, recovery routines are passed control. The logic in the recovery routine for both of the above models is that if the footprint is set for a recoverable action, then the routine performs a recovery action against that resource.
This leads to the dilemma of what to do when the error occurs in between steps 1 and 2 (in either model) or when the error occurs in the middle of the recoverable action. In particular, when modifying linked chains, an incorrect recovery action can result in a damaged chain which will prevent any future processing.
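The dilemma can be made concrete with a small sketch of the first model (footprint first, then action) applied to appending an element to a singly linked queue. All names here are hypothetical, and the fail_at parameter is an assumption used only to simulate a program check at a chosen point.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class Queue:
    def __init__(self):
        self.head = None        # queue anchor: first element
        self.tail = None        # queue anchor: last element
        self.footprint = False  # "a recoverable action is in progress"


class SimulatedFault(Exception):
    """Stands in for a program check during the update."""


def append_footprint_first(q, node, fail_at=None):
    q.footprint = True                  # step 1: set the footprint
    if fail_at == "after_footprint":
        raise SimulatedFault
    if q.tail is None:
        q.head = node
    else:
        q.tail.next = node              # step 2a: chain the new element
    if fail_at == "mid_update":
        raise SimulatedFault
    q.tail = node                       # step 2b: update the anchor
    q.footprint = False


def chain(q):
    values, n = [], q.head
    while n is not None:
        values.append(n.value)
        n = n.next
    return values


def state_after_fault(fail_at):
    q = Queue()
    append_footprint_first(q, Node("a"))
    try:
        append_footprint_first(q, Node("b"), fail_at=fail_at)
    except SimulatedFault:
        pass
    return q.footprint, chain(q), q.tail.value


print(state_after_fault("after_footprint"))  # (True, ['a'], 'a')
print(state_after_fault("mid_update"))       # (True, ['a', 'b'], 'a')
```

In both cases the recovery routine sees the same footprint, yet in the first the chain is untouched and in the second the new element is chained but the tail anchor still points at the old element. A recovery action chosen from the footprint alone is correct for at most one of these states.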