1. Field of the Invention
This invention relates to a method and apparatus for performing an operation on a semaphore and, more particularly, to a method and apparatus for performing an operation on a binary semaphore.
2. Description of the Related Art
In the world of computing, there is a well-known serialization mechanism, known as a semaphore, used to serialize access to a shared resource. A semaphore is an integer that can vary between zero and some predetermined positive number n, where n is the number of requesters allowed simultaneous use of the resource. In the case of a binary semaphore, n=1 and only one requester can use the resource at a time. Initially the semaphore value is set at n. To obtain access to a resource, a requester (e.g., a unit of work such as a process or a thread) tests the current value of the semaphore. If the current value is greater than zero, the requester decrements the current value by one and proceeds to use the resource. If the current value is zero, on the other hand, the requester is suspended (i.e., goes to sleep) and is put in a queue as a waiter for the semaphore. To release a resource, the requester increments the semaphore value by one. In addition, if there are any waiters in the queue for the semaphore, one of the waiters is posted, whereupon it again attempts to obtain the resource by testing the semaphore as it did initially. Semaphores were first proposed by E. W. Dijkstra in 1965 and are described in such publications as A. S. Tanenbaum, Modern Operating Systems (1982), pages 41-43; J. P. Hayes, Computer Architecture and Organization (1988), page 540; and W. R. Stevens, UNIX Network Programming (1990), pages 137-152, incorporated herein by reference.
There are two main types of semaphores: counting semaphores (n&gt;1) and binary semaphores (n=1). This specification will describe the problems inherent in many semaphore implementations and then describe a high-performance semaphore implementation of the present invention. Although the invention is not limited to a binary semaphore, the description will be primarily in terms of such a semaphore.
FIG. 1 shows a C language example of a semaphore set. A semaphore set is initially created when the C program calls the semget function. (The description in terms of the C source language is for convenience of explanation. The code that actually executes the steps is of course the object code produced by compiling the source code.) The caller defines how many semaphores 120 are in a semaphore set 115. At the time a semaphore set is created, a semid (semaphore id) is returned to the caller. This semid is then used on each subsequent call to the semop function. The operating system keeps track of the semaphore sets in a table 105, where each entry 110 and associated semaphore set 115 are kept track of.
FIG. 2 provides a general description of how semaphore processing works conventionally. The figure shows process 1 (205) and process 2 (210) contending for the same semaphore. The general behavior of binary semaphores is that the semaphore value is initialized to 1 (available). When an application wants to obtain the semaphore, it does a semop(-1) to decrement the value of the semaphore. The main rule of semaphores is that the semaphore value is not allowed to go negative. So the first caller of semop to decrement the semaphore value to 0 owns the semaphore. Any subsequent request to decrement the semaphore will result in the caller being suspended.
So process 1205 calls semop(-1) (step 220). This gets into the kernel 215 for semaphore processing. The kernel looks at the state of the semaphore and sees that it is currently at a value of 1. The kernel decrements the semaphore value, changing it from 1 to 0, thus granting access to process 1 (step 222). The kernel then returns to process 1 (step 224). Process 1 now is able to access whatever critical resource is protected by the semaphore (step 226). This resource may be memory, a file on DASD or just a critical section of code that cannot execute concurrently in multiple processes.
While process 1 owns the semaphore, process 2 (210) also requests a semop(-1) against the same semaphore (step 228). The kernel cannot perform the -1, since the value is currently 0 and cannot go negative. So the kernel suspends the caller (process 2) (step 230).
Eventually, process 1 finishes accessing the critical resource and performs a semop(+1) (step 232). The kernel modifies the semaphore value from 0 to 1, which causes the suspended process 2 to be resumed (step 234). The kernel running on behalf of process 2 then modifies the semaphore value from 1 to 0 (step 236), thus granting ownership of the semaphore to process 2. Process 2 then can access the critical resource (step 240). After posting the first waiter, control is returned to the caller (step 238). When process 2 is done accessing the resource, it issues a semop(+1) to free up the semaphore (step 242). The kernel modifies the semaphore value from 0 to 1, making it available. Since there are no waiters for the semaphore, control is returned to the caller (step 244).
This was a logical explanation of how semaphore processing operates. When running in a symmetric multiprocessor (SMP) environment, processes running on different central processing units (CPUs) can be concurrently attempting to obtain or free the semaphore. FIG. 3 shows the complications that can occur with a conventional semaphore implementation.
In order for the kernel semaphore logic to atomically update the semaphore value and maintain the wait queue, it is necessary for the kernel to obtain some form of lock or mutex. There are many forms this can take, and it is not unique to any particular platform. When the kernel owns this lock, other callers of semop will be suspended waiting for this lock. This lock is very similar to the concept of semaphores.
In FIG. 3, the first block of pseudocode describes the processing in semop to perform a semop(-1) or an obtain of the binary semaphore (step 310). The first step is to obtain a lock to serialize the semaphore structures (step 312). If this lock is not available, the system suspends the calling thread and places it on a lock wait queue 360 by creating a queue element 362 to represent the process. When this lock becomes available, this process is resumed and can continue past step 312 knowing that the semaphore structures will not change while it is processing. If the semaphore is available (step 314), the kernel grants the semaphore to the caller by modifying the semaphore value (step 316). Then the kernel releases the internal lock (step 318), which may cause other processes that are on the wait queue 360 to be resumed. Control then is returned to the caller (step 319).
If the semaphore value is not available (step 320), then the kernel releases the internal lock (step 322) and suspends the caller by adding a semaphore waiter element 372 to the semaphore wait queue 370 (step 324). When this process is resumed (step 326), it starts over at step 312 (step 328).
When a process calls semop(+1) to release a binary semaphore (step 340), the kernel first obtains the internal lock to serialize access to the semaphore structures (step 342). If this lock is not available, the system suspends the calling thread and places it on the lock wait queue 360 by creating a queue element 362 to represent the process. When this lock becomes available, this process is resumed and can continue past step 342 knowing that the semaphore structures will not change while it is processing. The kernel then modifies the semaphore value by incrementing it by 1 (step 344). If there are any waiters 372 on the semaphore queue 370, then the first element 372 is removed and the process that it represents is resumed (step 346). Then the kernel releases the internal lock (step 348), which may cause other processes that are on the wait queue 360 to be resumed. Control is returned to the caller (step 350).
Now consider a pathological example. Say 500 processes concurrently request a semaphore. One process successfully obtains the internal lock and 499 processes are suspended and added to the wait queue 360 for the internal lock. Then the first process releases the internal lock and one by one, each of the processes obtains the lock, only to find the semaphore unavailable (at step 320) and be suspended a second time, this time on the semaphore wait queue 370. Now in the middle of this processing of the 499 processes, the first process is ready to release the semaphore. Since releasing the semaphore also requires the same internal lock (at step 342), this process is put at the end of the wait queue 360 for the internal lock. This guarantees that each process wakes up simply to get moved from the internal lock wait queue 360 to the semaphore wait queue 370.
To put this in the form of an analogy, imagine a line of people waiting to buy a ticket to a show. As each person gets to the head of the line, they are told there that are no tickets available, but that they can wait on line until someone cancels. Along comes a person who wants to cancel their ticket, but must wait on the same line for all the people who want to buy one. Now repeat this over and over and you have a lot of senseless waiting. In a computer system, this shows up as bad performance and limited capacity.