1. Field of the Invention
This invention relates generally to computing systems, and, more particularly, to a method and apparatus for ensuring fairness in acquisition of a limited resource, such as a spinlock in a multiprocessor environment.
2. Description of the Related Art
In modern computer systems in general, and particularly for computers in the server class, which often have multiple processors, it is common practice to have an operating system that is multi-threaded, and often multi-user as well. With multiple processes running concurrently, contention for system resources occurs, with two or more processes or threads attempting to control the same system resource.
Turning to FIG. 1, a block diagram of a prior art computer system 100 is illustrated. The computer system 100 includes a plurality of system building blocks 101, shown as building blocks 101A, 101B, and 101C. Each system building block 101, similar to system building block 101A, as shown, couples to a network 180 through a port 125, and includes a plurality of processors 105, shown as processors 105A, 105B, 105C, and 105D, a memory 115A, input/output resources (I/O) 120A, and the port 125A for coupling the plurality of system building blocks 101.
Note that the memory 115 may include resources such as random access memory (RAM), read only memory (ROM), flash memory, or other types of memory, otherwise referred to as primary storage in computer systems. The I/O resources 120 may include resources such as disk storage, disk drives, or storage arrays, such as are known in the art including magnetic or optical storage, otherwise referred to as secondary storage. Other I/O resources 120 may include connections to input devices, including keyboards, pointing devices, and other interfaces or devices for providing data to the computer system 100, as well as output devices, including monitors, printers, or other interfaces or devices known for retrieving data from the computer system 100. It is further noted that the system building blocks 101B and 101C are not required by every embodiment of the present invention to be present, identical, or similar to system building block 101A.
As an introduction to contention in the computer system 100, reference is made to the prior art flowchart shown in FIG. 2, illustrating a method 200 of using interrupt levels and interrupting processes in the computer system 100. During the operation of the computer system 100, considering the independent operation of only a single processor 105A, the computer system 100 operates at a given interrupt priority level (IPL) (block 205). The following discussion applies to a single processor computer system or a multiple processor computer system with only one processor operating. For the sake of illustration, consider thirty-two (32) different IPLs, designated as IPL0-IPL31, with IPL31 being the highest and IPL0 the lowest, such as may be found in the VMS operating system running on the ALPHA architecture. “High” and “low” refer to which IPL takes precedence over another, with operations at a lower numbered IPL being suspended or interrupted by a request at a higher numbered IPL. As examples of typical IPL assignments, most user processes run at IPL0 and context coherent processes operate at IPL8 or higher. Below IPL3, a process may be freely and dynamically reassigned by the computer system 100 from one processor, such as processor 105A, to another processor, such as processor 105C.
While handling an interrupt request, the computer system 100 will determine periodically if a request for a higher numbered IPL has occurred (decision block 210). This periodic determination is usually performed at a time increment that is known as a “polling interval.” Some computer systems rely on a hardware control line assertion. If a request for a higher numbered IPL has occurred, then the current operations of the computer system 100 are interrupted, and the computer system 100 begins operating at the newer, higher numbered IPL (block 225). The request for the higher numbered IPL may occur as a control line changes state. In other computer systems, such as those running a real time operating system, the computer system becomes physically interrupted. The method then shows that the computer system 100 returns to operating at the given IPL (block 205), such as after handling the request for the higher numbered IPL.
If a request for a higher numbered IPL has not occurred, then the computer system 100 determines if the operations at the current IPL have completed (decision block 215). If the operations at the current IPL have not completed, then the method shows the computer system 100 returning to operating at the given IPL (block 205). If the operations at the current IPL have completed, then the computer system 100 drops to a lower IPL (block 220). The method then shows the computer system 100 returning to operating at the given (the new, lower numbered) IPL (block 205).
When the computer system 100 drops from a higher numbered IPL to a lower numbered IPL, any previously interrupted process at the lower numbered IPL is restarted and completed, unless the previously interrupted process is again interrupted by a higher IPL process.
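By way of illustration only, the flow of method 200 on a single processor may be modeled with a short simulation. The following is a minimal sketch under stated assumptions, not a description of any actual operating system: the function name `run_ipl_model`, the tick-based timing, and the format of the `requests` argument are all hypothetical and chosen for this example.

```python
def run_ipl_model(base_ipl, base_work, requests):
    """Simulate one processor following the IPL flow of method 200.

    base_ipl  -- IPL at which the processor starts (0..31)
    base_work -- ticks of work pending at base_ipl (at least 1)
    requests  -- dict mapping arrival tick -> (ipl, ticks_of_work);
                 a hypothetical format used only in this sketch
    Returns a list with the IPL in effect at each tick (None when idle).
    """
    stack = [[base_ipl, base_work]]   # interrupted work; top is the current IPL
    pending = []                      # requests not yet above the current IPL
    trace = []
    tick = 0
    last_arrival = max(requests, default=-1)
    while stack or pending or tick <= last_arrival:
        if tick in requests:
            pending.append(list(requests[tick]))
        # Decision block 210: has a request for a higher numbered IPL occurred?
        current = stack[-1][0] if stack else -1
        higher = [r for r in pending if r[0] > current]
        if higher:
            req = max(higher, key=lambda r: r[0])
            pending.remove(req)
            stack.append(req)         # block 225: operate at the higher IPL
        if stack:
            stack[-1][1] -= 1         # block 205: operate at the given IPL
            trace.append(stack[-1][0])
            # Decision block 215 / block 220: finished, so drop to a lower IPL
            if stack[-1][1] <= 0:
                stack.pop()
        else:
            trace.append(None)        # nothing to run at this tick
        tick += 1
    return trace


# A user process at IPL0 is interrupted at tick 1 by an IPL8 request,
# which is itself interrupted at tick 2 by an IPL20 request.
print(run_ipl_model(0, 4, {1: (8, 3), 2: (20, 1)}))
```

In this model the interrupted IPL8 work resumes once the IPL20 request completes, and the IPL0 work resumes only after the IPL8 work completes, mirroring the restart behavior described above.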
While the use of IPLs is sufficient for the computer system 100 that includes only a single processor 105A, the cooperation of the second, third, or nth processor 105 in the computer system 100 requires that an additional locking mechanism be used so that processors 105A and 105B operating at the same IPL do not both attempt to use the same resource at the same time.
One mechanism commonly used in multiprocessor computer systems such as the computer system 100 is a “spinlock.” Described simply, the spinlock is a synchronization element associated with a given system resource that may be requested by more than one processor 105 concurrently. In one form, the spinlock includes two quadwords (8 bytes each) stored in a register, memory location, or a cache. Vying for the spinlock is often referred to as “jousting,” and the spinlock is said to be obtained (or acquired) by the processor 105 that wins the “joust.” Jousting often involves writing a particular bit in the first quadword of the spinlock. The other quadword holds an address associated with the resource, as is known in the art, and will be ignored for the purposes of this disclosure. The spinlock also allows each processor 105A-105N to operate independently with respect to its own IPL, as no processor 105A-105N needs to know the IPL of any other processor 105.
Note that spinlocks may be static, with a known priority level, meaning that they must be obtained in a certain order, or dynamic. Dynamic spinlocks have no inherent relationship to one another. Additional data items may also be associated with a given spinlock.
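For illustration, the two-quadword form of the spinlock and the act of jousting may be modeled as follows. This is a simplified sketch, not an actual spinlock implementation: the class name `SpinlockModel`, the choice of bit 0 as the ownership bit, and the little-endian layout are assumptions, and a `threading.Lock` merely stands in for the atomic hardware instruction that a real processor would use.

```python
import struct
import threading

LOCK_BIT = 1 << 0   # assumption: the text does not say which bit is written


class SpinlockModel:
    """Model of a spinlock made of two 8-byte quadwords."""

    def __init__(self, resource_addr):
        self.state = 0                      # first quadword: holds the lock bit
        self.resource_addr = resource_addr  # second quadword: resource address
        self._atomic = threading.Lock()     # stands in for an atomic instruction

    def joust(self):
        """Atomically test and set the lock bit; return True if the caller won."""
        with self._atomic:
            if self.state & LOCK_BIT:
                return False                # another processor already holds it
            self.state |= LOCK_BIT
            return True

    def release(self):
        """Clear the lock bit so that another processor may win a joust."""
        with self._atomic:
            self.state &= ~LOCK_BIT

    def pack(self):
        """Return the 16-byte image of the two quadwords (assumed little-endian)."""
        return struct.pack("<QQ", self.state, self.resource_addr)


lock = SpinlockModel(resource_addr=0x1000)
print(lock.joust())      # first contender wins the joust
print(lock.joust())      # second contender loses while the bit is set
print(len(lock.pack()))  # two quadwords of 8 bytes each
```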
Turning to FIG. 3, a prior art flowchart of a method 250 of operating the computer system 100 using a spinlock to access a particular shared resource is briefly illustrated. One or more processors 105A-105N attempt to grab the spinlock for the particular shared resource (block 252). Each processor 105N evaluates its own success in obtaining the spinlock (decision block 254). If the spinlock is not obtained, the processor 105N enters spinwait (block 256). Spinwait may include waiting a predetermined period of time, referred to herein as a “timed wait interval.” During spinwait, other pending operations of the processor 105N may be either suspended or processed while the processor 105N waits for the spinlock for the particular shared resource. Upon leaving spinwait, the processor 105N again attempts to grab the spinlock (block 252).
If the spinlock is obtained, the processor 105N continues the operations that led to obtaining the spinlock (block 258). If the operations are not finished (decision block 260), then the processor 105N continues the operations (block 258). When the operations are finished, the method 250 ends.
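The loop of method 250 may be sketched as follows, with threads standing in for the processors 105A-105N. This is an illustrative model only. The helper names `with_spinlock` and `demo` are hypothetical, a non-blocking `threading.Lock.acquire` stands in for the attempt to grab the spinlock, and the release of the spinlock after the operations finish is an assumption of the sketch, since the flowchart as described simply ends.

```python
import threading
import time


def with_spinlock(flag, work, wait_interval=0.0001):
    """Grab the spinlock, run work(), then release (method 250, simplified)."""
    # Block 252 and decision block 254: attempt the grab and check for success.
    while not flag.acquire(blocking=False):
        # Block 256: spinwait for a timed wait interval, then try again.
        time.sleep(wait_interval)
    try:
        # Block 258 and decision block 260: run the operations to completion.
        return work()
    finally:
        flag.release()   # assumed step: relinquish the spinlock afterwards


def demo(n_threads=4, iterations=200):
    """Have several threads update one shared counter under the spinlock."""
    flag = threading.Lock()
    total = [0]

    def bump():
        total[0] += 1    # the shared resource protected by the spinlock

    def worker():
        for _ in range(iterations):
            with_spinlock(flag, bump)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


print(demo())   # every update is applied exactly once
```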
Note that a given processor 105N may obtain the spinlock for the particular shared resource multiple times in succession, leading to a nested spinlock state. The given processor 105N must then relinquish the spinlock for the particular shared resource a number of times equal to the depth of the recursion before another processor, such as processor 105A, may grab the spinlock for the particular shared resource.
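The nested spinlock state may be illustrated with an owner field and a depth counter. The sketch below is a simplified model under stated assumptions: the class name `NestedSpinlockModel`, the `owner` and `depth` fields, and the use of string labels for processors are hypothetical, and a `threading.Lock` again stands in for hardware atomicity.

```python
import threading


class NestedSpinlockModel:
    """Spinlock that the same processor may obtain several times in succession."""

    def __init__(self):
        self.owner = None                 # label of the holding processor, if any
        self.depth = 0                    # how many times the owner has obtained it
        self._atomic = threading.Lock()   # stands in for an atomic instruction

    def try_acquire(self, cpu):
        """Return True if cpu obtains the spinlock (first time or nested)."""
        with self._atomic:
            if self.owner is None or self.owner == cpu:
                self.owner = cpu
                self.depth += 1
                return True
            return False

    def release(self, cpu):
        """Relinquish one level; the spinlock is free only when depth reaches 0."""
        with self._atomic:
            if self.owner != cpu:
                raise RuntimeError("only the holder may release the spinlock")
            self.depth -= 1
            if self.depth == 0:
                self.owner = None


lock = NestedSpinlockModel()
lock.try_acquire("105N")
lock.try_acquire("105N")           # nested: depth is now 2
lock.release("105N")
print(lock.try_acquire("105A"))    # still held at depth 1
lock.release("105N")
print(lock.try_acquire("105A"))    # free after the second release
```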
One problem that arises in the computer system 100 is that processor 105A, or a subset of the processors 105A-105N, may have an unequal chance at grabbing the spinlock for the particular shared resource. During contention for the spinlock for the particular shared resource, processor 105A may grab the spinlock at almost every attempt, with the other processors 105B-105N being essentially locked out. The advantage to the processor 105A may be an intentional design, or it may be due to a slight flaw in the manufacturing process of the computer system 100.
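How a slight, consistent advantage can lock out the other processors may be illustrated with a toy simulation. The model below is an assumption made for this example and is not taken from any measured system: each processor reaches the spinlock after a base latency plus random jitter, and the earliest arrival wins the joust. The function name `simulate_jousts` and the latency figures are hypothetical.

```python
import random


def simulate_jousts(base_latency, jitter, rounds, seed=0):
    """Count joust wins when the earliest arriving processor always wins.

    base_latency -- dict mapping processor label -> base arrival time
    jitter       -- upper bound of the uniform random delay added per round
    rounds       -- number of jousts to simulate
    """
    rng = random.Random(seed)
    wins = {cpu: 0 for cpu in base_latency}
    for _ in range(rounds):
        arrival = {cpu: base + rng.uniform(0.0, jitter)
                   for cpu, base in base_latency.items()}
        winner = min(arrival, key=arrival.get)   # earliest arrival wins
        wins[winner] += 1
    return wins


# Processor 105A is 5% faster to the spinlock; jitter is smaller than the gap.
latencies = {"105A": 1.00, "105B": 1.05, "105C": 1.05, "105D": 1.05}
print(simulate_jousts(latencies, jitter=0.02, rounds=1000))
```

Because the jitter in this example is smaller than the advantage of processor 105A, the model awards every joust to processor 105A and the other processors never obtain the spinlock.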
One result of the problem described is that while the processor 105A may operate at or near its maximum throughput or efficiency, other processors 105B-105N in the computer system 100 will not operate at or near their maximums. Although the overall processing power of the computer system 100 may be close to a theoretical maximum, it is likely that under these circumstances, operations of the processors 105B-105N other than the processor 105A will be at less than optimum. The computing work of processor 105N still needs to be completed in a timely manner, even if the processor 105A is operating at its maximum.
Various ways of prioritizing which processor 105A-105N may obtain the spinlock have been devised in the prior art. One prior art method is simply to order the processors 105A-105N and go down the list in order, with the next processor 105 on the list being the one that acquires the spinlock next. Other methods have also been devised, but each prior art method has its own drawbacks. What is needed is a flexible method for prioritizing which processor 105A-105N obtains the spinlock, so that all computing work in the computer system 100 is allowed to move forward towards completion without unduly lowering the throughput of the computer system 100. Even better would be a method that works in computer systems 100 having low contention, medium contention, and high contention spinlocks.
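The ordered-list approach mentioned above may be sketched as follows. This is only one possible reading of that prior art method: the function name `next_holder` is hypothetical, and skipping processors that are not currently waiting is an assumption of the sketch rather than something stated in the description.

```python
def next_holder(order, last_holder, waiting):
    """Pick the next processor to acquire the spinlock by going down the list.

    order       -- fixed ordering of processor labels
    last_holder -- label of the processor that held the spinlock last, or None
    waiting     -- set of labels currently requesting the spinlock
    Returns the first waiting processor after last_holder, wrapping around,
    or None if no processor is waiting.
    """
    if not waiting:
        return None
    start = order.index(last_holder) + 1 if last_holder in order else 0
    for offset in range(len(order)):
        cpu = order[(start + offset) % len(order)]
        if cpu in waiting:            # assumed: idle processors are skipped
            return cpu
    return None


order = ["105A", "105B", "105C", "105D"]
print(next_holder(order, "105A", {"105A", "105C"}))   # 105C comes before 105A
print(next_holder(order, "105D", {"105A"}))           # wraps around to 105A
```

One consequence of this scheme is that a processor that has just released the spinlock goes to the back of the line, regardless of how urgently it needs the resource again.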