1. Technical Field
The present invention is directed to an apparatus, method and computer program product for converting simple locks (e.g., disabled mutex spin locks) in a multiprocessor system. More specifically, the present invention is directed to an apparatus, method and computer program product for minimizing the negative effects that occur when such locks are highly contended among processors which may or may not have identical latencies to the memory that represents a given lock.
2. Description of Related Art
Multiprocessing systems are able to process multiple instructions simultaneously. Such systems have increased the number of instructions per cycle that computer systems are able to execute and have added to the speed at which computerized functions are performed.
One type of multiprocessing system is the Symmetric Multiprocessing (SMP) system. In the SMP architecture, multiple processors share the same memory. SMP systems provide scalability, e.g., as business increases, additional processors can be added to absorb the increased transaction volume. SMP systems range from two to as many as 32 or more processors.
SMP systems, however, are fallible in that if one processor fails, the entire SMP system, or node, goes down. In order to avoid such failings, clusters of two or more SMP systems can be used to provide high availability, or fault resilience, in case of failure. That is, if one SMP system fails, the others continue to operate.
In SMP systems, a single processor generally boots the system and loads the SMP operating system, which brings the other processors online. There is only one instance of the operating system and one instance of the application in memory. The operating system uses the processors as a pool of processing resources, all executing simultaneously, either processing data or in an idle loop waiting to do some useful processing work.
SMP speeds up whatever processes can be overlapped. For example, in a desktop computer, SMP speeds up the running of multiple applications simultaneously. If an application is multithreaded, i.e. the application is broken up into multiple threads which allow for concurrent operations within the application itself, then SMP improves the performance of that single application.
Another type of multiprocessing system is the Non-Uniform Memory Access (NUMA) system, or cache coherent Non-Uniform Memory Access (ccNUMA) system. NUMA is a multiprocessing architecture in which memory is separated into close and distant banks. NUMA resembles SMP in that multiple processors share a single memory. In SMP, however, every processor accesses that memory at the same speed, whereas in NUMA, memory on the same processor board as the processor (local memory) is accessed faster than memory on other processor boards (shared memory). As a result, the NUMA architecture scales much better to higher numbers of processors than SMP.
With such multiprocessor systems, processors must contend for access to the shared memory resources. When a processor owns a lock used to serialize access to a shared memory resource, that processor gains control over that shared memory resource. The “lock” gives the processor exclusive access to the resource until the lock is released.
In the Advanced Interactive eXecutive (AIX) environment, the mechanism for obtaining a lock on a system resource is referred to as the simple lock. A simple lock is a mutex mechanism in which the lock acquirer successfully changes the contents of a mutex to acquire ownership of the lock. Mutex (MUTual EXclusion) is a programming flag used to grab and release an object. When data is acquired that cannot be shared or processing is started that cannot be performed simultaneously elsewhere in the system, the mutex is set to “lock,” which blocks other attempts to use it. The mutex is set to “unlock” when the data is no longer needed or the routine is finished.
AIX conventionally stores the thread id of the acquiring thread or interrupt handler in the mutex, along with other bit mapped information that will fit, so that the ownership of the lock can be observed and debugged. The owner, however, knows it owns the lock because it succeeded in changing the value of the mutex, irrespective of the contents. A thread or interrupt handler acquires a simple lock by calling a function which manages the atomic update of the entire contents of the lock word, i.e. the mutex.
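The acquisition rule described above can be sketched with C11 atomics. This is only an illustration and not the AIX implementation: the names sketch_lock_t, sketch_lock_try, sketch_lock_owner and sketch_lock_release are hypothetical, and the encoding in which a lock word of 0 means "unowned" is an assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding for this sketch: a lock word of 0 means "unowned". */
#define SKETCH_LOCK_FREE ((uintptr_t)0)

/* Hypothetical lock type: the lock word is the mutex itself. */
typedef struct {
    _Atomic uintptr_t word;   /* holds the owner's id while the lock is held */
} sketch_lock_t;

/* One acquisition attempt. The caller owns the lock only if it succeeds in
 * atomically changing the word from "free" to its own (nonzero) id. */
static bool sketch_lock_try(sketch_lock_t *l, uintptr_t owner_id)
{
    uintptr_t expected = SKETCH_LOCK_FREE;
    return atomic_compare_exchange_strong(&l->word, &expected, owner_id);
}

/* Debugging aid: the id stored in the word shows who holds the lock. */
static uintptr_t sketch_lock_owner(sketch_lock_t *l)
{
    return atomic_load(&l->word);
}

/* Release: only the current owner may reset the word to "free". */
static void sketch_lock_release(sketch_lock_t *l, uintptr_t owner_id)
{
    assert(atomic_load(&l->word) == owner_id);
    atomic_store(&l->word, SKETCH_LOCK_FREE);
}
```

In this sketch, ownership follows from the successful compare-and-exchange alone; the stored id serves only to make the owner observable, which mirrors the distinction drawn in the text.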
In the multiprocessor environment, a plurality of processors may attempt to access the same shared memory resource by attempting to acquire a mutex lock. Only one is provided exclusive access to the shared memory resource at a time. If the resource is “in use” or locked by another processor, any other requesting processor “spins” on the lock for that resource when it attempts to acquire the lock. While a processor is spinning on a lock for a resource, it is not performing any other work.
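Spinning can be illustrated with a minimal C11 sketch. The names spin_sketch_t, spin_acquire and spin_release are hypothetical. The max_spins bound is an assumption added only so that the example terminates when run on a single thread; a conventional spin lock would keep polling until the lock is released.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin lock for illustration only. */
typedef struct {
    atomic_flag held;   /* set while some processor holds the lock */
} spin_sketch_t;

/* Poll the lock until it is acquired or max_spins attempts have failed.
 * *wasted reports how many polls failed, i.e. work-free iterations. */
static bool spin_acquire(spin_sketch_t *l, unsigned long max_spins,
                         unsigned long *wasted)
{
    assert(wasted != NULL);
    *wasted = 0;
    /* test-and-set returns the previous value: true means "already held". */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        if (++*wasted >= max_spins)
            return false;   /* gave up; the lock is still held by another */
    }
    return true;
}

/* Release the lock so that a spinning processor can acquire it. */
static void spin_release(spin_sketch_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every iteration counted in wasted is a poll during which the spinning processor performs no other work, which is the cost the text attributes to contention.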
As the number of processors in a system increases, the potential for contention for a lock that protects a shared memory resource increases dramatically. In NUMA systems, processors that access a memory location (or mutex) will have different latencies, with the effect being that acquisition of the lock by a processor with a longer latency will be more difficult than by a processor with a shorter latency. This makes normal mutex lock acquisition unfair. A given processor that has a longer latency to a mutex than other processors has a disadvantage with respect to all those other processors. When latencies can be different, a given processor or a number of processors may become unable to acquire a lock due to starvation from processors with shorter latencies.
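The starvation effect can be made concrete with a toy, single-threaded simulation. It is a deliberately crude model and not a measurement of any real system. The function simulate_contention and its parameters are hypothetical, and the model assumes that processor i can test the lock only on ticks that are multiples of latency[i], that the lock is held for a fixed number of ticks, and that when several processors test a free lock on the same tick the lowest-numbered one wins.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 2   /* number of simulated processors (assumption) */

typedef struct {
    unsigned long acquisitions[NPROC];   /* lock acquisitions per processor */
} contention_result_t;

/* Toy model: processor i may test the lock only on ticks divisible by
 * latency[i]; a successful acquirer holds the lock for `hold` ticks. */
static contention_result_t simulate_contention(const unsigned latency[NPROC],
                                               unsigned hold,
                                               unsigned long ticks)
{
    contention_result_t r = { {0} };
    int owner = -1;                 /* -1 means the lock is free */
    unsigned long release_at = 0;   /* tick at which the owner releases */

    assert(hold > 0);
    for (size_t i = 0; i < NPROC; i++)
        assert(latency[i] > 0);

    for (unsigned long t = 1; t <= ticks; t++) {
        if (owner >= 0 && t >= release_at)
            owner = -1;             /* the holder releases the lock */
        /* Processors that can reach the lock this tick race for it;
         * the lowest index wins a tie. */
        for (int i = 0; i < NPROC && owner < 0; i++) {
            if (t % latency[i] == 0) {
                owner = i;
                release_at = t + hold;
                r.acquisitions[i]++;
            }
        }
    }
    return r;
}
```

With latencies of 1 and 4 ticks for processors 0 and 1, a hold time of 3 ticks, and ties going to processor 0, the nearer processor sees every release first and the farther processor never acquires the lock in this model. Even when ties are awarded to the farther processor, it acquires the lock less often than the nearer one.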
Thus, it would be beneficial to have an apparatus, method and computer program product for handling simple locks in a multiprocessor system that is fair to all of the processors of the multiprocessor system.