Many computing environments today utilize multiple processors. For example, a symmetric multiprocessing (“SMP”) environment is one type of computing environment commonly used today. An SMP environment includes two or more processors that are connected to a shared main memory. All of the processors are generally identical, insofar as the processors all utilize common instruction sets and communication protocols, have similar hardware architectures, and are generally provided with similar memory hierarchies.
These processors often contain a small amount of dedicated memory, known as a cache. Caches are used to increase the speed of operation. In a processor having a cache, as information is fetched from main memory and used, it is also stored, along with its address, in a small portion of especially fast memory, usually static random access memory (SRAM). As each new read or write command is issued, the system looks to the fast SRAM (cache) to see if the information exists there. The desired address is compared with the addresses in the cache memory. If an address in the cache memory matches the address sought, then there is a hit (i.e., the information is available in the cache). The information is then accessed in the cache, so that access to main memory is not required, and the command is processed much more rapidly. If the information is not available in the cache, the data is copied from the main memory and stored in the cache for future use.
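The hit-and-miss behavior described above can be illustrated with a minimal software sketch. The sketch assumes a direct-mapped organization with eight slots, which the text does not specify; the names TinyCache, CacheSlot, and kSlots are hypothetical and chosen only for illustration. Real caches perform this comparison in hardware.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One cache entry: the data is stored along with its address.
struct CacheSlot {
    bool valid = false;
    std::size_t address = 0;
    std::uint32_t data = 0;
};

// Hypothetical direct-mapped cache in front of a "main memory" vector.
class TinyCache {
public:
    explicit TinyCache(const std::vector<std::uint32_t>& main_memory)
        : memory_(main_memory) {}

    // Reads the word at `address`, consulting the cache before main memory.
    std::uint32_t read(std::size_t address) {
        assert(address < memory_.size());
        CacheSlot& slot = slots_[address % kSlots];

        // Compare the desired address with the address held in the slot.
        if (slot.valid && slot.address == address) {
            ++hits_;            // hit: main memory is not accessed
            return slot.data;
        }

        // Miss: copy the data from main memory and keep it for future use.
        ++misses_;
        slot.data = memory_[address];
        slot.address = address;
        slot.valid = true;
        return slot.data;
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    static constexpr std::size_t kSlots = 8;  // assumed cache size
    const std::vector<std::uint32_t>& memory_;
    std::array<CacheSlot, kSlots> slots_{};
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

In this sketch, reading the same address twice produces one miss followed by one hit. Reading a second address that maps to the same slot replaces the earlier entry, so a later read of the first address misses again.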
Critical sections of cache are sometimes locked to protect the data. Locking of critical sections is a pervasive and performance-critical operation in operating systems, middleware, and end-user applications. Locks are usually a software convention that gives one entity, such as a processor, process, program, program thread, or the like, access to a data structure or to a code sequence. Once the locking entity owns the lock, no other entity can acquire it. The standard mechanisms for locking involve using shared variables, access to which is protected by architecturally enabled atomic instructions. The shared variables need to be efficiently accessible from all processors in an SMP. As stated above, each processor typically has its own cache. Multiple processors may try to access the same lock at the same time, resulting in potential “hot spotting” of locks.
These locks are called global locks, since each processor has an equal chance of acquiring the lock, as opposed to a local lock, which is usually accessed by only a single processor. Such hot spotting causes the cache line containing the lock word to migrate frequently and inefficiently between processor caches. This migration impairs the efficiency of lock operations. Once a process (running on a processor) acquires a lock, ownership must be made visible to other processors on the SMP; this is typically accomplished via a synchronization instruction. The broadcast or sync instruction is typically quite expensive and does not scale efficiently with the size of the SMP.
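The conventional locking mechanism described above can be sketched as a simple spin lock. This is an illustrative sketch only, not a mechanism taken from any particular system: the names SpinLock and contended_count are hypothetical, and the C++ std::atomic exchange with acquire and release ordering stands in for the architecturally enabled atomic and synchronization instructions.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical global lock: a single shared lock word.
class SpinLock {
public:
    void lock() {
        // exchange() atomically writes true and returns the previous value.
        // Every attempt, even a failed one, writes the lock word, so the
        // cache line holding it moves to the attempting processor's cache.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }

    void unlock() {
        // The release store makes writes performed inside the critical
        // section visible to the next thread that acquires the lock.
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Several threads contend for one lock to update a shared counter.
long contended_count(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    SpinLock lock;
    long counter = 0;  // protected by `lock`
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;  // critical section
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

With four threads each performing 10,000 increments, contended_count returns 40,000, because the lock serializes every update. All four threads repeatedly write the same lock word, which is the kind of access pattern that leads to the hot spotting and cache line migration described above.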
Therefore, a need exists to overcome the problems with the prior art as discussed above.