1. Field of the Invention
This invention is related to the field of processors and, more particularly, to multiprocessor synchronization mechanisms in processors.
2. Description of the Related Art
Processors designed for use in multiprocessing systems typically support some sort of mechanism for synchronizing processes executing on the various processors. For example, certain sections of code may be designated as “critical sections”. Critical sections may update variables shared by the processes, read or write files, etc. Typically, the processes are synchronized such that at most one process at any given time is executing the critical section. As another example, the processes may share certain data areas in memory. Access to the shared data areas may be controlled in a similar fashion, synchronizing such that at most one process has access (or perhaps at most one process has write access, with other processes possibly having read-only access) to the shared data area at any given time.
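As a minimal illustrative sketch of the critical-section idea (not part of the described hardware), the following Python fragment uses threads as stand-ins for processes on separate processors. The names `counter`, `guard`, `worker`, and `run` are hypothetical and chosen only for this example.

```python
import threading

counter = 0                # shared variable updated inside the critical section
guard = threading.Lock()   # admits at most one thread at any given time


def worker(iterations):
    """Repeatedly enter the critical section and update the shared variable."""
    global counter
    for _ in range(iterations):
        with guard:        # entry to the critical section
            counter += 1   # read-modify-write of shared state


def run(num_threads=4, iterations=10000):
    """Start the workers, wait for them, and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(iterations,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because only one thread at a time holds `guard`, no increment is lost, and `run(4, 10000)` yields the full count of 40000 updates.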
Support for synchronization has been provided by processors in the form of an atomic read-modify-write of a memory location. The atomic read-modify-write can be used to implement various synchronization primitives such as test and set, exchange, fetch and add, compare and swap, etc. Synchronization may be managed by using atomic read-modify-writes to designated memory locations to communicate whether or not a critical section or shared data area is available, to indicate which process currently has access to the critical section or shared data area, etc.
Some processors may support atomic read-modify-writes using a lock mechanism. With a lock mechanism, when a processor accesses a memory location, other access to that memory location is prevented until the processor releases the lock. The atomicity of the read-modify-write operation to the memory location is guaranteed by preventing other processors from accessing that memory location. Lock mechanisms may be problematic in practice. For example, if the lock is implemented by locking a resource for accessing memory (e.g. a shared bus), deadlock may result (especially in coherent systems). Lock mechanisms for larger systems (e.g. multiple levels of interconnect between processors) may be problematic to implement.
Another approach for providing an atomic read-modify-write mechanism is the load-linked/store conditional mechanism. In this mechanism, two types of instructions are provided: the load-linked and the store conditional. Generally, a load-linked instruction and a store conditional instruction to the same address are used in pairs. A load-linked instruction operates similarly to a typical load instruction, but also causes the processor to monitor the target address of the load-linked instruction (the address of the data accessed by the load). The store conditional instruction conditionally stores to the target address based on whether or not the target address is updated by another processor/device between the load-linked instruction and the store conditional instruction. Other conditions may cause the store not to occur as well. The store conditional may provide an indication of whether or not the store was performed, which may be tested by subsequent instructions to either branch back to the load-linked instruction to attempt the read-modify-write operation again (if the store was not successfully performed) or to continue processing (if the store was successfully performed). With the load-linked/store conditional mechanism, other processors may access the memory location for which the atomic read-modify-write is being attempted. If another processor modifies the location between the load-linked instruction and the store conditional instruction, the store conditional is not performed and the load-linked/store conditional sequence is repeated. When the store conditional completes successfully, an atomic read-modify-write of the location has been performed.
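The retry loop described above can be illustrated with a small single-threaded software model. Everything here is an assumption made for illustration: the class `Memory`, the per-processor `links` table, and the `interfere` hook (which simulates a store by another processor between the load-linked and the store conditional) are hypothetical and do not describe any particular processor.

```python
class Memory:
    """Toy shared memory with one monitored (linked) address per processor."""

    def __init__(self):
        self.data = {}    # address -> value
        self.links = {}   # processor id -> address currently being monitored

    def load_linked(self, cpu, addr):
        # Read the location and begin monitoring its address for this processor.
        self.links[cpu] = addr
        return self.data.get(addr, 0)

    def store(self, cpu, addr, value):
        # An ordinary store breaks every other processor's link to this address.
        self.data[addr] = value
        for other, linked in list(self.links.items()):
            if other != cpu and linked == addr:
                del self.links[other]

    def store_conditional(self, cpu, addr, value):
        # Perform the store only if this processor's link to addr is still intact.
        if self.links.get(cpu) != addr:
            return False
        self.store(cpu, addr, value)
        del self.links[cpu]
        return True


def atomic_increment(mem, cpu, addr, interfere=None):
    """Increment mem[addr]; return (value before increment, number of attempts)."""
    attempts = 0
    while True:
        attempts += 1
        old = mem.load_linked(cpu, addr)
        if interfere is not None:
            interfere()          # simulated update by another processor
            interfere = None     # interfere only on the first attempt
        if mem.store_conditional(cpu, addr, old + 1):
            return old, attempts
        # The store was not performed: branch back to the load-linked.
```

In this model, an uncontended increment succeeds on the first attempt, while an intervening store by another processor causes the first store conditional to fail and the sequence to be repeated with the newly observed value.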
Since the processor resources for monitoring the target addresses of load-linked instructions are limited, speculative execution of load-linked instructions may be problematic. If a speculative load-linked instruction causes the processor to begin monitoring its target address and the speculative load-linked instruction is later canceled (e.g. due to branch misprediction or exception), a subsequent store conditional instruction may incorrectly complete successfully based on the target address of the speculative load-linked instruction. Similarly, a speculative load-linked instruction may cause the processor to cease monitoring the target address of a previous (non-speculative) load-linked instruction. The store conditional corresponding to the previous load-linked instruction may complete unsuccessfully in this case.
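Both hazards can be reproduced in a toy model that assumes the monitoring resource is a single link register updated as soon as a load-linked executes, whether or not that instruction later retires. The class `Processor` and the two scenario functions are hypothetical and serve only to illustrate the problem; cancellation of a speculative instruction is modeled simply by the fact that nothing restores the link register afterwards.

```python
class Processor:
    """Toy processor with a single link register, updated at execute time."""

    def __init__(self):
        self.memory = {}
        self.link_addr = None   # the only monitored address; None means no link

    def load_linked(self, addr):
        # Naive behavior: any load-linked, speculative or not, overwrites the link.
        self.link_addr = addr
        return self.memory.get(addr, 0)

    def store_conditional(self, addr, value):
        # Succeeds only if the link register still holds this address.
        success = self.link_addr == addr
        if success:
            self.memory[addr] = value
        self.link_addr = None
        return success


def stale_link_scenario():
    """A canceled speculative load-linked leaves a link that a later store uses."""
    cpu = Processor()
    cpu.load_linked(0x40)                   # speculative, later canceled
    return cpu.store_conditional(0x40, 7)   # incorrectly completes successfully


def lost_link_scenario():
    """A canceled speculative load-linked displaces an earlier, valid link."""
    cpu = Processor()
    cpu.load_linked(0x40)                   # non-speculative
    cpu.load_linked(0x80)                   # speculative, later canceled
    return cpu.store_conditional(0x40, 7)   # completes unsuccessfully
```

Under these assumptions, `stale_link_scenario()` returns `True` although no non-speculative load-linked preceded the store conditional, and `lost_link_scenario()` returns `False` although the location was never updated by another processor.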