The present invention relates to computers with shared-memory architectures and, in particular, to an architecture providing improved handling of conflicts that occur in the access of shared data.
Multi-threaded software provides multiple execution “threads” which act like independently executing programs. An advantage of such multi-threaded software is that each thread can be assigned to an independent processor, or to a single processor that provides multi-threaded execution, so that the threads may be executed in parallel for improved speed of execution. For example, a computer server for the Internet may use a multi-threaded server program where each separate client transaction runs as a separate thread.
Each of the threads may need to modify common data shared among the threads. For example, in the implementation of a transaction-based airline reservation system, multiple threads handling reservations for different customers may read and write common data indicating the number of seats available. If the threads are not coordinated in their use of the common data, serious errors can occur. For example, a first thread may read a variable indicating that an airline seat is available and then set that variable to indicate that the seat has been reserved by the thread's client. If a second thread reads the same variable before the first thread sets it, the second thread may, based on that read, erroneously set the variable again, with the result that the seat is double booked.
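The double-booking scenario can be illustrated with a minimal Python sketch. The seat table, the helper names `check_available` and `reserve`, and the client labels are hypothetical, and the two threads are modeled as an explicit interleaving of steps within one thread so that the failure is reproducible rather than dependent on timing.

```python
# Hypothetical shared data: True means the seat is still available.
seats = {"12A": True}
bookings = []

def check_available(seat):
    """Step 1 of a reservation: read the shared variable."""
    return seats[seat]

def reserve(seat, client, saw_available):
    """Step 2: act on the earlier read, which may by now be stale."""
    if saw_available:
        seats[seat] = False
        bookings.append((seat, client))

# Uncoordinated interleaving: both "threads" read before either one writes.
first_saw = check_available("12A")
second_saw = check_available("12A")
reserve("12A", "client-1", first_saw)
reserve("12A", "client-2", second_saw)

print(bookings)  # seat 12A appears twice, once per client
```

Because the second read occurs before the first write, both reservations proceed on the belief that the seat is free.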
To avoid these problems, it is common to use synchronizing instructions to delineate portions of a thread (often called critical sections) where simultaneous execution by more than one thread might be a problem. A common set of synchronizing instructions implements a lock, using a lock variable having one value indicating that it is “held” by a thread and another value indicating that it is available. A thread must acquire the lock before executing the critical section and does so by reading the lock variable and, if the lock is not held by another thread, writing a value to the lock variable indicating that it is held. When the critical section is complete, the thread writes to the lock variable a value indicating that the lock is available again, or “free”.
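A lock-protected critical section of this kind might be sketched as follows, using Python's standard `threading.Lock` in place of a raw lock variable. The seat count, the thread count, and the function names are assumptions chosen for illustration.

```python
import threading

seat_lock = threading.Lock()   # stands in for the lock variable
seats_available = 100          # shared data (hypothetical inventory)

def reserve_one():
    """Critical section: check and decrement the shared seat count."""
    global seats_available
    with seat_lock:            # acquire; released when the block exits
        if seats_available > 0:
            seats_available -= 1
            return True
        return False

def client(results, index, attempts):
    # Each thread records its own success count in its own slot.
    results[index] = sum(reserve_one() for _ in range(attempts))

def run_clients(n_threads=8, attempts=20):
    results = [0] * n_threads
    threads = [threading.Thread(target=client, args=(results, i, attempts))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

booked = run_clients()         # 160 attempts compete for 100 seats
print(booked, seats_available)
```

Because the check and the decrement happen while the lock is held, exactly as many reservations succeed as there are seats, regardless of how the threads are scheduled.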
Typically, the instructions used to acquire the lock are either “atomic instructions”, that is, instructions that, once begun, cannot be interrupted by any other thread, or quasi-atomic instructions, which can be interrupted by another thread but which make such interruption evident to the interrupted thread so that the instructions can be repeated.
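The quasi-atomic case can be pictured with a toy, single-threaded model of a load-linked/store-conditional pair. The class `LinkedCell`, its version counter, and the `interference` hook are hypothetical devices for illustration only; on real hardware the detection of an intervening write is performed by the processor, not by software such as this.

```python
FREE, HELD = 0, 1

class LinkedCell:
    """Toy model of one memory word supporting load-linked/store-conditional."""

    def __init__(self, value):
        self.value = value
        self.version = 0          # bumped on every successful write

    def load_linked(self):
        return self.value, self.version

    def store_conditional(self, new_value, seen_version):
        if self.version != seen_version:
            return False          # an intervening write is made evident
        self.value = new_value
        self.version += 1
        return True

def try_acquire(cell, interference=None):
    """One read-then-write attempt on the lock variable."""
    value, seen = cell.load_linked()
    if value != FREE:
        return False              # lock is held by another thread
    if interference is not None:
        interference()            # models another thread running in between
    return cell.store_conditional(HELD, seen)

def release(cell):
    cell.value = FREE
    cell.version += 1

lock_word = LinkedCell(FREE)

# Attempt 1: a rival acquires the lock between our read and our write.
first = try_acquire(lock_word, interference=lambda: try_acquire(lock_word))
# The rival finishes its critical section and frees the lock.
release(lock_word)
# Attempt 2: nothing intervenes, so the repeated sequence succeeds.
second = try_acquire(lock_word)

print(first, second)
```

The first attempt fails because the rival's write changes the version between the read and the conditional write; the thread learns of the interruption from the failed store and simply repeats the sequence.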
While the mechanism of locking a critical section for use by a single thread effectively solves conflict problems (that is, cases where two threads need to access the same variable and at least one is writing), it can reduce the benefits of parallel execution of threads by forcibly serializing the threads as they wait for a lock. This serialization can be reduced by using a number of different locks associated, for example, with different small portions of shared memory. In this way, the chance of different threads waiting for a lock on a given portion of shared memory is reduced.
Generally, however, the use of multiple locks increases the complexity of the programming process and thus creates a tradeoff between program performance and program development time. Even with multiple locks, serialization of the threads may still occur.
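One way to picture the use of several locks, each guarding a small portion of the shared data, is the following sketch. The class `StripedSeatCounts`, the choice of four locks, and the mapping of a flight number to a lock by remainder are all assumptions made for illustration.

```python
import threading

class StripedSeatCounts:
    """Seat counts per flight, guarded by several locks instead of one."""

    def __init__(self, n_stripes=4):
        self._locks = [threading.Lock() for _ in range(n_stripes)]
        self._counts = [{} for _ in range(n_stripes)]

    def _stripe(self, flight):
        # Each flight number maps to exactly one lock and one sub-table.
        return flight % len(self._locks)

    def add(self, flight, delta):
        i = self._stripe(flight)
        with self._locks[i]:      # only threads on this stripe wait here
            self._counts[i][flight] = self._counts[i].get(flight, 0) + delta

    def get(self, flight):
        i = self._stripe(flight)
        with self._locks[i]:
            return self._counts[i].get(flight, 0)

table = StripedSeatCounts()

def worker(flight, times):
    for _ in range(times):
        table.add(flight, 1)

# Flights 100..103 fall on four different locks; the two threads that
# both update flight 100 still serialize on the same lock.
threads = [threading.Thread(target=worker, args=(flight, 1000))
           for flight in (100, 101, 102, 103, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

totals = {flight: table.get(flight) for flight in (100, 101, 102, 103)}
print(totals)
```

Threads updating different flights rarely wait on one another, while threads updating the same flight are still serialized.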
U.S. patent application Ser. No. 10/037,041 entitled: “Concurrent Execution of Critical Sections by Eliding Ownership of Locks” describes a method of improving the execution of locked critical sections by multiple threads in which the threads do not acquire the lock but speculatively execute the critical section while omitting, or “eliding,” lock acquisition and release. During the speculative execution of the critical section, actual conflicts between threads in the acquisition of data of the critical section are monitored. If no actual conflicts occur, the speculative execution is committed, meaning that the data generated by the execution of the speculative section is written to shared memory.
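The idea of speculative execution with conflict monitoring and commit can be approximated by a small software analogy. This is not the mechanism of the cited application; the per-location version numbers, the classes `SharedMemory` and `Speculation`, and the single-threaded interleaving are assumptions chosen only to make the behavior visible.

```python
class SharedMemory:
    """Shared data with a version number per location."""

    def __init__(self, initial):
        self.values = dict(initial)
        self.versions = {key: 0 for key in initial}

class Speculation:
    """One thread's speculative run of a critical section (no lock taken)."""

    def __init__(self, memory):
        self.memory = memory
        self.seen_versions = {}   # location -> version at first access
        self.write_buffer = {}    # location -> value not yet visible

    def _track(self, key):
        self.seen_versions.setdefault(key, self.memory.versions[key])

    def read(self, key):
        self._track(key)
        if key in self.write_buffer:
            return self.write_buffer[key]
        return self.memory.values[key]

    def write(self, key, value):
        self._track(key)
        self.write_buffer[key] = value

    def commit(self):
        """Publish buffered writes only if no accessed location has changed."""
        for key, seen in self.seen_versions.items():
            if self.memory.versions[key] != seen:
                return False      # actual conflict: discard the speculation
        for key, value in self.write_buffer.items():
            self.memory.values[key] = value
            self.memory.versions[key] += 1
        return True

def reserve(spec, seat, client):
    if spec.read(seat) is None:   # None means the seat is unreserved
        spec.write(seat, client)

memory = SharedMemory({"12A": None, "12B": None, "14C": None})

# Two speculations touching different seats: no actual conflict.
a, b = Speculation(memory), Speculation(memory)
reserve(a, "12A", "client-1")
reserve(b, "12B", "client-2")
disjoint = (a.commit(), b.commit())

# Two speculations touching the same seat: an actual conflict.
c, d = Speculation(memory), Speculation(memory)
reserve(c, "14C", "client-3")
reserve(d, "14C", "client-4")
overlapping = (c.commit(), d.commit())

print(disjoint, overlapping, memory.values)
```

Both speculations on different seats commit even though they ran the same critical section concurrently, whereas only the first of the two speculations on seat 14C commits.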
This lock elision saves some time by avoiding the steps of acquiring and releasing the lock. More importantly, however, lock elision allows multiple threads to simultaneously execute the critical section, without serialization, so long as no actual conflicts in data acquisition occur.
At times, during speculative execution of a critical section under lock elision, there will be an actual conflict between two threads needing to access the same data. When such a conflict is detected, the speculative execution is “squashed” and the threads restart the critical section from the beginning. The threads may retry speculative execution of the critical section, but ultimately the threads revert to actual acquisition of the lock in order to ensure that the critical section can be completed within a reasonable period of time. In these cases of actual conflict between threads, the problems inherent in lock-based synchronization return.
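The retry-then-revert policy can be sketched as a simple control loop. The function `run_critical_section`, the limit of three attempts, and the stand-in `make_speculation`, which merely pretends to be squashed a fixed number of times, are hypothetical and serve only to show the control flow.

```python
import threading

def run_critical_section(speculate, run_locked, lock, max_attempts=3):
    """Try elision a bounded number of times, then really take the lock."""
    for attempt in range(1, max_attempts + 1):
        if speculate():           # True means the speculation committed
            return ("speculative", attempt)
    with lock:                    # revert to actual lock acquisition
        run_locked()
    return ("locked", max_attempts)

lock = threading.Lock()
log = []

def make_speculation(conflicts_before_success):
    """Stand-in for a speculative run that is squashed a given number of times."""
    remaining = [conflicts_before_success]

    def speculate():
        if remaining[0] > 0:
            remaining[0] -= 1
            log.append("squashed")
            return False
        log.append("committed")
        return True

    return speculate

def run_locked():
    log.append("ran under lock")

light = run_critical_section(make_speculation(1), run_locked, lock)
heavy = run_critical_section(make_speculation(5), run_locked, lock)

print(light, heavy, log)
```

Under light contention the critical section completes speculatively on a retry; under heavy contention the bounded retries are exhausted and the thread completes the section while holding the lock, which reintroduces serialization.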