Computer systems typically comprise a combination of hardware (such as semiconductors, transistors, chips, and circuit boards) and computer programs. As increasing numbers of smaller and faster transistors can be integrated on a single chip, new processors are designed to use these transistors effectively to increase performance. Currently, many computer designers opt to use the increasing transistor budget to build ever bigger and more complex uni-processors. Alternatively, multiple processor cores can be placed on a single chip, an approach often called chip multiprocessor (CMP) design.
Placing multiple smaller processor cores on a single chip is attractive because a single, simple processor core is less complex to design and verify. This results in a less costly and less complex verification process, because a once-verified module, the processor core, is repeated multiple times on a chip. One way to take advantage of the multiple processors is to partition sequential computer programs into threads and execute them concurrently and speculatively on the multiple processors. Concurrent thread execution means that any thread of a given program can execute on any available processor, and different threads of the given program can execute on different processors at the same time. Thus, a speculative multi-threaded processor consists logically of replicated processor cores that cooperatively perform the parallel execution of a sequential program.
Speculative thread execution means that the threads are allowed to optimistically assume that shared data structures can be written without conflict with the concurrent reads and writes of other speculative threads. The speculative writes to the shared data structures that are requested by a thread are kept pending (meaning that the updated data written by one thread is not visible to or accessible by other threads) until the system confirms that no conflicts with accesses to the shared data structures by other threads have occurred. If conflicts with the storage accesses of other threads are detected, the system discards the pending speculative writes, rolls back the thread, and re-executes the thread. If no conflicts are detected, the system commits the pending speculative writes to memory, where the updated shared data structures become visible to and accessible by other threads.
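The buffer, check, and commit-or-roll-back cycle described above can be illustrated with a minimal software sketch. It is not the hardware mechanism itself: the names SharedMemory, SpeculativeThread, and run_speculatively are hypothetical, and the per-address version counter is an assumed, simplified conflict detector that flags only the case where another thread has committed to an address this thread has read.

```python
from dataclasses import dataclass, field


@dataclass
class SharedMemory:
    """Committed state visible to every thread (hypothetical model)."""
    data: dict = field(default_factory=dict)
    versions: dict = field(default_factory=dict)  # address -> number of commits


@dataclass
class SpeculativeThread:
    """One speculative thread with private, pending writes (hypothetical model)."""
    memory: SharedMemory
    read_versions: dict = field(default_factory=dict)   # address -> version observed
    pending_writes: dict = field(default_factory=dict)  # address -> buffered value

    def read(self, addr):
        # A thread sees its own pending writes first, then committed memory.
        if addr in self.pending_writes:
            return self.pending_writes[addr]
        self.read_versions.setdefault(addr, self.memory.versions.get(addr, 0))
        return self.memory.data.get(addr)

    def write(self, addr, value):
        # Speculative writes stay private until they are committed.
        self.pending_writes[addr] = value

    def has_conflict(self):
        # Assumed rule: a conflict exists if any address this thread read
        # has been committed by another thread since the read.
        return any(self.memory.versions.get(addr, 0) != seen
                   for addr, seen in self.read_versions.items())

    def _reset(self):
        self.read_versions.clear()
        self.pending_writes.clear()

    def try_commit(self):
        if self.has_conflict():
            self._reset()  # discard pending writes (roll back)
            return False
        for addr, value in self.pending_writes.items():
            self.memory.data[addr] = value
            self.memory.versions[addr] = self.memory.versions.get(addr, 0) + 1
        self._reset()
        return True


def run_speculatively(thread, body):
    """Re-execute body until its writes commit; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        body(thread)
        if thread.try_commit():
            return attempts
```

In this sketch, a thread that calls write() leaves SharedMemory.data unchanged until try_commit() succeeds, mirroring the requirement that pending writes remain invisible to other threads. A failed try_commit() discards the buffer, and run_speculatively() then re-executes the thread body against the newly committed state.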
Conflicts between threads are reported by the hardware to the operating system in the form of interrupts. These interrupts are asynchronous and can arrive at the operating system in an order different from the order in which the hardware sent them, due to factors such as on-chip wire delay and asymmetric distances from the processor cores to the memory.