1. Field of the Invention
This invention relates to the field of computer systems, and more particularly to verification of compliance with memory consistency models for multiprocessor systems.
2. Description of the Related Art
Shared memory multiprocessor computer system architectures have become a common solution for complex computing needs, such as are often encountered in computer network servers and telecommunications applications. A typical shared memory multiprocessor computing system includes two or more processors that access shared memory. The same physical address on different processors typically refers to the same location in the shared memory. In shared memory architectures, a memory consistency model typically specifies the semantics of memory operations to coordinate accesses by multiple processors to the shared memory. A memory model effectively establishes a contract between the programmer and the hardware. Thus, both programs and hardware in a shared memory multiprocessor system must be correct with respect to the memory model definition for proper operation. Memory models can have a significant impact on ease of programming and on the optimizations that the hardware or the compiler can perform.
One example of a memory consistency model is the Total Store Order (“TSO”) memory model developed by Sun Microsystems, Inc. The TSO memory model specification defines the semantics of load, store and atomic memory operations (such as swap operations) in uniprocessor or multiprocessor systems from the point of view of program results. TSO defines two types of orders over the set of memory operations: a per processor program order denoting the sequence in which the processor logically executes instructions, and a global memory order conforming to the order in which operations are performed at the memory.
Memory operations are ordered by six TSO axioms: the Order, Atomicity, Termination, LoadOp, StoreStore and Value axioms. The Order axiom requires that there be a total order over all stores. The Atomicity axiom requires that there be no intervening stores between the load component and the store component of an atomic memory operation such as a swap. The Termination axiom requires that all stores and swaps eventually terminate. That is, if one processor of a multiprocessor stores to a particular memory location and another processor repeatedly loads from that location, there will eventually be a load that reads the value stored by the first processor. The LoadOp axiom requires that if an operation follows a load in per processor program order, then the operation must also follow the load in global memory order. The StoreStore axiom requires that if two stores appear in a particular order in per processor program order, then they must also appear in the same order in global memory order. Informally, the LoadOp and StoreStore axioms together imply that under TSO, the only kind of reordering allowed between operations on the same processor is for loads to overtake stores, i.e., a load that follows a store in program order may precede it in global memory order. The Value axiom requires that the value returned by a load from a particular memory location is the value written to that memory location by the last store in global memory order, among the set of stores preceding the load in either global memory order or program order. The Value axiom thus allows a load to read the value written by an earlier store on the same processor, before that store has completed in global memory order. This permits processor implementations with store buffers, for example, to locally bypass data from a store to a load, before the store is globally visible.
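By way of illustration only, the LoadOp, StoreStore and Value axioms described above can be sketched as a small checker that tests one candidate global memory order against the per processor program order. The `Op` record, the `check_tso` function and the initial memory value of 0 are hypothetical names and assumptions introduced for this sketch; atomic operations and the Termination axiom are not modeled, and the Order axiom holds trivially because the candidate order is supplied as a list.

```python
from collections import namedtuple

# Hypothetical record for one memory operation (an assumption of this sketch):
# proc = processor id, idx = position in that processor's program order,
# kind = "load" or "store", addr = memory location,
# val = value written (store) or value returned (load).
Op = namedtuple("Op", "proc idx kind addr val")


def check_tso(global_order, init=0):
    """Return the sorted names of the axioms violated by a candidate order."""
    pos = {op: i for i, op in enumerate(global_order)}
    violations = set()

    # LoadOp and StoreStore: examine each pair (a, b) on the same processor
    # where a precedes b in program order.
    for a in global_order:
        for b in global_order:
            if a.proc != b.proc or a.idx >= b.idx:
                continue
            if a.kind == "load" and pos[a] > pos[b]:
                violations.add("LoadOp")
            if a.kind == "store" and b.kind == "store" and pos[a] > pos[b]:
                violations.add("StoreStore")

    # Value: a load returns the value of the last store in global order among
    # the stores that precede it in global order or in program order.
    for ld in global_order:
        if ld.kind != "load":
            continue
        candidates = [
            s for s in global_order
            if s.kind == "store" and s.addr == ld.addr
            and (pos[s] < pos[ld] or (s.proc == ld.proc and s.idx < ld.idx))
        ]
        expected = max(candidates, key=pos.get).val if candidates else init
        if ld.val != expected:
            violations.add("Value")

    return sorted(violations)


# Processor 0 stores x then y; processor 1 loads y (sees 1) then x (sees 0).
sx = Op(0, 0, "store", "x", 1)
sy = Op(0, 1, "store", "y", 1)
ly = Op(1, 0, "load", "y", 1)
lx = Op(1, 1, "load", "x", 0)
print(check_tso([sx, sy, ly, lx]))  # ['Value']
print(check_tso([sy, ly, lx, sx]))  # ['StoreStore']

# A load that overtakes an earlier store on the same processor is allowed,
# and it may read that store's value before the store is globally ordered.
st = Op(0, 0, "store", "x", 1)
ld = Op(0, 1, "load", "x", 1)
print(check_tso([ld, st]))  # []
```

The sketch checks a single candidate order. Showing that observed results are not TSO-compliant would require that no candidate global memory order pass the check.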
In a multiprocessor supporting the TSO memory consistency model, a violation of a TSO axiom by a sequence of memory operations may indicate a design problem or bug.
One difficulty with advanced shared memory multiprocessor architectures is that design problems or bugs are difficult to find, isolate and correct. The memory subsystem is among the most complex parts of modern multiprocessor architectures, especially of architectures employing chip multiprocessing (CMP) or simultaneous multithreading (SMT), and therefore among the most bug-prone. Undetected bugs result in improper operations that often lead to system failures and that delay new design releases or, worse, require post-release patches. It is often difficult to determine the validity of program execution results in the presence of race conditions. Since the results of the program may be timing-dependent, multiple legal outcomes may exist, and a simple architectural model of the multiprocessor may not be sufficient to verify that the results comply with the memory consistency model. Existing techniques for verifying program execution results may require analysis steps of relatively high computational complexity. As a result, cost and time constraints associated with typical processor design cycles tend to limit the use of the existing techniques to relatively small programs and/or relatively small multiprocessors.
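As a hedged illustration of why multiple legal outcomes may exist, the following sketch exhaustively explores a simple operational model in which each processor has a first-in, first-out store buffer that drains to memory at arbitrary times. This store-buffer model is an assumption chosen for the example, not a technique described above; the function `tso_outcomes`, the instruction encoding and the register names are likewise hypothetical. The number of reachable states grows rapidly with program length and processor count, which suggests why exhaustive analysis is practical only for small tests.

```python
def tso_outcomes(programs, init=0):
    """Enumerate the final register values reachable under a store-buffer model.

    programs: one tuple of instructions per processor, each instruction being
    ("st", addr, value) or ("ld", addr, register_name).
    """
    n = len(programs)
    # State: program counters, store buffers, memory contents, register values.
    start = (tuple(0 for _ in programs), tuple(() for _ in programs), (), ())
    outcomes, seen, stack = set(), set(), [start]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem, regs = state
        done = all(pcs[p] == len(programs[p]) for p in range(n))
        if done and not any(bufs):
            outcomes.add(regs)
            continue
        for p in range(n):
            if bufs[p]:
                # Drain the oldest buffered store of processor p to memory.
                (addr, val), rest = bufs[p][0], bufs[p][1:]
                new_mem = tuple(sorted({**dict(mem), addr: val}.items()))
                new_bufs = bufs[:p] + (rest,) + bufs[p + 1:]
                stack.append((pcs, new_bufs, new_mem, regs))
            if pcs[p] < len(programs[p]):
                kind, addr, x = programs[p][pcs[p]]
                new_pcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
                if kind == "st":
                    # A store enters the local buffer; it is not yet visible
                    # to other processors.
                    new_bufs = bufs[:p] + (bufs[p] + ((addr, x),),) + bufs[p + 1:]
                    stack.append((new_pcs, new_bufs, mem, regs))
                else:
                    # A load reads the newest matching entry in the local
                    # buffer if there is one, otherwise it reads memory.
                    fwd = [v for a, v in bufs[p] if a == addr]
                    val = fwd[-1] if fwd else dict(mem).get(addr, init)
                    new_regs = tuple(sorted(regs + ((x, val),)))
                    stack.append((new_pcs, bufs, mem, new_regs))
    return outcomes


# Each processor stores to one location and then loads the other location.
store_buffering = (
    (("st", "x", 1), ("ld", "y", "r0")),
    (("st", "y", 1), ("ld", "x", "r1")),
)
for outcome in sorted(tso_outcomes(store_buffering)):
    print(dict(outcome))
```

Under the assumptions of this model, the two-processor test above has four distinct legal outcomes, including the one in which both loads return 0, so a verifier cannot compare the observed results against a single expected answer.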