1. Field of the Invention
The present invention relates to memory systems for computers, and more particularly to improved memory models for use in a multiprocessor data processing system.
2. Art Background
A multiprocessor system includes a number of processors connected to a memory system. Processors typically interact with the memory system using Loads, Stores, and synchronization operations such as atomic Load-Store. When running a program, processors also execute other operations, such as adding the contents of one register to another register or performing a subroutine call; however, these operations do not affect the behavior of the memory system as observed by the processors. This behavior of the memory system as observed by the processors is referred to as the "memory model".
A "specification" of the memory model is a description of how the memory system ought to behave. The main purpose of such a specification is to allow hardware designers and programmers to work independently, while still ensuring that any program will work as intended on any implementation of a computer system that conforms to the specification. Ideally, a specification should be "formal", such that conformance to the specification can be verified at some level. In practice, however, in many instances the specifications are "informal" or even nonexistent, in which case a particular hardware implementation becomes the specification of the memory model by default.
Memory is modeled as an N-port device, where N is the number of processors (see FIG. 1). The memory model applies to single-processor as well as multiprocessor systems. A processor communicates with the memory system by issuing memory operations through its respective port. As illustrated in FIG. 1, processor P1 communicates with the memory system 10 through its respective Port 11. Similarly, processor PN communicates with Memory 10 through Port N.
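The N-port view of memory described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the specification: the class names Memory and Port, the per-port log named issued, and the choice to apply each operation to memory immediately are all assumptions made for the example.

```python
class Memory:
    """Shared memory modeled as a device with one port per processor."""

    def __init__(self, num_ports):
        self.cells = {}                               # address -> value; unwritten addresses read as 0
        self.issued = [[] for _ in range(num_ports)]  # per-port log, kept in issuing order

    def port(self, index):
        """Return the port reserved for processor number `index`."""
        return Port(self, index)


class Port:
    """The single channel through which one processor issues memory operations."""

    def __init__(self, memory, index):
        self.memory = memory
        self.index = index

    def store(self, addr, value):
        # Assumption: a store takes effect in memory as soon as it is issued.
        self.memory.issued[self.index].append(("store", addr, value))
        self.memory.cells[addr] = value

    def load(self, addr):
        value = self.memory.cells.get(addr, 0)
        self.memory.issued[self.index].append(("load", addr, value))
        return value


memory = Memory(num_ports=2)
p1, p2 = memory.port(0), memory.port(1)
p1.store("x", 7)
print(p2.load("x"))  # 7
```

Each processor touches memory only through its own port, so the per-port log records that processor's issuing order. A memory model then states which values a Load may return given those per-port orders.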
A memory model may range anywhere from Strong (or Sequential) Consistency to different types of weak consistency. Strong Consistency is the memory model that most programmers are familiar with. In a Strong Consistency model, the memory operations of all processors appear to execute in a single global order that is compatible with the issuing order of the individual processors. While this model is intuitively appealing and generally understood, it is also the one that provides the worst performance, particularly when the computer system includes numerous processors. For further information on the Strong Consistency model, see L. Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", IEEE Trans. on Computers, September 1979.
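The "single global order" property of Strong Consistency can be illustrated by enumerating every global order of a small two-processor program and collecting the results. The following Python sketch is illustrative only: the two-operation programs P1 and P2, the register names r1 and r2, and the tuple encoding of operations are hypothetical and chosen for the example.

```python
from itertools import combinations


def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's internal order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)          # positions in the global order taken by a
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(order):
    """Execute one global order against a single shared memory; return (r1, r2)."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, addr, arg in order:
        if op == "store":
            memory[addr] = arg
        else:                           # "load": arg names the destination register
            registers[arg] = memory[addr]
    return registers["r1"], registers["r2"]


# Hypothetical test program: each processor stores to one location, then loads the other.
P1 = [("store", "x", 1), ("load", "y", "r1")]
P2 = [("store", "y", 1), ("load", "x", "r2")]

outcomes = {run(order) for order in interleavings(P1, P2)}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

The outcome in which both Loads return 0 does not appear, because no single global order that respects both issuing orders can produce it.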
Weaker memory models were developed to allow more efficient implementations of scalable multiprocessors. Unfortunately, weak memory models are more difficult to understand than Strong Consistency, and they constrain the way parallel software programs can be written. Implementing weaker memory models also requires considerably more care on the part of hardware designers, and using them requires a conscious effort on the part of programmers to avoid relying on the model provided by Strong Consistency.
Thus, the choice of a memory model involves a trade-off between what is convenient for programming and what provides the potential for high performance in hardware. For more information on this trade-off, see J. Hennessy et al., "Hardware/Software Tradeoffs for Increased Performance", Proc. Symp. Architectural Support for Programming Languages and Operating Systems, (1982) pp. 2-11.
As will be described, the present invention provides a formal specification for two improved memory models, Total Store Ordering (TSO) and Partial Store Ordering (PSO), both of which are "strong" enough to be convenient to program and "weak" enough to provide high performance.