Distributed computer systems typically comprise multiple computers connected to each other by a communications network. In some distributed computer systems, the networked computers can concurrently access shared data. Such systems are sometimes known as parallel computers. If a large number of computers are networked, the distributed system is considered to be “massively” parallel. An advantage of parallel computers is that they can solve complex computational problems in a reasonable amount of time.
In such systems, the memories of the computers are collectively known as a distributed shared memory. One problem is ensuring that the data stored in the distributed shared memory are accessed in a coherent manner. Coherency, in part, means that only one computer can modify any part of the data at any one time; otherwise, the state of the data would be nondeterministic.
Some distributed computer systems maintain data coherency using specialized control hardware. The control hardware may require modifications to the components of the system such as the processors, their caches, memories, buses, and the network. In many cases, the individual computers may need to be identical or similar in design, which means they are homogeneous.
Consequently, hardware controlled shared memories are generally costly to implement. In addition, such systems may be difficult to scale. Scaling means that the same design can be used to conveniently build smaller or larger systems.
More recently, shared memory distributed systems have been configured using conventional workstations or PCs connected by a conventional network as a heterogeneous distributed system. In such systems, data access and coherency control are typically provided by software-implemented message passing protocols. The protocols define how fixed size data blocks and coherency control information are communicated over the network. Procedures that activate the protocols can be called by “miss check code.” The miss check code is added to the programs by an automated process.
States of the shared data can be maintained in state tables stored in memories of each processor or workstation. Prior to executing an access instruction, e.g., a load or a store instruction, the state table is examined by the miss check code to determine if the access is valid. If the access is valid, the access instruction can execute; otherwise, the protocols define the actions to be taken before the access instruction is executed. The actions can be performed by protocol functions called by the miss handling code.
The calls to the miss handling code can be inserted into the programs before every access instruction by an automated process known as instrumentation. Instrumentation can be performed on executable images of the programs.
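The check-before-access scheme described above can be illustrated with a minimal sketch. It is not taken from any particular system: the block size, the three state names, the Node class, and the _protocol_fetch stand-in are all hypothetical, and the real message passing protocol is reduced to a counter.

```python
from enum import Enum

BLOCK_SIZE = 64  # hypothetical fixed block size, in bytes


class State(Enum):
    # Hypothetical per-block states; real protocols may define others.
    INVALID = 0
    SHARED = 1      # local copy may be read
    EXCLUSIVE = 2   # local copy may be read and written


class Node:
    """One workstation: a local copy of shared memory plus a state table."""

    def __init__(self, num_blocks):
        self.memory = bytearray(num_blocks * BLOCK_SIZE)
        self.state = [State.INVALID] * num_blocks
        self.misses = 0  # number of times the protocol was invoked

    def _protocol_fetch(self, block, want):
        # Stand-in for the protocol functions: a real system would
        # exchange messages over the network here.
        self.misses += 1
        self.state[block] = want

    def load(self, addr):
        block = addr // BLOCK_SIZE
        # Miss check that instrumentation would insert before a load.
        if self.state[block] is State.INVALID:
            self._protocol_fetch(block, State.SHARED)
        return self.memory[addr]

    def store(self, addr, value):
        block = addr // BLOCK_SIZE
        # Miss check that instrumentation would insert before a store.
        if self.state[block] is not State.EXCLUSIVE:
            self._protocol_fetch(block, State.EXCLUSIVE)
        self.memory[addr] = value
```

In this sketch, a second load from the same block does not invoke the protocol again, while a store to a block held only for reading does, because the state table says the access is not yet valid.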
U.S. Pat. No. 5,761,729, entitled Validation Checking of Shared Memory Accesses, issued Jun. 2, 1998, discloses a method for providing valid memory between processors or input/output interfaces connected to processors, all of which access the shared memory within the distributed computer environment. The method provides instrumentation to initialize the bytes allocated for the shared data structure to a predetermined flag value. The flag value indicates that the data are in an invalid state.
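The flag value idea can be sketched as follows. This is an illustration only, not the patented method: the FLAG byte and the function names allocate_shared and looks_invalid are hypothetical choices made for the example.

```python
FLAG = 0xFD  # hypothetical predetermined flag byte


def allocate_shared(size):
    """Allocate a shared region with every byte set to the flag value."""
    return bytearray([FLAG]) * size


def looks_invalid(region, offset, length):
    """Return True if the bytes at the given location still hold the flag."""
    return all(b == FLAG for b in region[offset:offset + length])
```

Note that legitimate data whose bytes happen to equal the flag value would also be reported as invalid by this check; a complete scheme would need some way to handle that case, which the sketch leaves out.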
Unfortunately, the prior art system is directed towards a generic solution for covering shared memory accesses within a distributed computer environment. It is not able to correct or provide management for specific processors that follow specific read/write ordering rules, such as the Alpha AXP microprocessor, manufactured by Digital Equipment Corporation, Maynard, Mass.
The Alpha AXP processor can be used in a single processor environment or in multiple processor environments such as a distributed computer environment. Additionally, the Alpha AXP processor is considered to be in a multi-processor environment when it includes a single processor with a direct memory access (DMA) input/output (I/O). In a multi-processor data stream, the Alpha AXP communicates shared data by writing the shared data on one processor or DMA I/O device, executing a memory barrier (MB) or a write MB (WMB), and then writing a flag signaling the other processor that the shared data are ready. Each receiving processor must read the new flag, execute an MB, and then read or update the shared data. In the special case in which data are communicated through just one location in memory, memory barriers are not necessary.
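The reason the writer needs a barrier between the data write and the flag write can be shown with a toy model. This is a simulation under stated assumptions, not Alpha AXP code: the WeakWriter class, its store buffer, and the adversarial drain_one_reordered step are hypothetical devices that stand in for hardware that may make stores visible out of order. The reader-side MB is not modeled.

```python
class WeakWriter:
    """Toy writer whose stores sit in a buffer until they are drained."""

    def __init__(self, memory):
        self.memory = memory  # dict visible to readers
        self.pending = []     # buffered (address, value) stores

    def store(self, addr, value):
        self.pending.append((addr, value))

    def mb(self):
        # Memory barrier: every earlier store becomes visible
        # before any later store can.
        for addr, value in self.pending:
            self.memory[addr] = value
        self.pending.clear()

    def drain_one_reordered(self):
        # Adversarial case: the newest buffered store becomes visible first.
        addr, value = self.pending.pop()
        self.memory[addr] = value


def publish(writer, value, use_barrier):
    writer.store("data", value)   # write the shared data
    if use_barrier:
        writer.mb()               # MB (or WMB) between data and flag
    writer.store("flag", 1)       # signal that the data are ready
    writer.drain_one_reordered()  # the flag becomes visible


def try_read(memory):
    # Reader: check the flag, then read the data.
    if memory.get("flag") == 1:
        return memory.get("data")
    return None
```

With use_barrier=True the reader sees the new value once the flag is set. With use_barrier=False the flag can become visible while the data store is still buffered, so the reader observes the flag and then reads stale data.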
In one significant special case, a write is done to some physical page frame, an MB is executed, and a previously invalid page table entry (PTE) is changed to be a valid mapping of the physical page frame that was just written. In this case, all processors that access virtual memory by using the newly valid PTE must guarantee to deliver the newly written data after the translation buffer (TB) miss, for both I-stream and D-stream accesses, where I represents instructions and D represents data.
The overall operation of the Alpha AXP processor is described in ALPHA AXP ARCHITECTURE REFERENCE MANUAL, Second Edition, Sites and Witek, Published by Digital Press, 1995, incorporated by reference for all purposes.
Unfortunately, this multi-processor synchronization is very expensive in terms of performance. Without the synchronization, data may be corrupted when a page fault occurs on one CPU and the recently faulted data is immediately referenced from another CPU. The execution of the MB instruction in the TB-Miss flow after fetching the PTE forces a memory coherency point. Any outstanding cache coherency operations are completed prior to using the PTE to fetch data. The resulting penalty can degrade performance on the Alpha AXP processor by up to 20%. Moreover, the memory ordering is not necessary in all cases. Accordingly, what is needed is a way of limiting ordering to only those threads or processes that are actually sharing the PTE affected by the initial MB.