1. Technical Field of the Invention
The present invention pertains in general to shared memory network architectures, and more particularly, to an improved technique for shared memory between multiple nodes within a network of symmetrical processors.
2. Description of Related Art
For large-scale parallel processing applications employing a shared memory programming model, maximum performance is typically obtained on a multiprocessor by implementing hardware cache-coherence. Large cache-coherent machines having more processors than can fit on a single bus have historically been expensive to implement due to the need for special-purpose cache controllers, directories and network interfaces. As a result, many researchers have explored software cache-coherence techniques, often based on virtual memory, to support a shared memory programming model on a network of commodity machines. In the past, however, such Software Distributed Shared Memory (SDSM) systems have not provided sufficient performance-to-cost ratios to make them an attractive alternative to high-end hardware.
Recent technological advances have produced inexpensive local area networks which allow processors in one node to modify the memory of other nodes safely from the user space with very low latency. Furthermore, small and medium scale symmetric multiprocessors are becoming commodity items and are receiving a growing acceptance for their use as database and web servers, multi-media work stations, etc. Given economies of scale, a networked system of small symmetric multiprocessors on a low latency network is becoming a highly attractive platform for large shared memory parallel programs. Symmetric multiprocessor nodes reduce the number of coherence operations which must be handled in software while low latency networks reduce the time which programs must wait for those operations to complete.
Although software shared memory has been an active area of research for many years, it is only recently that protocols for such clustered systems have begun to develop. The challenge for such a system is to reconcile the hardware-implemented coherence within symmetric multiprocessor nodes with the software-implemented coherence among the nodes. Such reconciliation requires that each processor in a node in the networked system be synchronized each time one of the nodes exchanges coherence information with another node.
The present invention overcomes the foregoing and other problems with a method and apparatus for maintaining coherent data between nodes of a symmetric multiprocessor (SMP) cluster. Each node within the network contains local memory which includes a working copy storage area for storing copies of groups of data on which processing operations are directly carried out. A twin copy storage area stores twin copies of the groups of data within the working copy storage area. The twin copies are updated only at selected periods and comprise the state of the particular group of data prior to the most recent local modifications of the data. Finally, a home node storage area within the local memory stores home node copies of groups of data. Only a single home node copy of each group of data exists within the entire shared memory network. The home node copies are utilized for gathering changes to a group of data which may be made at multiple nodes within the network. It should be noted that the home node and working copy storage areas are preferably the same areas. A node will not create working copies of pages for which that node serves as the home node.
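By way of illustration only, the three storage areas of a node may be sketched as follows. The class name `NodeMemory`, the field names, the use of integer page identifiers, and the representation of a group of data as a list of integer words are all assumptions made for this sketch and do not appear in the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: a "group of data" (page) is modeled as a list of integer words.
Page = List[int]


@dataclass
class NodeMemory:
    """Hypothetical per-node layout of the three storage areas."""
    working: Dict[int, Page] = field(default_factory=dict)  # copies processed directly
    twins: Dict[int, Page] = field(default_factory=dict)    # state before recent local changes
    home: Dict[int, Page] = field(default_factory=dict)     # single network-wide home copies

    def processing_copy(self, page_id: int) -> Page:
        # A node creates no working copy of a page it is home to,
        # so processing is carried out on the home node copy itself.
        if page_id in self.home:
            return self.home[page_id]
        return self.working[page_id]
```

In this sketch a lookup through `processing_copy` returns the home node copy when the node is the home of the page, which reflects the preference that the home node and working copy storage areas coincide.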
Processors associated with the node and its local memory monitor operations, generated in response to a controlling program, that affect the status of the various groups of data and their copies throughout the network. Upon detection of particular types of events that alter the status of a group of data, modifications to the working, twin and home node copies of a group of data may be implemented. For example, the initiation of a fetch operation of a home node copy of a group of data from a remote node location is one such operation. Upon detection of a fetch operation, a comparison is made between the fetched home node copy of the particular group of data and the twin copy of the group of data stored within the local node. The comparison detects modifications that have been made to the home node copy but are not presently reflected by the twin copy. These changes are written into both the twin copy of the group of data and the working copy of the group of data at the local node such that the copies being processed by the local node contain all current information.
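The fetch-time comparison described above may be sketched as follows. This is a minimal illustration only: the function name `apply_incoming_diff`, the word-by-word comparison granularity, and the list-of-integers page representation are assumptions and are not taken from the description.

```python
from typing import List


def apply_incoming_diff(fetched_home: List[int],
                        twin: List[int],
                        working: List[int]) -> List[int]:
    """Write words that differ between the fetched home copy and the twin
    into both the twin and the working copy. Returns the changed indices."""
    changed = []
    for i in range(len(twin)):
        if fetched_home[i] != twin[i]:
            # The home copy holds a modification the twin does not reflect.
            twin[i] = fetched_home[i]
            working[i] = fetched_home[i]
            changed.append(i)
    return changed


# Example: index 1 was modified locally, index 2 was modified at another node.
twin = [1, 2, 3, 4]
working = [1, 9, 3, 4]
fetched_home = [1, 2, 7, 4]
changed = apply_incoming_diff(fetched_home, twin, working)
```

Because only the words that differ between the fetched home node copy and the twin copy are written, a local modification in the working copy that has not yet been propagated (index 1 in the example) is left in place.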
Another status change which may initiate updating of the stored copies involves detection of a page fault caused by a write operation directed to a working copy of a particular group of data. In this situation, the working copy of the group of data to which the write operation has been directed is compared with the twin copy of the group of data stored at the same node to detect any modifications made to the particular group of data since the last update operation of the twin copy. Differences detected by this comparison are entered into the existing twin copy. The differences detected by the comparison are also written to the home node copy of the group of data to ensure that all copies are sufficiently updated.
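The comparison of the working copy against the twin copy may be sketched in the same manner. The function name `flush_outgoing_diff` and the word-level, list-based page representation are assumptions made for illustration; in particular, the home node copy is modeled here as a local list, whereas in a networked system the write to the home node would travel over the network.

```python
from typing import List


def flush_outgoing_diff(working: List[int],
                        twin: List[int],
                        home: List[int]) -> List[int]:
    """Write words that differ between the working copy and the twin
    into both the twin and the home copy. Returns the changed indices."""
    changed = []
    for i in range(len(working)):
        if working[i] != twin[i]:
            # A local modification made since the twin was last updated.
            twin[i] = working[i]
            home[i] = working[i]
            changed.append(i)
    return changed


# Example: index 1 was modified locally; the home copy already holds a
# change at index 2 that was made by another node.
working = [1, 9, 3, 4]
twin = [1, 2, 3, 4]
home = [1, 2, 7, 4]
changed = flush_outgoing_diff(working, twin, home)
```

Only the locally modified words are written to the home node copy, so a change gathered there from another node (index 2 in the example) is not overwritten.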
Prior to any comparisons by the processor controlling the above-described operations, an initial determination may be made to find out whether the home node copy has been modified since the detected change in status. This is accomplished by comparing a time stamp of the most recent write operation of the twin copy of the group of data to a time stamp of the most recent fetch operation of the twin copy of the group of data. If the write operation occurred more recently than the fetch operation, modifications exist which have not been updated to the twin copy and updating is necessary. Each group of data within the system includes time stamps indicating the last occurrence of a write or fetch operation to enable these comparisons.
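The time stamp determination may be sketched as a simple comparison. The names `PageTimestamps`, `last_write`, `last_fetch` and `needs_update`, and the use of integer logical time stamps, are assumptions for this sketch. The description does not state how equal time stamps are treated; the sketch assumes that equal values require no update.

```python
from dataclasses import dataclass


@dataclass
class PageTimestamps:
    """Hypothetical per-page time stamps (integer logical clock values)."""
    last_write: int  # most recent write operation
    last_fetch: int  # most recent fetch operation


def needs_update(ts: PageTimestamps) -> bool:
    # Updating is necessary only if a write occurred more recently than
    # the last fetch. Assumption: equal time stamps mean no update.
    return ts.last_write > ts.last_fetch


# Example: a write at time 5 after a fetch at time 3 requires an update.
stale = needs_update(PageTimestamps(last_write=5, last_fetch=3))
```

Performing this check first allows the more expensive comparison of copies to be skipped when no write has occurred since the last fetch.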
Accordingly, it is an object of the present invention to provide a software coherent shared memory system for a network of symmetric multiprocessors.
It is also an object of the present invention that such a software coherent shared memory system be highly asynchronous requiring no global directory locks or intra-node TLB shootdowns.
Yet another object of the present invention is that such a software coherent shared memory system will maintain twin copies of modified pages to reflect the state of those pages prior to any present modifications.
It is still further an object of the present invention to provide further advantages and features, which will become apparent to those skilled in the art from the disclosure, including the preferred embodiment, which shall be described below.
It is a further object of the present invention that such a software coherent shared memory system will minimize overhead incurred by data transfer, directory accesses, locking, and other protocol operations.