The present invention relates to a technique for sharing data in physically distributed memories in a computer network system, by a plurality of applications running on a plurality of computers interconnected by a network. More particularly, the invention relates to a distributed shared memory management system which is capable of efficiently using network resources and of improving the access performance to distributed shared memories.
A virtual entity executing an application on a computer is called a process. If a plurality of processes are generated and the resources of a single CPU (central processing unit) are time divisionally allocated to each process, it becomes possible to time divisionally execute processes greater in number than the number of CPUs. This scheme is called time division processing. Executing a plurality of processes at the same time by using a plurality of CPUs is called parallel processing.
In time division processing using a single CPU computer, a shared memory has been used for data exchange between processes. For example, the shared memory of UNIX System V is known, as described in "UNIX System Call--Programming, by Marc J. Rochkind, translated by Toshihiro FUKUZAKI, Ascii Publications, 1987" at pages 299 to 315.
In time division processing, processes are not physically executed at the same time. Therefore, an improved throughput in executing an application constituted by a plurality of processes cannot be expected. In a computer network system having a plurality of computers interconnected by a network, therefore, an application is processed in parallel by using a plurality of CPUs in order to improve the throughput in executing the application.
In a computer network system, shared memories are physically distributed, so that it is necessary to ensure coherence between data stored in the memories of the computers, and to allow each application to see the shared data as virtually a single copy. This technique uses what are called distributed shared memories.
In a computer utilizing a virtual memory configuration using a paging structure, it is possible to realize distributed shared memories by extending the paging structure. Specifically, if accessed data is not in the memory, a page fault is issued, and a page which includes the data is read from an auxiliary storage and loaded in the memory. In a distributed shared memory system, when a page fault is issued, data can be loaded from another computer to the subject computer via the network instead of loading a page from the auxiliary storage.
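The extended page fault handling described above can be illustrated with a minimal simulation. This is a sketch only, not the mechanism of any particular system: the names Node, read, and remote_loads, and the 4 KB page size, are assumptions introduced for illustration, and the network is modeled as a direct reference to peer objects.

```python
PAGE_SIZE = 4096  # bytes; illustrative value, not taken from any specific system


class Node:
    """Hypothetical model of one computer holding some shared pages locally."""

    def __init__(self, name):
        self.name = name
        self.pages = {}        # page number -> bytearray with the page contents
        self.peers = []        # other Node objects reachable over the "network"
        self.remote_loads = 0  # pages fetched from another computer

    def read(self, address):
        page_no, offset = divmod(address, PAGE_SIZE)
        if page_no not in self.pages:
            # The accessed data is not in local memory: a page fault is issued.
            self._handle_fault(page_no)
        return self.pages[page_no][offset]

    def _handle_fault(self, page_no):
        # Distributed shared memory case: load the page from another
        # computer via the network instead of from auxiliary storage.
        for peer in self.peers:
            if page_no in peer.pages:
                self.pages[page_no] = bytearray(peer.pages[page_no])
                self.remote_loads += 1
                return
        # Assumption: if no peer holds the page, a zero-filled page stands
        # in for the page that would be read from auxiliary storage.
        self.pages[page_no] = bytearray(PAGE_SIZE)


a, b = Node("A"), Node("B")
a.peers = [b]
b.pages[2] = bytearray(PAGE_SIZE)
b.pages[2][5] = 42
value = a.read(2 * PAGE_SIZE + 5)  # faults once, then fetches page 2 from B
```

A second read of the same address by computer A finds the page in local memory, so no further transfer occurs.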
Examples of conventional distributed shared memory techniques include the Ivy system (Yale University, U.S.A.) and the DASH system (Stanford University, U.S.A.).
In the Ivy system, each page has its owner, and the owner computer manages the names of computers which store copies of the page. Only the owner computer is permitted to write data in the page. When a read fault occurs at a computer, the owner computer transmits a copy of the page. When a write fault occurs at a computer, this computer becomes a new owner computer, and copies of the page at other computers are invalidated.
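The ownership rules just described can be sketched for a single page as follows. This is a simplified model under stated assumptions, not the Ivy implementation: the class IvyPage and its fields are hypothetical names, and every write by a non-owner is counted as one page transfer.

```python
class IvyPage:
    """Hypothetical model of one shared page under an owner-based protocol."""

    def __init__(self, owner):
        self.owner = owner    # the only computer permitted to write the page
        self.copies = set()   # computers holding copies, as tracked by the owner
        self.transfers = 0    # page transfers over the network

    def holders(self):
        return {self.owner} | self.copies

    def read(self, computer):
        if computer in self.holders():
            return            # page already present locally, no fault
        # Read fault: the owner computer transmits a copy of the page.
        self.copies.add(computer)
        self.transfers += 1

    def write(self, computer):
        if computer != self.owner:
            # Write fault: the faulting computer becomes the new owner.
            # Assumption: this always costs one page transfer.
            self.owner = computer
            self.transfers += 1
        # Copies of the page at other computers are invalidated.
        self.copies.clear()


page = IvyPage(owner="A")
page.read("B")    # B receives a read copy from owner A
page.write("C")   # C becomes the owner; the copies at A and B are invalidated
```

After these two accesses only computer C holds the page, and two transfers have been counted.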
The DASH system manages distributed shared memories in a manner similar to the Ivy system, except that 16-byte data is used as the data unit.
The details of these conventional techniques are given in "Distributed Operating System, Next to Come after UNIX, edited by Mamoru MAEKAWA, Mario TOKORO, and Kentaro SHIMIZU, Kyoritsu Publications, 1991" at pages 124 to 141.
In order to ensure coherence of the contents of distributed shared memories, data in the memories is exchanged between computers interconnected by a network. The data transfer size and the data transfer timing are therefore important issues in a conventional distributed shared memory management system.
The Ivy system has the following considerations.
The first consideration is associated with the data transfer size. The unit of data transferred between computers for data exchange is a page. Therefore, if computers share data (e.g., 8 bytes) smaller than the page size (e.g., 4 KB), unnecessary data is transferred between computers and the efficiency of usage of network resources is substantially lowered.
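The loss of efficiency can be quantified with a short calculation. The function name useful_fraction is an assumption introduced for illustration; the 8-byte and 4 KB figures are the examples given above.

```python
def useful_fraction(shared_bytes, unit_bytes):
    """Fraction of the transferred bytes that is actually shared data."""
    units = -(-shared_bytes // unit_bytes)  # ceiling division: whole units sent
    return shared_bytes / (units * unit_bytes)


fraction = useful_fraction(8, 4096)  # 8 bytes of shared data in a 4 KB page
wasted = 4096 - 8                    # bytes transferred but not needed
```

With these figures, about 0.2% of each transferred page is useful data and 4088 bytes of every transfer are unnecessary.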
In addition, if two different data sets having a high access frequency by computers A and B are stored in the same page, the page is transferred between the computers A and B each time one of the computers A and B writes data in the page. This page transfer between computers over the network upon a memory access by each computer lowers the performance of memory access.
The second consideration is associated with the data transfer timing. A page fault is issued if the page of shared data is not in the subject computer during its data read/write operation, and the shared data is transferred from another computer.
Therefore, if the computers A and B consecutively write data in a shared data field, in the worst case the page is transferred between the computers each time one of them writes data in the shared data field. The cost of data transfer between computers over a network is large as compared to the memory access speed in the subject computer, thereby lowering the performance of shared data access.
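The worst case can be made concrete by counting page transfers for a given order of writes. This is a minimal sketch assuming that the page moves to the writer whenever the writer is not the current owner; the function name count_page_transfers is hypothetical.

```python
def count_page_transfers(writers, initial_owner):
    """Count transfers of one page for a sequence of writing computers."""
    owner = initial_owner
    transfers = 0
    for writer in writers:
        if writer != owner:
            # Write fault: the page moves to the writing computer.
            transfers += 1
            owner = writer
    return transfers


alternating = count_page_transfers(["A", "B"] * 4, initial_owner="A")
grouped = count_page_transfers(["A"] * 4 + ["B"] * 4, initial_owner="A")
```

For the same eight writes, the alternating order causes seven page transfers while the grouped order causes one, which shows that the transfer timing, not only the transfer size, determines the network cost.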
The DASH system aims at solving the first consideration of the Ivy system. That is to say, the shared data unit is made small in order to reduce the unnecessary data transfer caused by a large data unit. Although unnecessary data transfer can be reduced, access faults occur frequently if data access to a large memory area is performed. With frequent access faults, the data transfer time between computers over a network becomes important. The DASH system has proposed a specific dedicated network in order to shorten the transfer time of small size data. However, this dedicated network raises the cost because it is different from a general network architecture.
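The trade-off can be illustrated by counting the access faults needed to touch a large memory area once, assuming one fault per data unit that is not yet present locally. The function name and the 1 MB area size are assumptions chosen for illustration; the 16-byte and 4 KB units are the figures given above.

```python
def faults_for_area(area_bytes, unit_bytes):
    """Access faults needed to bring in a whole area, one fault per unit."""
    return -(-area_bytes // unit_bytes)  # ceiling division


AREA = 1024 * 1024                        # 1 MB area, illustrative size
small_unit = faults_for_area(AREA, 16)    # 16-byte data unit
page_unit = faults_for_area(AREA, 4096)   # 4 KB page
```

Under this assumption the 16-byte unit incurs 65536 faults against 256 for the 4 KB page, so the per-transfer latency of the network dominates the access time.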
The second consideration of the Ivy system, that the performance is lowered when a plurality of computers consecutively access shared data, cannot be solved by the DASH system.
Because of the above-described considerations, if the conventional distributed shared memory management system is applied to typical area division type parallel processing, data transfer over the network occurs frequently, making it impossible to obtain good speed-up from parallel processing. Furthermore, if, for example, a data structure having a pointer variable element is shared by a plurality of computers, the shared data is dispersed and stored in the memories. In such a case too, data transfer over the network occurs frequently.