Data processing needs have increased from the development of the multiprocessor system to increasingly sophisticated multi-computer arrangements. In the multiprocessor realm, the processors interact in performing programs and sharing resources such as input and output devices. Generally, the multiple processors in such a system share data which is resident in a common memory. In addition, each processor may have sole access to an additional local cache memory for non-shared data. With regard to that memory which is shared, the various processors must compete for access to the data, which results in inefficiencies of both time and resources.
Multi-computer systems frequently operate with multiple nodes each having a local copy of the shared memory data. In order to keep the shared memories consistent, each node has direct access to the other nodes' stored data, or "physical memory", so that one computer can simultaneously write updates to all locations of the shared data. A problem with the shared physical memory concept is that there may be a conflict between the nodes if more than one attempts to write to the same location in memory. Moreover, the process of a receiving node must remain inactive while a writing node's update is being written into the receiving node's memory space. In addition, in a system with more than two nodes, wherein only one multiple location shared memory update can be written at a time, it is not only possible, but quite likely, that at any given time corresponding memory locations at different nodes will contain different memory values, some having been updated and others not. The sharing of memories has generally required that the shared data be held at a fixed location or locations in each node to facilitate direct access to the memory location(s) by another node in the network. This sharing of fixed location physical memory, however, prevents an individual node from most efficiently allocating its own memory space. Extending the memory sharing concept beyond two machines is rather difficult given the foregoing obstacles.
Interprocessor communications are limited by the bandwidth capabilities of the network as well as the coordination problems discussed above. Many of the newly proposed high speed computer network architectures are moving away from bus and ring based physical connections to central switch-based connections. A switching network architecture promises the higher data transmission speed and network throughput required to run multimedia applications on distributed systems. Moreover, network architectures based on fast switching technology may substantially eliminate the interprocessor communications bottleneck which plagues commercial computer networks from personal computers to supercomputer systems. Eliminating the interprocessor communications bottleneck requires network data transmission at SRAM access rates, which in turn requires the solution of two problems of approximately equal importance. These problems can best be described with reference to FIG. 1, which shows two computers or nodes, 10 and 11, connected by links to a network. Each of the computers consists of a processor (P), 12 and 13, a memory location (M) for the shared memory, 14 and 15, and a network adapter (NA), 16 and 17. The lines labeled "A", 18 and 19, connecting the memory to the network adapter on each computer, represent the bandwidth from the user memory address space to the output port of the network adapter hardware. The lines labeled "B", illustrated as 20, represent the bandwidth available through the network, that is, from the output port of the network adapter, for example 16, to the corresponding input port of another network adapter, 17 herein. Typically, the goal of interprocessor communication is to move data from one computer's memory space to one or more other computers' memory spaces. Hence, the bandwidths represented by "A" and "B" are both relevant to network performance.
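As a rough illustration of why both bandwidths matter, the sketch below treats the path of FIG. 1 as three stages in series (sender-side "A", network "B", receiver-side "A") and takes the sustained rate to be that of the slowest stage. The function name and the rates in megabytes per second are hypothetical, chosen only for illustration, and are not taken from the source.

```python
def end_to_end_bandwidth(a_sender, b_network, a_receiver):
    """Sustained rate of a serial path is bounded by its slowest stage."""
    return min(a_sender, b_network, a_receiver)


# Hypothetical rates in MB/s, chosen only for illustration.
# Case 1: the network "B" is fast, so the internal path "A" is the limit.
fast_network = end_to_end_bandwidth(a_sender=20, b_network=100, a_receiver=20)

# Case 2: the network "B" is slow, so "B" is the limit.
slow_network = end_to_end_bandwidth(a_sender=20, b_network=5, a_receiver=20)
```

Under these assumed figures, raising "B" alone beyond 20 MB/s would not improve memory-to-memory throughput, which is why the two problems are treated as being of comparable importance.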
The first problem is the network bandwidth problem of how to readily and reliably multicast data over a potentially large network, as would be required for updates to multiple distributed shared memory locations. Unless the bandwidth, "B", is preserved across the network, the network transmission will remain the bottleneck for distributed systems. Once data is presented to the network, how it is routed and delivered to a designated set of network destinations is critical to eliminating network "congestion". The sharing of distributed memory eliminates memory-read communications between interconnected machines, since each site in the network has a copy of the memory. In order to maintain the most current copy of the memory at each location, however, memory-writes or updates must be multicast throughout the network whenever they occur. The multicasting of memory updates can rapidly create backlogs within the network. The second problem encountered when confronting the interprocessor communications bottleneck is the process-to-process internal bandwidth problem within each node. One needs to optimize the loading of data to and the storing of data from the cable connecting the node's memory, and hence the processor, to the network, line "A" at 18 and 19 of FIG. 1. If user processes do not have access to the full internal bandwidth, then connections to and from the network will remain a bottleneck. Much attention has been directed to solving the network bandwidth problem, resulting in most networks having the property that "B" is much larger than "A". There remains the problem of how the user processes or memory spaces on nodes and personal computers can internally take advantage of rapidly improving network performance.
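The trade-off between local reads and multicast writes can be sketched as follows. This is a minimal model under stated assumptions, not the mechanism of any particular system: the `Node` and `Network` classes, the message counter, and the four-node configuration are all hypothetical.

```python
class Node:
    """One computer holding its own local copy of the shared memory."""

    def __init__(self, name):
        self.name = name
        self.memory = {}  # address -> value


class Network:
    """Hypothetical network that counts the update messages it carries."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.messages_sent = 0

    def read(self, node, address):
        # A read is served from the local copy: no network traffic.
        return node.memory.get(address)

    def write(self, writer, address, value):
        # A write updates the local copy, then is multicast to every
        # other node so that all copies stay current.
        writer.memory[address] = value
        for node in self.nodes:
            if node is not writer:
                node.memory[address] = value
                self.messages_sent += 1


nodes = [Node(f"n{i}") for i in range(4)]
net = Network(nodes)
net.write(nodes[0], 0x10, 42)           # one write, three update messages
value_at_n3 = net.read(nodes[3], 0x10)  # local read, no messages
```

In this model each write costs one message per remote copy, so the update traffic grows with the number of nodes, which is one way the backlogs mentioned above can arise.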
Areas of interest for improving the large-scale networking of computers include addressing the following concerns: the bandwidth capabilities of the network; the process-to-process internal bandwidth available to the individual processor; the generating, broadcasting, queuing and receiving of shared memory writes; and the interconnection of a wide variety of locations having the same or different operating systems.
A system which includes distributed nodes in a network, each node having a location for the shared distributed memory and the ability to write into the other shared memory locations is described in European Patent application number 89117024.3, entitled "Oblivious Memory Computer Networking" which was published on Mar. 28, 1990 as EPO publication number 0 360 153, and which corresponds to U.S. patent application Ser. No. 07/249,645, which issued as U.S. Pat. No. 5,276,806. The communications network which is described therein provides for any machine in the network to write into the location of the distributed shared memory of another machine. The teachings provide file registers associated with the shared memory locations, which registers contain the physical addresses of each of the corresponding locations of shared memory in each of the interconnected machines. A write operation to a linked, so-called victim, node is to the physical memory whose address is resident in the file register associated with the addressing, or host, unit. The network location is memory-mapped to the physical memory location such that the host computer "looks up" the location of the desired file and prepares an address packet to the victim location in the distributed shared memory. The 0 360 153 system improves upon the prior art by providing for the transmission of data in the network at main memory speeds, primarily through the elimination of acknowledgment messages, thereby minimizing the network bandwidth problem. By eliminating the acknowledgment mechanism, the teachings adhere to a weak consistency scheme, whereby the requirement of strict consistency among the shared memory locations is forgone. A further enhancement found in the 0 360 153 publication is that the host, or sending, processor is not required to suspend operations while the victim processor receives and processes the update.
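The physical-address lookup described above might be modeled as in the sketch below. It is only an illustration under assumptions: the `Host` and `Victim` classes, the layout of the file register as a dictionary, and the packet format of `(physical_address, value)` are hypothetical and are not drawn from the 0 360 153 publication.

```python
from dataclasses import dataclass, field


@dataclass
class Victim:
    """Receiving node; its memory is written directly by physical address."""

    physical_memory: list = field(default_factory=lambda: [0] * 16)

    def receive(self, packet):
        # The update is stored at the physical address carried in the
        # packet. No acknowledgment is returned to the host.
        physical_address, value = packet
        self.physical_memory[physical_address] = value


@dataclass
class Host:
    """Sending node holding the (assumed) file register of physical addresses."""

    # shared location name -> {victim name: physical address at that victim}
    file_register: dict

    def write(self, victims, location, value):
        # Look up each victim's physical address and send an address packet.
        for name, physical_address in self.file_register[location].items():
            victims[name].receive((physical_address, value))
        # The host returns at once and does not wait for any reply.


victims = {"a": Victim(), "b": Victim()}
host = Host(file_register={"counter": {"a": 3, "b": 7}})
host.write(victims, "counter", 99)
```

Note that in this model the host must know where each victim keeps the shared location, so a victim cannot move the data without every host's register being changed.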
The actual network transmission and the internal memory store transaction, from network adapter, for example 16, to memory space, 14 of computer 10, is conducted without involvement, or even "awareness" of the host processor. The write process in the weak consistency model allows the host processor to be oblivious to all actions subsequent to the sending of the update. A drawback to the weak consistency model is that the updates to the shared memory location must be implemented in the order in which they are generated. The memory-mapped interrupt mechanism found in the 0 360 153 teaching is also problematic because it is non-selective. If a host wants to generate an interrupt exclusively on a single victim, e.g., the leader process, to request some network service, it must raise an interrupt on all of the interconnected network machines. The interrupt mechanism is important, as noted above, for initializing data transfers or requesting services wherein the target of the interrupt is only one processor. Although there are instances in which one would need to generate a universal interrupt, it would be desirable to have the capability of generating either selective or multicast interrupts in a network, allowing as many entities as possible to continue operating in spite of ongoing network communications. An additional drawback to the 0 360 153 system is that in order to facilitate the memory updates, the physical memory locations are "mapped" into registers located in each of the interconnected nodes. The physical location of the shared memory cannot be changed by the local processor, thereby limiting its ability to most advantageously allocate its memory space.
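The ordering drawback noted above can be shown with a small example. With no acknowledgments, a victim simply applies each update as it arrives, so two updates to the same location give a different final value if they arrive in the wrong order. The `apply_updates` helper and the address and values used here are hypothetical.

```python
def apply_updates(updates):
    """Apply (address, value) updates in arrival order; later ones overwrite."""
    memory = {}
    for address, value in updates:
        memory[address] = value
    return memory


# Two updates to the same location, in the order the host generated them.
generated = [(0x20, 1), (0x20, 2)]

# The same updates delivered in reverse order.
reordered = list(reversed(generated))

in_order_result = apply_updates(generated)
reordered_result = apply_updates(reordered)
```

Here the in-order victim ends with the newer value 2 at location 0x20, while the victim that received the updates reversed is left with the stale value 1.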
A system developed by Nigel Llewelyn Vince, and disclosed on May 11, 1988 in European Patent Publication 0 092 895, based upon European Patent Application number 83311516.8, discloses a means for accessing the memory of an associated processor using a virtual address. The file registers associated with the host processor, and resident in each of the interconnected processors, contain a virtual address for each of the memory locations. The host processor assembles the write packet containing the virtual address and the update. The victim processor receives the packet and subsequently "decodes" the virtual address in order to access the physical location of the memory to be updated. Such a scheme allows each processor location to reallocate its physical memory space to optimize same, without the necessity of updating all network physical address tables. All incoming packets must be received in the order in which they were generated. Vince teaches the preferred embodiment utilizing a token ring which necessitates packet ordering. Although the extended teachings in the Vince application discuss the use of interconnection networks other than a token ring, it is required therein that the packet ordering feature be preserved, by the use of counters or other means. Such a packet ordering requirement limits the ultimate speed of an interconnection system.
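The benefit of decoding a virtual address at the victim can be sketched as follows. This is a simplified model under assumptions, not the Vince implementation: the `VictimNode` class, its translation dictionary, and the `relocate` method are hypothetical names introduced for illustration.

```python
class VictimNode:
    """Receiving node that decodes virtual addresses with a local table."""

    def __init__(self, size=16):
        self.physical_memory = [0] * size
        self.translation = {}  # virtual address -> physical address (local only)

    def map(self, virtual_address, physical_address):
        self.translation[virtual_address] = physical_address

    def relocate(self, virtual_address, new_physical_address):
        # Move the data and update only the local table; no other node
        # needs to be told, because packets carry virtual addresses.
        old = self.translation[virtual_address]
        self.physical_memory[new_physical_address] = self.physical_memory[old]
        self.physical_memory[old] = 0
        self.translation[virtual_address] = new_physical_address

    def receive(self, packet):
        # "Decode" the virtual address, then store the update.
        virtual_address, value = packet
        self.physical_memory[self.translation[virtual_address]] = value

    def load(self, virtual_address):
        return self.physical_memory[self.translation[virtual_address]]


victim = VictimNode()
victim.map(0x100, 2)
victim.receive((0x100, 5))   # host writes using the virtual address
victim.relocate(0x100, 9)    # victim reallocates its physical memory
victim.receive((0x100, 6))   # host sends the same kind of packet as before
latest = victim.load(0x100)
```

The host's packets are identical before and after the relocation, which is the property that frees each node to reallocate its own physical memory.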
The capability of virtual memory mapping to provide access to not only physical but also virtual memory addresses throughout the network can increase the efficiency of the network tremendously. Extending the virtual memory addressing concept to direct network access to a virtual shared memory would facilitate network communications and permit instantaneous memory updates without the necessity of processor intervention. In addition, under such a system, the virtual memory space can be significantly larger than the actual underlying physical memory.
It is therefore an objective of the present invention to provide virtual shared memory throughout a widely distributed network.
It is a further objective of the present invention to provide network access to a virtual shared memory atomically without processor operating system involvement.
Another objective of the invention is to provide virtual memory mapping and accessing capabilities to an existing network.
A further objective is to provide interconnection between nodes running on different operating systems and having different main memory byte orders.
Still another objective is to provide dynamic reconfiguration capabilities to an existing network.