This invention relates to digital data communications networks and more particularly to packet switched networks for use, for example, with high-speed distributed multiprocessing systems.
Multiple instruction/multiple data (MIMD) parallel processing computers can be categorized according to address space, memory organization and memory hierarchy. With respect to address space, a system may be characterized as utilizing a single or a multiple address space. A single address space is commonly referred to as shared memory and implies implicit communication as part of any memory access. Multiple private address spaces imply explicit communication by, for example, message passing.
Memory organization may be characterized as centralized or distributed. In a system based on centralized memory, a common memory element is centrally located, with access time to any physical memory location being the same for all processors. In a distributed memory system, on the other hand, system memory is divided into modules with some placed near each processor.
Memory hierarchy may be characterized as static, mixed or dynamic. A hierarchy is classified as static where system data is divided among local and/or global memory units, with each datum having an explicitly assigned address based at least in part on the locale of the unit in which it is stored. In a fully dynamic system, on the other hand, an individual datum is not assigned a physical memory location-based address by which it is accessed. Mixed memory hierarchies combine the two: a portion of the memory hierarchy is dynamic (e.g., a cache memory) and another portion is static local and/or global memory.
The art provides a number of architectures where a single address space, i.e., shared memory, is organized in a centralized manner. Processing units in such systems communicate via high-bandwidth shared buses or switching networks. A problem with such architectures is that the shared memory forms a bottleneck, impeding system performance, except in instances where there are relatively few processors.
To avoid this problem, Frank et al, U.S. Pat. No. 4,622,631, discloses a multiprocessing system in which a plurality of processors, each having an associated private memory, or cache, share data contained in a main memory element. Data within that common memory is partitioned into blocks, each of which can be owned by any one of the main memory and the plural processors. The current owner of a data block is said to have the correct data for that block.
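The block ownership idea described above can be illustrated with a minimal sketch. It is not the protocol of the cited patent; the class name OwnedBlockStore, the method names and the transfer rules are hypothetical and assumed only for illustration. The sketch shows a single point: each block has exactly one owner, either main memory or one processor's private cache, and a read is satisfied from whichever element currently owns the block.

```python
MAIN_MEMORY = "main"  # hypothetical label for the main memory element


class OwnedBlockStore:
    """Toy model: every block is owned by main memory or by one processor."""

    def __init__(self, num_blocks, processors):
        self.main = [0] * num_blocks                 # main memory contents
        self.caches = {p: {} for p in processors}    # private cache per processor
        self.owner = [MAIN_MEMORY] * num_blocks      # current owner of each block

    def read(self, block):
        """Return the correct data, i.e. the copy held by the current owner."""
        current = self.owner[block]
        if current == MAIN_MEMORY:
            return self.main[block]
        return self.caches[current][block]

    def acquire(self, proc, block):
        """Move ownership of a block into proc's private cache (assumed rule)."""
        value = self.read(block)
        previous = self.owner[block]
        if previous not in (MAIN_MEMORY, proc):
            del self.caches[previous][block]         # previous owner gives it up
        self.caches[proc][block] = value
        self.owner[block] = proc

    def write(self, proc, block, value):
        """Only the owner modifies a block; acquire ownership first if needed."""
        if self.owner[block] != proc:
            self.acquire(proc, block)
        self.caches[proc][block] = value

    def release(self, proc, block):
        """Write the block back and return ownership to main memory."""
        if self.owner[block] == proc:
            self.main[block] = self.caches[proc].pop(block)
            self.owner[block] = MAIN_MEMORY
```

In this sketch main memory may hold a stale value while a processor owns the block, which is why reads are directed to the owner rather than to main memory.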
While the solution suggested in the aforementioned patent permits an increase in the number of processors which can be supported without a bottleneck, the centralized memory system is not scalable.
In order to achieve scalability, a distributed memory organization must be used since it theoretically allows parallel high bandwidth access to memory to grow in proportion to the number of processor and memory modules. The art provides two alternative programming models for such an organization: the multiple address space model and the single address space model. Both models present a dilemma for computer system designers and programmers.
From the programmer's viewpoint, the single address architecture is a simpler programming model since data movement is implicit in memory operations. In the multiple address architecture, by contrast, explicit message passing is required. The multiple address architecture, moreover, requires explicit data localization, explicit memory allocation and deallocation, explicit replication and explicit coherency to produce a correct parallel program. These aspects are theoretically handled implicitly in a single address architecture.
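The difference between the two programming models can be sketched as follows. This is an illustration only, assuming Python threads as stand-ins for processors; the function names are hypothetical. Because threads in one process share memory in any case, the second function merely imitates private address spaces by allowing the workers to communicate only through an explicit mailbox.

```python
import queue
import threading


def _halves(values):
    mid = len(values) // 2
    return values[:mid], values[mid:]


def shared_memory_sum(values):
    """Single address space: workers update one shared location directly."""
    total = {"sum": 0}
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:
            total["sum"] += partial      # communication is an ordinary memory access

    threads = [threading.Thread(target=worker, args=(c,)) for c in _halves(values)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["sum"]


def message_passing_sum(values):
    """Multiple address spaces (imitated): results travel only as messages."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))          # explicit send

    threads = [threading.Thread(target=worker, args=(c,)) for c in _halves(values)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mailbox.get() + mailbox.get() # explicit receive, once per worker
```

In the first function the programmer writes no communication code beyond the memory update itself; in the second, every exchange of data is a visible send or receive that the programmer must place correctly.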
As discovered by designers in the prior art, from a hardware perspective simultaneous access to a single address space, i.e., a logically shared memory, is prohibitively expensive. This is due to the complexity of the switching network, as well as to the lack of a general high performance solution for memory coherency and the potential complexity of any such solution. In consequence, most scalable distributed memory systems, such as Intel/IPSC, have historically implemented a multiple address architecture.
The performance of a single address architecture is dependent on the memory hierarchy. A static memory hierarchy requires the programmer to explicitly manage the movement of data for optimal performance in a manner similar to the multiple address architecture. Two examples of a distributed memory organization which implement a single address architecture using a static memory hierarchy are the BBN Butterfly and IBM RP3. Such implementations require the programmer to explicitly manage coherency.
Mixed hierarchies necessarily include communication bottlenecks which limit scalability, while still requiring the programmer to partially manage data movement. One such hierarchical approach is disclosed by Wilson Jr. et al, United Kingdom Patent Application No. 2,178,205, wherein a multiprocessing system is said to include distributed cache memory elements coupled with one another over a first bus. A second, higher level cache memory, attached to the first bus and to either a still higher level cache or to the main system memory, retains copies of every memory location in the caches below it. The still higher level caches, if any, and system main memory, in turn, retain copies of each memory location of the caches below them. The Wilson Jr. et al processors are understood to transmit modified copies of data from their own dedicated caches to associated higher level caches and to the system main memory, while concurrently signalling other caches to invalidate their own copies of that newly-modified data.
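The hierarchical behaviour attributed above to the Wilson Jr. et al system can be sketched in simplified form. This is a sketch under stated assumptions and not the disclosed implementation: the class names Level2Cache and ProcessorCache are hypothetical, only two levels are modelled, main memory is a plain dictionary, and the invalidation that the reference performs concurrently is performed here sequentially.

```python
class Level2Cache:
    """Higher level cache holding a copy of every location cached below it."""

    def __init__(self, main_memory):
        self.main = main_memory      # dict standing in for system main memory
        self.lines = {}
        self.children = []           # lower level (per-processor) caches

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.main.get(addr, 0)
        return self.lines[addr]

    def write_from(self, writer, addr, value):
        self.lines[addr] = value     # modified copy propagates upward
        self.main[addr] = value      # and on to main memory
        for child in self.children:
            if child is not writer:
                child.lines.pop(addr, None)   # invalidate stale copies


class ProcessorCache:
    """Dedicated cache of one processor, attached to a higher level cache."""

    def __init__(self, parent):
        self.parent = parent
        self.lines = {}
        parent.children.append(self)

    def read(self, addr):
        if addr not in self.lines:   # miss: fetch a copy from the level above
            self.lines[addr] = self.parent.read(addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.parent.write_from(self, addr, value)
```

Every write in this sketch passes through the single higher level cache, which illustrates why such a hierarchy concentrates traffic at its upper levels.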
Notwithstanding solutions proposed by the prior art, none has achieved a high performance, fully dynamic, coherent shared memory programming environment with unlimited scalability.
An object of this invention is to provide such a system. More particularly, an object of the invention is to provide a multiple instruction/multiple data parallel processing system utilizing a shared memory addressing model and a distributed organization with improved coherency.
A further object is to provide a fully dynamic memory hierarchy achieving high performance within a distributed system utilizing a shared memory address model.
A still further object is to provide an improved digital data communications network.
Yet another object is to provide a packet switched network for use, for example, in high-speed distributed multiprocessing systems.
Still another object is to provide an improved switching mechanism for use in routing data and data requests through a digital communications network.