This invention relates to digital data processing systems and, more particularly, to multiprocessing systems with distributed hierarchical memory architectures.
The art provides a number of configurations for coupling the processing units of multiprocessing systems. Among the earlier designs, processing units that shared data stored in system memory banks were coupled to those banks via high-bandwidth shared buses or switching networks. During periods of heavy usage, bottlenecks were likely to develop as multiple processing units simultaneously contended for access to the shared data.
To reduce the risk of such transmission bottlenecks, distributed memory systems were developed that couple individual processing units with local memory elements to form semi-autonomous processing cells. To achieve the benefits of multiprocessing, some of the more recently designed systems establish communication among cells through hierarchical architectures.
The distributed memory systems, however, permit multiple copies of single data items to reside within multiple processing cells; hence, it is difficult to ensure that all processing cells maintain identical copies of like data elements. Conventional efforts to resolve this problem, i.e., to preserve data coherency, rely upon software-oriented techniques utilizing complex signalling mechanisms.
To avoid processing and signalling overhead associated with these software oriented solutions, Frank et al, U.S. Pat. No. 4,622,631, discloses a multiprocessing system in which a plurality of processors, each having an associated private memory, or cache, share data contained in a main memory element. Data within that common memory is partitioned into blocks, each of which can be owned by any one of the main memory and the plural processors. The current owner of a data block is said to have the correct data for that block.
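By way of illustration only, the block-ownership idea described above might be sketched as follows. This is a minimal model, not the mechanism of the cited patent: the class name `OwnedBlocks`, the label `MAIN_MEMORY`, and the rule that a writer simply takes ownership before writing are all assumptions introduced for the example.

```python
# Hypothetical sketch of per-block ownership; names and the transfer rule
# are assumptions for illustration, not taken from the cited patent.
MAIN_MEMORY = "main_memory"


class OwnedBlocks:
    def __init__(self, num_blocks):
        # Every block starts out owned by main memory.
        self.owner = {b: MAIN_MEMORY for b in range(num_blocks)}
        # Copies are keyed by (holder, block); main memory holds zeros initially.
        self.copies = {(MAIN_MEMORY, b): 0 for b in range(num_blocks)}

    def read(self, block):
        # The current owner is taken to hold the correct data for the block.
        return self.copies[(self.owner[block], block)]

    def write(self, who, block, value):
        # Assumed rule: the writer becomes the owner, then updates its copy.
        self.owner[block] = who
        self.copies[(who, block)] = value


store = OwnedBlocks(num_blocks=2)
store.write("cpu0", 1, 42)
print(store.owner[1], store.read(1))
```

In this sketch the copy held by main memory becomes stale once a processor takes ownership, which is why reads are always directed to the current owner.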
A hierarchical approach is disclosed by Wilson Jr. et al, United Kingdom Patent Application No. 2,178,205, wherein a multiprocessing system is said to include distributed cache memory elements coupled with one another over a first bus. A second, higher level cache memory, attached to the first bus and to either a still higher level cache or to the main system memory, retains copies of every memory location in the caches below it. The still higher level caches, if any, and system main memory, in turn, retain copies of each memory location of cache below them. The Wilson Jr. et al processors are understood to transmit modified copies of data from their own dedicated caches to associated higher level caches and to the system main memory, while concurrently signalling other caches to invalidate their own copies of that newly-modified data.
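The behavior attributed to this hierarchical arrangement can be approximated, for illustration, by the sketch below. It assumes a two-level hierarchy with one private cache per processor, one shared higher level cache, and main memory; the class name `WriteThroughHierarchy` and its attribute names are hypothetical, and the model ignores buses, timing, and cache capacity.

```python
# Hypothetical two-level model: per-processor caches, one higher level
# cache, and main memory. All names are assumptions for illustration.
class WriteThroughHierarchy:
    def __init__(self, num_processors):
        self.l1 = [dict() for _ in range(num_processors)]  # dedicated caches
        self.l2 = {}    # higher level cache, holds every location in the l1 caches
        self.main = {}  # main memory, holds every location in l2

    def read(self, cpu, addr):
        if addr not in self.l1[cpu]:
            # Miss: fetch from the higher level cache, else from main memory.
            value = self.l2.get(addr, self.main.get(addr, 0))
            self.main.setdefault(addr, value)
            self.l2[addr] = value
            self.l1[cpu][addr] = value
        return self.l1[cpu][addr]

    def write(self, cpu, addr, value):
        # Modified data is propagated upward to every level...
        self.l1[cpu][addr] = value
        self.l2[addr] = value
        self.main[addr] = value
        # ...while the other dedicated caches invalidate their copies.
        for other, cache in enumerate(self.l1):
            if other != cpu:
                cache.pop(addr, None)


h = WriteThroughHierarchy(num_processors=2)
h.read(1, 5)        # processor 1 caches location 5
h.write(0, 5, 7)    # processor 0 modifies it; processor 1's copy is invalidated
print(5 in h.l1[1], h.main[5])
```

Note that every write in this model reaches main memory, which mirrors the update requirement discussed in the following paragraph.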
Notwithstanding the solutions proposed by Frank et al and Wilson Jr. et al, data coherency and bus contention remain significant problems facing both designers and users of multiprocessing systems. With respect to Wilson Jr. et al, for example, these problems may be attributed, at least in part, to the requirement that data in main memory must always be updated to reflect permanent modifications introduced to the data elements by each of the processors in the system. Moreover, neither of the proposed designs is capable of supporting more than a limited number of processing units. This restriction in "scalability" arises from a requirement of both the Wilson Jr. et al and Frank et al systems that the size of main memory must increase to accommodate each additional processor.
It is therefore an object of this invention to provide an improved multiprocessing system with improved data coherency, as well as reduced latency and bus contention. A further object is to provide a multiprocessing system with unlimited scalability.
Other objects of the invention are to provide a physically distributed memory multiprocessing system which requires little or no software overhead to maintain data coherency, as well as to provide a multiprocessing system with increased bus bandwidth and improved synchronization.