The use of shared memory by multiple processors, e.g., CPUs, is well known in high-speed, parallel-processor computing systems. Such shared memory facilitates parallel processing by giving each CPU or processor access to a common memory, so as to expedite various processor-intensive computer functions such as Fourier analysis, digital filtering algorithms, machine vision and three-dimensional graphics. A contemporary shared memory allows each CPU or processor to access those memory elements it requires to carry out its program instructions and data manipulations.
Such shared memories may also be configured as distributed memories, wherein the data stored therein is divided among a plurality of memory banks, both to enable parallel access thereto and to provide a fail-safe structure for the memory device. As those skilled in the art will appreciate, distributing the data across multiple memory banks allows the data to be accessed more rapidly. That is, more of the data may be simultaneously accessed by a particular CPU or processor.
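By way of illustration only, the division of data among memory banks may be sketched as follows. The sketch assumes low-order interleaving, in which consecutive word addresses fall in consecutive banks; the text above does not specify any particular mapping, and the names `bank_and_offset` and `distribute` are hypothetical.

```python
def bank_and_offset(address: int, num_banks: int) -> tuple[int, int]:
    """Map a flat word address to (bank index, offset within that bank).

    Assumption: low-order interleaving. Other mappings are equally possible.
    """
    if num_banks <= 0:
        raise ValueError("num_banks must be positive")
    if address < 0:
        raise ValueError("address must be non-negative")
    return address % num_banks, address // num_banks


def distribute(words: list[int], num_banks: int) -> list[list[int]]:
    """Spread a flat list of words across num_banks banks."""
    banks: list[list[int]] = [[] for _ in range(num_banks)]
    for address, word in enumerate(words):
        bank, _offset = bank_and_offset(address, num_banks)
        # Addresses are visited in increasing order, so each append
        # lands at exactly the offset computed above.
        banks[bank].append(word)
    return banks
```

Under this assumed mapping, any run of `num_banks` consecutive addresses touches every bank exactly once, so those words could in principle be fetched in a single parallel access.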
A fail-safe structure for the memory device is provided since only a portion of the data is stored in any particular memory bank. Failure of a particular memory bank thus results in the loss of only a portion of the data, which may typically be recovered utilizing contemporary error detection and correction methodology.
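The recovery of a lost portion of the data may be illustrated with a minimal sketch. It assumes a single XOR parity bank maintained alongside equal-length data banks, which is one common error-correction arrangement; the text above does not name a specific methodology, and `parity_of` and `recover_bank` are hypothetical helpers.

```python
from functools import reduce


def parity_of(banks: list[list[int]]) -> list[int]:
    """Column-wise XOR across equal-length banks (assumed parity scheme)."""
    return [reduce(lambda a, b: a ^ b, column) for column in zip(*banks)]


def recover_bank(banks: list[list[int]], parity: list[int], failed: int) -> list[int]:
    """Rebuild the contents of one failed bank from the survivors and parity.

    XOR-ing the parity with every surviving bank cancels their
    contributions and leaves only the missing bank's words.
    """
    survivors = [bank for index, bank in enumerate(banks) if index != failed]
    return parity_of(survivors + [parity])
```

This arrangement tolerates the loss of one bank at a time; the simultaneous loss of two banks would require a stronger code than the one assumed here.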
Although such contemporary shared memories have proven generally suitable for their intended purposes, they possess inherent deficiencies which detract from their overall performance. Most important among these inherent deficiencies is the inability of such contemporary shared memories to provide a plurality of CPUs or processors with simultaneous access to the data stored therein. According to contemporary methodology, when one particular CPU or processor is accessing data from the shared memories, access to the shared memories by all other CPUs or processors is temporarily blocked. The other CPUs or processors must wait until the memory read cycle is complete before they can access the shared data. As those skilled in the art will appreciate, the blocking of access to the shared memory has a substantial adverse impact upon the computer's performance. Indeed, it has been estimated that computational efficiency is reduced to approximately 10 to 20 percent of its theoretical maximum value due to such memory access blocking.
The reduction in computational efficiency is more particularly attributable to three primary causes: first, a transfer of data from the global memory to any of the processing elements must wait until the bus is free to accept the transfer; second, the memory address register must complete its current read/write cycle before it can process the next addressing request; and third, only a single CPU or processor at a time can initiate an addressing request in the prior art sequential mode of operation.
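The effect of such serialized access may be illustrated with a deliberately simplified model. It assumes that every processor is purely memory-bound and that each access occupies the shared memory for one fixed cycle; neither assumption comes from the text above, all function names are hypothetical, and the model is not offered as the basis of any efficiency estimate.

```python
def blocking_cycles(num_processors: int, accesses_each: int, cycle_time: int = 1) -> int:
    """Total time when only one processor may access memory at a time.

    Every access by every processor is served strictly in sequence.
    """
    return num_processors * accesses_each * cycle_time


def non_blocking_cycles(num_processors: int, accesses_each: int, cycle_time: int = 1) -> int:
    """Total time in the idealized case where all processors proceed in parallel."""
    return accesses_each * cycle_time


def relative_efficiency(num_processors: int, accesses_each: int) -> float:
    """Ratio of idealized parallel time to serialized time (toy model only)."""
    return non_blocking_cycles(num_processors, accesses_each) / blocking_cycles(
        num_processors, accesses_each
    )
```

In this toy model the ratio is simply the reciprocal of the number of processors, so each added processor contributes waiting time rather than throughput. Real workloads interleave computation with memory access and would behave less severely.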
Various attempts have been made in the prior art to improve memory access. One such attempt is disclosed in U.S. Pat. No. 5,134,695, issued on Jul. 28, 1992 to Ikeda and entitled METHOD AND APPARATUS FOR CONSTANT STRIDE ACCESSING TO MEMORIES IN VECTOR PROCESSOR. The Ikeda patent discloses a method for improving access to a plurality of reference memory banks, thereby enhancing memory access efficiency. However, Ikeda does not address simultaneous access to shared memories by multiple processors, or the implementation of such access.
In view of the foregoing, it would be beneficial to provide an implementation of a distributed memory addressing system wherein non-blocking access to shared memory is facilitated for multiple CPUs or processors.