1. Field of the Invention
The present invention relates to a memory control for a large-scale parallel processing computer system in which processors are connected in parallel to thereby process a large amount of computer operations. In particular, the present invention relates to a device for controlling a memory data path in a parallel processing computer for effectively processing a memory access request performed through an interconnection network interface and another memory access request performed through a common bus in a processor, even with a single-port memory device.
2. Description of the Prior Art
Generally, a parallel processing computer system is constructed so that tens or hundreds of processors are connected through interconnection networks, allowing each processor to perform its particular function and allowing operations that require a large amount of processing to be divided among several processors for fast performance.
Such a parallel processing computer system is capable of performing a large amount of data processing in parallel so that its performance speed becomes fast enough to implement a very complicated function such as artificial intelligence.
In parallel processing systems, the common memory mode widely used in the prior art is easy to program because it uses a single address space.
However, in the common memory mode, a large number of processors share a common memory, so that as the number of processors increases, memory accesses become concentrated and the system's performance decreases. Because of this defect, the common memory mode is limited in how far a system can be expanded and in its applicability to large-scale systems.
For this reason, many methods have been proposed that overcome the limitations of the conventional common memory mode and are suitable for constructing a large-scale system. Among those methods is a message passing mode.
In the message passing mode, a message is transmitted via an interconnection network in order to perform communication and synchronization between processors. Referring to FIG. 1, which shows a configuration to which such a message passing mode is applicable, a predetermined number of crossbars 120, 220, 320, 420, 530, 630 and 700 are connected in a hierarchical architecture (for instance, a pyramid structure).
In this hierarchical architecture, connected to each crossbar at the lowest level are tens or hundreds of processors P (in this example, only four processors 4a, 4b, 4c and 4d are shown), a system bus 5 for transmitting memory access requests from the processors, and a predetermined number of processing nodes (PN) 110a, . . . 110n having a memory 8 to or from which data is input or output according to the access request transmitted via system bus 5. Here, the connected crossbars as a whole are called interconnection network 1000.
A structure in which a plurality of processing nodes are connected to one crossbar placed at the lowest level is called a cluster, to emphasize the concept of a group of nodes. The system is constructed with cluster-0 (100), cluster-1 (200), cluster-2 (300), and cluster-3 (400), as shown in FIG. 1. In other words, this structure illustrates that the plurality of processing nodes including tens or hundreds of processors are connected via the interconnection network in units of clusters. As shown in FIG. 1, clusters using 4×4 crossbars are connected hierarchically and become scalable.
For one processing node to communicate with another processing node in the same cluster, only one crossbar needs to be traversed. For a message to be transmitted to another cluster, at least two steps must be traversed.
Even though the common memory mode uses a single address and is thus easy to program, the message passing mode is primarily used because it improves the system's performance and incurs a lower cost for maintaining synchronization than the conventional common memory mode. In addition, the message passing mode is relatively advantageous in scalability.
Recently proposed processors provide functions that allow a multi-processor system to be implemented easily even with simplified hardware. In order to support a large computer processing capability, several processors can be connected on one processor board. Among the processors forming one processing node, the common memory mode of a symmetric multi-processor (SMP) is used.
For the connection between processing nodes, the interconnection network mode is used, and for data communication inside a processing node, the common memory mode is used. This is because, in a large-scale parallel processing system using the message passing mode, the message transmission rate and the delay across the interconnection network decrease the system's performance. In other words, the combination is a means for improving transmission rate and delay time. Here, in order to apply the common memory mode within the processing node or processing element, a plurality of processors and memory are incorporated, and a network interface for connecting to the interconnection network is provided.
In this structure, the memory needs to have a dual port because the internal memory of the processing node is accessed both by the processors of that node and, through the network interface, by the processors of other processing nodes connected to the network.
Memory used in the common bus mode also causes a bottleneck on the bus when several processors request the memory at the same time: the processors interfere with one another and must retry their bus accesses, which decreases the system's performance. To prevent memory accesses from concentrating, interleaving is most widely used, and it can be divided into low-order interleaving and high-order interleaving. For an SMP in the common bus mode, high-order interleaving is most widely used.
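The difference between the two interleaving schemes can be illustrated with a minimal sketch. The function names, the bank count, and the bank size below are assumptions chosen for illustration and do not come from the described system. Low-order interleaving selects the bank from the low-order part of the address, so consecutive addresses fall in different banks; high-order interleaving selects the bank from the high-order part, so each bank holds one contiguous address range.

```python
NUM_BANKS = 4   # assumed number of memory banks
BANK_SIZE = 8   # assumed number of words per bank


def low_order_bank(addr: int, num_banks: int = NUM_BANKS) -> int:
    """Low-order interleaving: the low-order address bits select the bank."""
    return addr % num_banks


def high_order_bank(addr: int, bank_size: int = BANK_SIZE) -> int:
    """High-order interleaving: the high-order address bits select the bank."""
    return addr // bank_size


# Bank selected for the first eight consecutive word addresses.
consecutive = list(range(8))
low_banks = [low_order_bank(a) for a in consecutive]    # rotates through all banks
high_banks = [high_order_bank(a) for a in consecutive]  # stays within one bank

print("low-order :", low_banks)
print("high-order:", high_banks)
```

Under these assumptions, processors working in different address ranges are directed to different banks by high-order interleaving, while a single stream of consecutive addresses is spread over all banks by low-order interleaving.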
However, in the SMP structure in which several processors are connected, several processors may access the same memory at the same time even when memory concentration is prevented by high-order interleaving. In this case, even while one processor is accessing a given memory, other processors can still issue requests to that memory because of the characteristics of the pended protocol bus used as the system bus. The memory cannot accept these requests while it is busy, so the memory accesses are retried. These retries waste bus bandwidth and aggravate the bus bottleneck.
If a multi-stage queue is placed at the input stage of the memory, memory requests arriving while a previous memory request is in progress are stored sequentially in the multi-stage queue, so that the memory appears to accept the memory requests sequentially.
A memory coupled to such a multi-stage queue allows one processor to attempt a memory access while another processor is accessing the same memory and, in addition, compensates for the memory access time, which is slow relative to the speed of the processors.
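The effect of such an input queue can be sketched with a small cycle-by-cycle model. This is an illustrative simplification, not the device described here: the function `simulate`, the fixed access time, the requester labels, and the rule that a rejected request is reissued on the next cycle are all assumptions.

```python
from collections import deque


def simulate(arrivals, access_time, queue_depth):
    """Model a single-port memory with an input queue of the given depth.

    arrivals: list of (cycle, requester) pairs with non-negative cycles.
    access_time: cycles the memory stays busy per request (assumed constant).
    Returns (order in which requests were served, number of retries).
    """
    waiting = sorted(arrivals)   # requests not yet accepted by the memory
    queue = deque()              # requests accepted but not yet started
    served, retries = [], 0
    busy_until = 0               # first cycle at which the memory is free
    cycle = 0
    while waiting or queue:
        # When the memory is free, start the oldest queued request.
        if cycle >= busy_until and queue:
            served.append(queue.popleft())
            busy_until = cycle + access_time
        issued = [r for r in waiting if r[0] == cycle]
        waiting = [r for r in waiting if r[0] != cycle]
        for _, who in issued:
            if cycle >= busy_until and not queue:
                served.append(who)                # accepted immediately
                busy_until = cycle + access_time
            elif len(queue) < queue_depth:
                queue.append(who)                 # held in the input queue
            else:
                retries += 1                      # rejected, reissued next cycle
                waiting.append((cycle + 1, who))
        cycle += 1
    return served, retries


requests = [(0, "P0"), (0, "P1"), (1, "NET")]
no_queue = simulate(requests, access_time=3, queue_depth=0)
with_queue = simulate(requests, access_time=3, queue_depth=2)
print("without queue:", no_queue)
print("with queue   :", with_queue)
```

In this model, the run without a queue produces repeated retries on the bus while the memory is busy, whereas a two-entry queue absorbs the same requests and serves them in arrival order with no retries.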
A memory device that reliably obtains such an effect is a dual port memory, which is applied in a variety of fields. Dual port memory has been developed extensively and is used in multi-stage systems and in fast image processing.
In other words, a dual port memory or multi-port memory, which simultaneously accepts memory access requests via the interconnection network interface and memory access requests via the processor bus in a parallel processing computer system, has been applied to multi-processor systems at or above the mainframe level, where it is appropriate even though its cost is relatively high. However, the cost of such a multi-port memory structure increases sharply as more processors and memory are provided.
Such a dual port memory is already in mass production; its memory cell has two ports, so that memory access requests are processed via the two ports. (see Rastegar, Bahador, A dual-port memory device, Patent Number: EP 0 523 997 A1, SGS-Thomson Microelectronics, Inc., 16.07.92)
The cost of such a memory increases sharply as its capacity becomes larger. Using this memory, a system can be constructed in which a message is transmitted between processors. (see Charles J. Roslund, Microprocessor Information Exchange with updating of Messages by Asynchronous Processors using assigned AND/OR available buffers in dual port memory, U.S. Pat. No.: 5,179,665, Westinghouse Electric Corp., Nov. 13, 1989)
In some cases, a single port memory may be used like a dual port memory by adding external control logic. (see James T. Koo, Controller For Dual Ported Memory, U.S. Pat. No.: 5,001,671, Vitelic Corporation, Jun. 27, 1989) However, such a memory structure does not have a separate queue for data and thus cannot perform sequential access to the memory. Accordingly, no performance improvement can be expected from this memory structure.
In short, apart from the problems described above, dual port memory devices are undesirably expensive compared with single port memory. As a result, the cost increase becomes significant when a system is built from a plurality of memories.