1. Field of the Invention
The present invention relates to a parallel processing system having a plurality of arithmetic processing units, and, more particularly, to a parallel processing system in which each of the plurality of arithmetic processing units is linked to storage units by networks.
2. Description of the Prior Art
Examples of such a system according to the prior art include a parallel processing system using crossbar switches for switching inputs to outputs according to destination addresses, described in the Gazette of PCT Patent No. WO 91/10183. In such a parallel processing system according to the prior art, every destination address is generated in advance when each packet is to be routed.
A parallel processing system to which the present invention is applicable, as illustrated in FIG. 1, has a configuration in which a plurality of arithmetic processing units 1100 to 1400 are linked to a plurality of storage units 4100 to 4400 by networks 2100 to 2400 and 3100 to 3400 or 5100 to 5400 and 6100 to 6400. Here, each arithmetic processing unit is assumed to have a plurality of vector processors, and each storage unit is assumed to have a plurality of memory modules.
A memory access request issued from a vector processor in an arithmetic processing unit is transferred by a first stage network and a second stage network to a memory module in a designated storage unit. If the access request is a read from the memory, the read data are transferred by a third stage network and a fourth stage network to the requesting vector processor.
Referring to FIG. 8, according to the prior art, the format of a packet 1901 to be sent from the arithmetic processing units 1100 to 1400 to the first stage networks includes a field indicating the type of memory access request that the packet concerns, such as a read from the memory, and control information for use at each network stage. Thus, the packet 1901 of the prior art includes, as network control information, control information for each of the first stage networks 2100 to 2400, the second stage networks 3100 to 3400, the third stage networks 5100 to 5400 and the fourth stage networks 6100 to 6400. Other fields, including the write data field and the address in the memory module, are omitted for convenience of illustration.
The format of a packet 2901 to be sent from the first stage networks 2100 to 2400 to the second stage networks 3100 to 3400 has the same fields as the packet 1901 except that it does not include control information for the first stage networks 2100 to 2400 as network control information.
Further, the format of a packet 3900 to be sent from the second stage networks 3100 to 3400 to the storage units 4100 to 4400 has the same fields as the packet 2901 except that it does not include control information for the second stage networks 3100 to 3400 as network control information.
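The request path described above can be sketched as follows. This is only an illustrative model, not the circuit of the prior art: the `Packet` class, the stage names `stage1` to `stage4`, the `make_request` and `forward` helpers, and the port numbers are all hypothetical. It shows how each stage consumes its own control information so that the packet leaving it no longer carries that field.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical stage labels; the text only speaks of first to fourth stage networks.
STAGES = ("stage1", "stage2", "stage3", "stage4")


@dataclass(frozen=True)
class Packet:
    request_type: str                      # e.g. "read" or "write"
    control: Tuple[Tuple[str, int], ...]   # (stage, output port), in traversal order


def make_request(request_type: str, ports: Tuple[int, int, int, int]) -> Packet:
    """Build a packet like 1901: control information for all four stages."""
    return Packet(request_type, tuple(zip(STAGES, ports)))


def forward(packet: Packet) -> Tuple[int, Packet]:
    """Use the leading control entry to pick an output port, then drop it."""
    (_stage, port), rest = packet.control[0], packet.control[1:]
    return port, Packet(packet.request_type, rest)


# Assumed port values, chosen only for the example.
p1901 = make_request("read", (2, 0, 1, 3))
port1, p2901 = forward(p1901)   # first stage network
port2, p3900 = forward(p2901)   # second stage network
```

In this model `p2901` carries three control entries and `p3900` carries two (those for the third and fourth stages), mirroring the packet formats described in the text.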
After the storage units 4100 to 4400 receive the memory access request, they write the data in the write data field into the prescribed address in the memory module if the request is a write into the memory. If, on the other hand, the request is a read from the memory, they return the read data to the arithmetic processing unit as reply data. A packet 4900 sent to the third stage networks 5100 to 5400 for this read data return includes control information for each of the third stage networks 5100 to 5400 and the fourth stage networks 6100 to 6400 as network control information. Further, a packet 5900 to be sent from the third stage networks 5100 to 5400 to the fourth stage networks 6100 to 6400 includes control information for the fourth stage networks 6100 to 6400 as network control information.
Then, the fourth stage networks return the reply data to the vector processor which issued the memory read request.
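The storage-unit behavior and the reply path can be sketched in the same spirit. The dictionary layout, the `handle_request` and `route_reply` helpers, and the address, data and port values are assumptions made for illustration; the text itself does not specify field encodings.

```python
def handle_request(memory, packet):
    """Storage-unit side: perform a write, or read and build a reply like packet 4900."""
    if packet["type"] == "write":
        memory[packet["address"]] = packet["data"]
        return None
    # Read: the reply carries the third- and fourth-stage control information
    # that arrived with the request (generated on the request issuing side).
    return {
        "type": "reply",
        "data": memory[packet["address"]],
        "control": list(packet["control"]),
    }


def route_reply(reply):
    """Pass a reply through the third and fourth stages, one control entry per stage."""
    ports = []
    while reply["control"]:
        ports.append(reply["control"][0])
        reply = {**reply, "control": reply["control"][1:]}
    return ports, reply


memory = {}
# Hypothetical requests as they would arrive at the storage unit (like packet 3900).
handle_request(memory, {"type": "write", "address": 16, "data": 99, "control": [1, 3]})
p4900 = handle_request(memory, {"type": "read", "address": 16, "control": [1, 3]})
ports, delivered = route_reply(p4900)
```

Here `ports` lists the output ports selected at the third and fourth stages, and `delivered` is the reply with no control information left, as it would reach the requesting vector processor in this model.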
Thus, in the parallel processing system according to the prior art, all the network control information for use when reply data are to be returned from the storage unit, i.e. the data returning side, is generated by the vector processor, i.e. on the request issuing side. Furthermore, because the prior art sends network control information added to packets, the interfaces between the networks are widened and the number of flip-flops needed to hold control information increases, resulting in greater complexity of network control.
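The overhead can be made concrete by counting the control fields that each packet named above carries. The counts follow from the packet formats described in the text; the width `BITS_PER_FIELD` is a purely hypothetical figure, since the text gives no bit widths, and the `header_bits` helper is likewise introduced only for illustration.

```python
ALL_STAGES = ["stage1", "stage2", "stage3", "stage4"]

# Control fields carried by each packet, per the formats described in the text.
carried = {
    "1901": ALL_STAGES[0:],  # processing unit -> first stage
    "2901": ALL_STAGES[1:],  # first stage -> second stage
    "3900": ALL_STAGES[2:],  # second stage -> storage unit
    "4900": ALL_STAGES[2:],  # storage unit -> third stage
    "5900": ALL_STAGES[3:],  # third stage -> fourth stage
}

BITS_PER_FIELD = 4  # hypothetical width of one stage's control field


def header_bits(packet_id, bits_per_field=BITS_PER_FIELD):
    """Control-information bits that the interface carrying this packet must hold."""
    return len(carried[packet_id]) * bits_per_field


total_fields = sum(len(fields) for fields in carried.values())
```

Under these assumptions the five interfaces together carry `total_fields` control fields, and every one of them must be latched at the receiving side, which is the source of the additional flip-flops noted above.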