The present invention relates to a method for data transmission between processors in a parallel processor intended for high-speed calculation and, more particularly, to a MIMD-type parallel processor having distributed memories.
Broadly, three types of techniques have been used to achieve high-speed operation with a plurality of processors.
The first type of technique structures a parallel processor from at least dozens of processors to achieve a very large improvement in performance over a computer using only one processor. This type of technique presupposes a large number of processors. It is therefore important to keep each processor small, which has limited the function of each processor as compared with that of a general purpose computer. For example, the compact processor used in the first type of technique omits the address translation mechanism for realizing a virtual storage. There have been parallel processors using a large number of processors in which a plurality of processes can be executed by one processor, as disclosed, for example, in JP-A-62-274451, or the corresponding EPC patent application publication No. 255,857 (application No. 87 107 576.8 (filed on May 22, 1987)) or U.S. patent application Ser. No. 07/379,230 (filed on Jul. 13, 1989, and now issued as U.S. Pat. No. 5,301,322 on Apr. 5, 1994), which followed the corresponding U.S. patent application Ser. No. 07/52,871 (filed on May 22, 1987), now abandoned. However, none of the processors according to these techniques is equipped with a function for realizing a virtual storage.
On the other hand, the second type of technique is a parallel computer comprising a plurality of processing elements, each having a local memory, in which data can be written into a local memory from another processing element. When a certain processing element transmits data using the local memory of another processing element, tags are provided for some or all of the words in the local memory, and these tags indicate whether the content of each word is valid or invalid. This type of device is discussed in, for example, JP-A-H1-194055 or the corresponding EPC patent application publication No. 326,164 (application No. 89 101 462.3 (filed on Dec. 7, 1990)) or the corresponding U.S. patent application No. 07/303,626 (filed on Jan. 27, 1989, now issued as U.S. Pat. No. 5,297,255 on Mar. 22, 1994).
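The tagged local memory described above can be pictured with the following minimal sketch. It is an illustration only and is not taken from the cited publications: the class name `TaggedLocalMemory`, its method names, and the choice of one tag per word are assumptions made for this example.

```python
class TaggedLocalMemory:
    """Hypothetical model of one processing element's local memory.

    Assumption: every word carries its own valid/invalid tag.
    """

    def __init__(self, size):
        self.words = [0] * size
        self.valid = [False] * size  # one tag per word, initially invalid

    def remote_write(self, addr, value):
        # Performed on behalf of another processing element:
        # store the word, then mark it valid.
        self.words[addr] = value
        self.valid[addr] = True

    def try_read(self, addr):
        # Returns (ok, value); ok is False while the tag is still invalid.
        if not self.valid[addr]:
            return False, None
        return True, self.words[addr]

    def invalidate(self, addr):
        # Mark the word invalid again so that it can be reused.
        self.valid[addr] = False


memory = TaggedLocalMemory(4)
before = memory.try_read(2)   # nothing has been written yet
memory.remote_write(2, 99)    # another processing element writes the word
after = memory.try_read(2)
```

In this sketch the reading processing element tests the tag before using a word, so it can tell whether the data from the other processing element has arrived.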
The third type of technique is a so-called distributed data processing technique, in which a few general purpose computers are connected by a local area network or the like. For example, distributed data processing on workstations running the UNIX operating system, developed and currently licensed by UNIX System Laboratories Ltd., corresponds to this third type of technique. This type of technique presupposes the use of general purpose computers. Each processor is structured as a general purpose computer, to which an adapter for communication is added. Each processor is loaded with a general purpose operating system, and the communication adapter is handled as one of the resources, including disk input/output units, that are managed by the operating system. Therefore, communications between the processors are carried out through system calls. In other words, when a process executing a program prepared by the user is to transmit data to another process, it must call a program of the operating system and have that program carried out.
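As an illustration of communication through system calls, the following minimal sketch passes data between two connected sockets using Python's standard `socket` module. The function name `send_via_system_calls` is hypothetical, and `socket.socketpair()` merely stands in for two processes joined by communication adapters. Each `sendall`, `shutdown` and `recv` call enters the operating system.

```python
import socket


def send_via_system_calls(payload: bytes) -> bytes:
    """Send a small payload through the operating system and read it back.

    Assumption: the payload fits in the socket buffer, so sendall() does not
    block even though sender and receiver share one thread in this sketch.
    """
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)          # system call: data is handed to the OS
        sender.shutdown(socket.SHUT_WR)  # system call: mark the end of the data
        chunks = []
        while True:
            chunk = receiver.recv(4096)  # system call: OS returns buffered data
            if not chunk:                # empty result means the sender is done
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()


received = send_via_system_calls(b"hello")
```

The user program never touches the communication hardware directly in this sketch; every transfer is requested from the operating system.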
FIG. 30 shows the operation of data transmission according to the third type of prior art technique, as described in, for example, S. L. Leffler et al., "The Design and Implementation of the 4.3 BSD UNIX Operating System", Addison-Wesley Publishing Company, pp. 384-386. First, the transmitting process executes the system call for data transmission. The operating system, having received the system call, copies the data to be transmitted, together with parameters relevant to the destination process, to a buffer in the operating system. The operating system then checks the parameters, sets the parameters and the data from the buffer area in the operating system into the transmission circuit, and commands the transmission circuit to transmit the data.
When the transmission circuit has transmitted the data and the data has arrived at the receiving circuit of the receiving processor, the receiving circuit stores the parameters and the data in the buffer inside the receiving circuit and notifies the operating system in the receiving processor. Upon receiving the notice, the operating system in the receiving processor copies the parameters and the data from the buffer inside the receiving circuit to the buffer inside the operating system. The operating system then inspects the parameters and the data. If the inspection finds no problem, the operating system transmits an acknowledge signal (ACK) to the transmission originating processor. The processor that receives the ACK releases the buffer area inside its operating system that was secured when the transmission of the acknowledged parameters and data was requested.
When the process that is to receive the data executes a system call for receiving the data, the operating system in the receiving processor checks whether all the necessary data has already arrived. If it has, the operating system copies the data from the area inside the operating system which stores the data to the area inside the process which executed the system call. If the data has not yet arrived, the operating system waits for the data to arrive, and then executes the above receiving operation.
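The transmission and reception sequence described above can be summarized by the following simplified, single-threaded sketch. It is an illustrative model under stated assumptions and not the implementation of any actual operating system: the class `Node`, its methods, the sequence number used to match an ACK to a buffer, and the `copies` counter are all hypothetical. The counter records only the copies that the description explicitly attributes to the operating system.

```python
from collections import deque


class Node:
    """Hypothetical model of one processor and its operating system."""

    def __init__(self):
        self.os_send_buffers = {}         # seq -> (params, data), kept until ACK
        self.rx_circuit_buffer = deque()  # buffer inside the receiving circuit
        self.os_recv_buffer = deque()     # buffer inside the operating system
        self.copies = 0                   # data copies made by this node's OS
        self.next_seq = 0

    def sys_send(self, dest, dest_process, user_data):
        # System call for data transmission.
        seq = self.next_seq
        self.next_seq += 1
        params = {"src": self, "dest_process": dest_process, "seq": seq}
        os_copy = bytearray(user_data)    # copy: user area -> OS buffer
        self.copies += 1
        self.os_send_buffers[seq] = (params, os_copy)
        if not isinstance(dest_process, int):   # OS checks the parameters
            raise ValueError("invalid destination process")
        # The transmission circuit delivers parameters and data to the
        # receiving circuit of the destination (modelled as an append).
        dest.rx_circuit_buffer.append((params, bytes(os_copy)))

    def on_arrival(self):
        # Notice from the receiving circuit to the operating system.
        params, data = self.rx_circuit_buffer.popleft()
        os_copy = bytearray(data)         # copy: circuit buffer -> OS buffer
        self.copies += 1
        self.os_recv_buffer.append((params, os_copy))
        params["src"].on_ack(params["seq"])   # inspection passed: send ACK

    def on_ack(self, seq):
        # Release the OS buffer secured when the transmission was requested.
        del self.os_send_buffers[seq]

    def sys_recv(self):
        # System call for receiving data. A real OS would make the process
        # wait; this sketch returns None if nothing has arrived yet.
        if not self.os_recv_buffer:
            return None
        _params, data = self.os_recv_buffer.popleft()
        self.copies += 1                  # copy: OS buffer -> process area
        return bytearray(data)


a, b = Node(), Node()
a.sys_send(b, 7, b"hello")
early = b.sys_recv()        # data still sits in the receiving circuit
b.on_arrival()
message = b.sys_recv()
```

In this model a single message costs one operating system copy on the transmitting side and two on the receiving side, in addition to the system calls and the ACK exchange.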