In a parallel processor system having a plurality of processor elements (PEs), each having a processor and a local memory, it is known that access to the local memory by the processor can be sped up by providing each PE with a cache memory for its local memory. Examples of such systems are disclosed in Japanese Patent Laid-Open Nos. 150659/1992 and 168860/1992. These known examples use the cache memory not only to speed up access to the local memory by the processor, but also to increase the operation speed when transferring data in the local memory to other PEs. That is, when data in the local memory is to be sent to other PEs and that data is held in the cache memory, the PE on the sending side reads the data from the cache memory and sends it onto the network.
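The sending-side behavior described above can be sketched as follows. This is only an illustrative model, not the implementation of either cited publication: the class `ProcessorElement`, the method `read_for_send`, and the dictionary-based cache and local memory are all hypothetical names introduced for this example.

```python
# Hypothetical model of one processor element (PE); all names are illustrative.
class ProcessorElement:
    def __init__(self, local_memory):
        self.local_memory = dict(local_memory)  # address -> value
        self.cache = {}                         # address -> value (copy of some local memory data)

    def read_for_send(self, address):
        """Return (value, source) for data about to be put on the network."""
        if address in self.cache:
            # Data is held in the cache: send it from there, avoiding a local memory access.
            return self.cache[address], "cache"
        # Data is not cached: it has to be read from local memory.
        return self.local_memory[address], "local_memory"


pe = ProcessorElement({0x10: 111, 0x20: 222})
pe.cache[0x10] = 111  # assume address 0x10 was cached by an earlier processor access

print(pe.read_for_send(0x10))  # cached data is taken from the cache
print(pe.read_for_send(0x20))  # uncached data is taken from local memory
```

The sketch returns the data source alongside the value only to make the two paths visible; a real PE would simply forward the data to the network interface.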
Japanese Patent Laid-Open No. 150659/1992 contains no description of the processing of received data. Denshi Joho Tsushin Gakkai Ronbunshi (Electronic Information Communications Society, Collection of Papers) D-1 Vol. J75-D-1 No. 8, 637-645 (hereinafter referred to simply as the Denshi Paper) discloses that received data is written into local memory; its disclosure of the processing of data to be sent to a destination cluster is the same as that of Japanese Patent Laid-Open No. 150659/1992. In Japanese Patent Laid-Open No. 168860/1992, when external data is received and is to be written into local memory, it is first checked whether the cache memory retains the previous data, that is, the data to be overwritten. If the previous data is contained in the cache memory, it is also overwritten with the received data so that the processor can obtain the received data from the cache memory. When the cache memory does not contain the previous data, the received data is written into the local memory.
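The receive handling attributed to Japanese Patent Laid-Open No. 168860/1992 can be sketched roughly as below. The function name `receive` and the dictionary-based cache and local memory are hypothetical. The sketch also assumes that the local memory is written in both cases, which is how the word "also" in the description is read here; the publication itself may differ.

```python
def receive(cache, local_memory, address, value):
    """Apply the receive policy; return where the processor will next find the data."""
    # Assumption: local memory is always written with the received data.
    local_memory[address] = value
    if address in cache:
        # The cache still holds the previous data: overwrite it as well,
        # so the processor can read the received data from the cache.
        cache[address] = value
        return "cache"
    # The cache holds no previous data: only local memory has the received data.
    return "local_memory"


local_memory = {0x10: 1, 0x20: 2}
cache = {0x10: 1}  # only address 0x10 is currently cached

hit_case = receive(cache, local_memory, 0x10, 100)
miss_case = receive(cache, local_memory, 0x20, 200)
print(hit_case, miss_case)
```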
In Japanese Patent Laid-Open No. 150659/1992, when data is to be sent to a destination cluster, the data present in the cache memory of the processor is put directly onto the network. However, the publication contains no specific disclosure on speeding up the receive processing. In the Denshi Paper, the received data is stored in local memory and, when the processor references the data, the processor reads it from the local memory. Since received data is always referenced by the processor, the overhead of referencing it in local memory is very likely to degrade the performance of the parallel processor system.
In Japanese Patent Laid-Open No. 168860/1992, on the other hand, since the data to be sent is referenced from the local memory, the data sending overhead necessarily increases. As to the receive processing, when the cache memory of the processor contains the data to be overwritten with the received data, the cache memory is updated so as to keep the overhead of data reference by the processor small.
However, when the data to be overwritten has already been replaced in the cache memory by other data, so that the data to be updated is no longer contained in the cache memory, the received data is first written into the local memory. The processor then has to reference the data in the local memory. Since the algorithm generally used for cache replacement is a least recently used (LRU) algorithm, there is an increased chance of such a cache replacement occurring. This in turn increases the overhead of referencing received data and degrades the performance of the parallel processor system.
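The problem described above can be illustrated with a toy simulation. It is a minimal sketch under stated assumptions: a two-line, fully associative cache with LRU replacement, and the hypothetical names `LRUCache`, `access`, and `receive`. None of these details come from the cited publications.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with LRU replacement (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value, oldest entry first

    def access(self, address, local_memory):
        """Processor reference: True on a hit, False on a miss (line is then filled)."""
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used line
        self.lines[address] = local_memory[address]
        return False

    def receive(self, address, value, local_memory):
        """Received data: update the cache only if the line is still present."""
        local_memory[address] = value
        if address in self.lines:
            self.lines[address] = value
            return True
        return False


local_memory = {"A": 0, "B": 0, "C": 0}
cache = LRUCache(capacity=2)

cache.access("A", local_memory)  # A is cached while the PE waits for new data for A
cache.access("B", local_memory)
cache.access("C", local_memory)  # cache is full: A, the least recently used line, is evicted

updated_in_cache = cache.receive("A", 99, local_memory)  # A is gone, only local memory is written
hit_after_receive = cache.access("A", local_memory)      # processor must go to local memory
print(updated_in_cache, hit_after_receive)
```

In this run the line for `A` is evicted by unrelated accesses before the data arrives, so the received value reaches only the local memory and the processor's next reference to `A` misses in the cache.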