This invention relates to a parallel computer having a plurality of processing elements, and more particularly to a parallel computer suitable for image processing.
Conventionally, two constructions of parallel computer are known: a local memory type and a shared memory type.
In the parallel computer of local memory type, a local memory is provided for each processing element. Therefore, each processing element can access its own local memory independently of the other processing elements. However, this type has the disadvantage that a processing element cannot directly access the local memory belonging to another processing element.
In the parallel computer of shared memory type, all of the processing elements share a memory. Therefore, each processing element can directly access the shared memory. However, this type has the disadvantage that memory access contention occurs between the processing elements and easily disturbs the parallel operation.
To solve the above problems, another type of parallel computer has been proposed in which each processing element has a local cache memory and all of the processing elements share a main memory. In still another type, information is shared between processing elements by use of crossbar switches. However, these types of parallel computers are complicated in construction: the amount of hardware increases and the control operation becomes difficult.
Image processing is one of the application fields of the parallel computer.
For example, where image processing is performed by a parallel computer whose processing elements are connected in a matrix form, one conceivable method assigns portions of an image to the respective processing elements and causes them to process their assigned partial images in parallel, thereby increasing the speed of the image processing operation. Because most memory accesses in image processing are localized to relatively nearby memory areas, parallel computation is considered effective for attaining a high processing speed.
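The assignment of partial images to processing elements connected in a matrix form can be illustrated by a minimal sketch. The function name `partition`, the tile tuple layout `(y0, y1, x0, x1)`, and the even-split rule are assumptions made for illustration only and are not taken from any particular conventional machine.

```python
def partition(height, width, rows, cols):
    """Split a height x width image into rows x cols rectangular tiles.

    Returns a dict mapping the (hypothetical) grid position (r, c) of a
    processing element to its tile (y0, y1, x0, x1), with y1 and x1 exclusive.
    """
    tiles = {}
    for r in range(rows):
        # Integer arithmetic spreads any remainder rows across the grid.
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            tiles[(r, c)] = (y0, y1, x0, x1)
    return tiles


# Example: an 8 x 8 image divided among a 2 x 2 matrix of processing elements.
tiles = partition(8, 8, 2, 2)
```

Under this sketch each pixel belongs to exactly one tile, so every processing element can work on its own partial image without touching the others until a computation reaches a tile boundary.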
However, the conventional parallel computer of local memory type is insufficient for enhancing the image processing speed. The reason is that, when image processing such as a filtering process is performed, each processing element must use the partial image assigned to an adjacent processing element in the computation for the end portion (boundary) of the partial image assigned to itself. Since a processing element can access a memory belonging to an adjacent processing element only by communication between the elements via that adjacent processing element, the access speed becomes low.
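The boundary problem can be made concrete with a small sketch that counts, for one partial image, how many reads of a square filter window fall in a neighboring partial image. The function name `remote_reads`, the tile layout `(y0, y1, x0, x1)`, and the square window of a given `radius` are illustrative assumptions; window positions outside the image are simply skipped here.

```python
def remote_reads(tile, radius, height, width):
    """Count filter-window reads for one tile of a height x width image.

    tile is (y0, y1, x0, x1) with exclusive upper bounds. Returns
    (remote, total): reads that land in another tile, and all reads
    that land inside the image.
    """
    y0, y1, x0, x1 = tile
    remote = 0
    total = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # outside the image: no read is made
                    total += 1
                    if not (y0 <= ny < y1 and x0 <= nx < x1):
                        remote += 1  # pixel is held by an adjacent element
    return remote, total


# A 4 x 4 tile in the interior of a 12 x 12 image, 3 x 3 filter (radius 1).
remote, total = remote_reads((4, 8, 4, 8), 1, 12, 12)
```

In this example 44 of the 144 window reads refer to pixels held by adjacent processing elements, and in a local memory type machine each of those reads would have to go through inter-element communication.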
The shared memory type parallel computer is also insufficient. This is because memory accesses occur simultaneously and cause memory access contention, so that the parallel operation cannot be performed effectively and a practically high operation speed cannot be attained.
Further, the parallel computer using the cache memory is not effective since the image data size is large and the hit ratio is low. In addition, the parallel computer using the crossbar switch is not effective since the hardware becomes excessively complicated.
As described above, in the conventional parallel computers, memory access takes a long time in the local memory type, and memory access contention prevents a satisfactory parallel operation in the shared memory type. Further, the parallel computers using the cache memory or the crossbar switch have problems in the amount of hardware and in the control operation.