1. FIELD OF THE INVENTION
The present invention relates generally to a multi-processor system and a co-processor to be used for the same. More particularly, the invention is concerned with a high-speed data processing technique in which a plurality of processor elements or co-processors are employed for performing signal processing.
2. DESCRIPTION OF THE RELATED ART
For performing data processing with a multi-processor system, it is required to allocate processor (identification) numbers to a plurality of arrayed co-processors, respectively, in order to cause the co-processors to carry out respective tasks differing from one another or to allow communication to be carried out between the co-processors and a host computer.
As a means for allocating the processor numbers to processor elements, there can be mentioned, for example, a method disclosed in the Japanese publication "Nikkei Electronics" (Apr. 9, 1984), p. 206. According to this known method, each of the processor elements is equipped with an external buffer in which the relevant processor number is placed, and upon initialization of the system, a one-dimensional number is loaded from the associated buffer into an internal register incorporated in each processor element. More specifically, in the multi-processor system mentioned above, a plurality of image pipelined processors are interconnected in a ring-like configuration. Each processor element is provided with a module number register of four bits, so that sixteen different module numbers are available in total. Interconnection between the processor elements can be accomplished simply by connecting an output bus of the processor element at a preceding stage to an input bus of the processor element at the succeeding stage and additionally providing a four-bit register for holding the module number. The numerical value placed in the buffer is inherent to the associated processor and is transferred into the associated processor at the time of initialization of the multi-processor system.
FIG. 2 of the accompanying drawings shows an array diagram of processor elements (also referred to as co-processors) which cooperate to constitute a multi-processor system that was developed and examined by the inventors in the course of arriving at the present invention.
Referring to FIG. 2, a plurality of co-processors 111 to 119, each constituting a data processing unit, are interconnected in parallel in the form of an m × n array through a system bus 102 and placed under the control of a host processor 101. For convenience of description, it is assumed that two-dimensional image data processing is to be performed with this multi-processor system. In two-dimensional image data processing, filtering and other operations are performed on M × N picture elements (pixels). Accordingly, the data involved in the operations amounts to an enormous volume. Assuming, by way of example, that the filtering operation is to be performed on 2,000 × 2,000 picture elements by resorting to a neighborhood operation on every set of 3 × 3 picture elements, the number of times the data processing must be performed will amount to 36,000,000 (= 2,000 × 2,000 × 3 × 3). Accordingly, when the data operations mentioned above are to be executed, it is preferred to use a multi-processor system of the configuration shown in FIG. 2, wherein the M × N picture elements are divided into a number of groups or sets and the resulting data are supplied, respectively, to the individual data processing elements or co-processors provided in a corresponding number, to thereby realize simultaneous parallel operations by the co-processors. In this case, the time required for the data processing operation can be reduced to a small fraction of that required when the operation is performed by a single data processing unit.
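The operation count and the division of the picture elements among the co-processors can be illustrated by the following minimal sketch. The function names, the use of half-open tile bounds, the reading of m as rows and n as columns, and the 3 × 3 array size are assumptions made only for illustration and are not taken from the system of FIG. 2.

```python
def neighborhood_op_count(width, height, kernel=3):
    """Operations for a kernel x kernel neighborhood filter applied at every pixel."""
    return width * height * kernel * kernel


def split_into_tiles(width, height, m, n):
    """Divide a width x height image among an m x n array of co-processors.

    Assumption: m counts rows and n counts columns of the array.
    Returns a dict mapping (row, col) to half-open bounds (x0, y0, x1, y1).
    """
    tiles = {}
    for r in range(m):
        for c in range(n):
            x0 = c * width // n
            x1 = (c + 1) * width // n
            y0 = r * height // m
            y1 = (r + 1) * height // m
            tiles[(r, c)] = (x0, y0, x1, y1)
    return tiles


total_ops = neighborhood_op_count(2000, 2000)   # 36,000,000 as in the text
tiles = split_into_tiles(2000, 2000, 3, 3)      # hypothetical 3 x 3 array
# Pixels assigned to each co-processor; the tiles cover the image exactly once.
pixels_per_tile = {
    pe: (x1 - x0) * (y1 - y0) for pe, (x0, y0, x1, y1) in tiles.items()
}
```

With nine co-processors working simultaneously, each handles roughly one ninth of the 4,000,000 picture elements, which is the sense in which the processing time is reduced to a small fraction.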
In this connection, in order that the host processor 101 may have information about which of the processor elements or co-processors is to perform the data processing operation on which of the groups or sets of picture elements resulting from the above-mentioned division, it is necessary that the co-processors (PEs) 111 to 119 can be identified or designated discriminatively by attaching identification numbers (referred to as the co-processor number or PE number) to the co-processors, respectively. In the approach taken by the inventors prior to the present invention, the processor elements or PEs, that is, the co-processors, were labelled with serial integer numbers "1" to "mn", respectively, as illustrated in parentheses in FIG. 2.
It should be mentioned here that when the data processing operations are to be performed uniformly over a whole region 301 of the M × N picture elements, as indicated by a hatched area in FIG. 3A of the accompanying drawings, sequential PE numbers may be attached one-dimensionally, since the PE numbers are, respectively, unique to the individual processor elements or co-processors. However, in image signal processing, there often arises a case where only a part of the region consisting of the M × N picture elements (i.e. a sub-region), as indicated by hatched areas 302, 303 and 304 in FIGS. 3B, 3C and 3D, respectively, is required to be processed. In that case, with the one-dimensional PE number allocation mentioned above, the host processor 101 (FIG. 1) must communicate many times with the co-processors or PEs participating in the data processing on the region sharing basis in order to give instructions or commands to them. Consequently, the burden imposed on the host computer is significantly increased. Besides, the quantity of communication between the host processor and the PEs is increased, which in turn means that the burden imposed on the operating system is correspondingly increased, resulting in a reduction not only in the throughput of the data processing but also in the processing speed of other operations, to great disadvantage.
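The cost of one-dimensional numbering for a sub-region can be sketched as follows. The row-major assignment of the serial numbers "1" to "mn", the 4 × 4 array size, and the function names are assumptions for illustration only; the actual ordering shown in FIG. 2 is not reproduced here.

```python
def serial_number(row, col, n):
    """1-based serial PE number for a 0-based (row, col) position.

    Assumption: numbers run row by row across an array with n columns.
    """
    return row * n + col + 1


def pes_covering_column(col, m, n):
    """Serial numbers of the PEs that share a strip extending in the y-direction."""
    return [serial_number(r, col, n) for r in range(m)]


def messages_needed_1d(pe_numbers):
    """With one-dimensional numbers, each PE must be addressed individually."""
    return len(pe_numbers)


strip = pes_covering_column(col=1, m=4, n=4)   # hypothetical 4 x 4 array
count = messages_needed_1d(strip)
```

Under these assumptions the PEs sharing the strip carry the unrelated serial numbers 2, 6, 10 and 14, so the host has no single number that designates the strip and must send one message per PE for every instruction it gives.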
By way of example, let us assume that the same processing is to be performed on the partial image data area extending in the y-direction as shown in FIG. 3B. The procedure for the data processing adopted heretofore is usually as follows:
(a) The host processor is requested to transfer information such as parameters, programs and other data required for the processing to the plurality of co-processors or PEs which share the processing of the region to be processed. In that case, since the PE numbers are given one-dimensionally, the command for executing the information transfer must be issued as many times as the number of processor elements participating in the processing. Consequently, a considerable amount of time will be consumed for transmission of the commands. Besides, since the host processor must constantly supervise or monitor the information transfer, the host processor is incapable of performing other tasks or jobs during the period in which the command or instruction transfer mentioned above takes place.
(b) Following the information transfer, the host processor has to command the individual processor elements or co-processors to execute the processing allocated to them. Here again, time is consumed uselessly in issuing as many commands as there are co-processors.

(a) Multi-dimensional numbers are used for the processor numbers. By virtue of this feature, two-dimensional or three-dimensional data processing, or the data processing for any given partial image area, can be easily shared by the co-processors (processor elements) upon processing of the image signal.
(b) A particular numerical value exemplified by zero or other is used in conjunction with the processor number for the purpose of simultaneous communication with the host processor. This allows communications between the host processor and the plurality of co-processors to be conducted simultaneously.
(c) A plurality of formats are prepared for the processor number so that the format can be altered in response to a command from the host processor.
Due to this arrangement, the image data to be processed can be allotted to the plurality of co-processors one-dimensionally or two-dimensionally upon image processing, whereby the desired or requisite processing mode can be imparted with an enhanced flexibility.
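One possible reading of features (a) and (b) can be sketched as follows. It is assumed here, purely for illustration, that each PE carries a 1-based two-dimensional number (row, column) and that the value zero in either position of an address matches every PE in that dimension; the matching rule, the function names and the 3 × 3 array size are hypothetical and are not specified in the text above.

```python
def matches(pe_number, address):
    """True if a PE with the given 2-D number accepts a message sent to address.

    Assumption: a zero component in the address matches any value, while a
    non-zero component must equal the PE's own component.
    """
    return all(a == 0 or a == p for p, a in zip(pe_number, address))


def recipients(address, m, n):
    """All PEs of an m x n array (numbered from 1) that accept the address."""
    return [
        (r, c)
        for r in range(1, m + 1)
        for c in range(1, n + 1)
        if matches((r, c), address)
    ]


column_strip = recipients((0, 2), 3, 3)   # every PE in column 2
everyone = recipients((0, 0), 3, 3)       # all nine PEs at once
single = recipients((2, 3), 3, 3)         # exactly one PE
```

Under this reading, a strip extending in the y-direction is designated by the single address (0, 2), so one message from the host reaches all PEs sharing the strip instead of one message per PE.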