1. Field of the Invention
The present invention relates to a data flow processor in which processing is driven on the basis of the interdependence between the data to be processed.
2. Description of Related Art
A typical example of the conventional data flow processor is disclosed in Pages 19 to 22 of the digest of technical papers of the Development Result Presentation Conference for Scientific Technique High Speed Computation System Research (June 25, 1984) and in Pages 486 to 490 of the digest of technical papers (in English) of IEEE COMPCON '84 SPRING. Both documents describe the construction and operation of the data flow processor called the Sigma-1. The prior art is described below on the basis of these documents.
FIG. 1 is a block diagram of the above-mentioned conventional data flow processor.
The data flow processor comprises a network interface 14 serving as the interface with the exterior of the processor, a matching memory 10 for detecting, among the data to be processed, two data whose destination node numbers coincide with each other, an instruction fetching unit 15 for fetching the operation instruction code pointed to by the node number, an instruction executing unit 12 for executing the processing in accordance with the instruction code, a destination designating unit 16 for designating the next destination node number of the processed data, and data paths connecting these units.
FIG. 2 shows an example of a program (data flow graph) for explaining the concrete operation of the conventional data flow processor.
In the data flow graph, data A and B are added at the node #0, and the resultant data G is multiplied by another data C at the node #2 to obtain the resultant data I. Likewise, data D and E are added at the node #1, and the resultant data H is multiplied by another data F at the node #4 to obtain the resultant data J . . . , data W is multiplied by data X at the node #7096, the resultant data Y and the previously obtained data I are added at the node #7097 to obtain the resultant data Z, and so on.
A packet A (201) (data having tag information) inputted from the exterior through the network interface 14 comprises the destination node number <0> and data [A]. The packet 201 is delivered to the matching memory 10, and simultaneously the destination node number <0> (203) in the packet 201 is also sent to the instruction fetching unit 15. Assuming that a packet B, having data [B] which is the other operand of the dyadic operation, has already been inputted and is waiting in the matching memory 10, the destination nodes of the two packets A and B coincide with each other at the node #0, so that the matching memory 10 outputs a packet 206 comprising the pair of data [A] and [B].
Meanwhile, at the instruction fetching unit 15, an instruction code "+" (205), which is the content of the address corresponding to the node number <0>, is read and outputted.
Memory configuration of the instruction fetching unit 15 is shown in FIG. 3(a).
Next, the destination node number <0> inputted to the instruction fetching unit 15 is delivered intact to the destination designating unit 16 as the node number <0> (204). The instruction code "+" (205) read out at the instruction fetching unit 15 is given, together with the packet 206 of the data pair [A] and [B], as a packet 207 to the instruction executing unit 12. At the same time, the destination designating unit 16 is accessed and delivers the code <2> (208) representing the node #2, to which the result of the addition corresponding to the node number <0> in the data flow graph shown in FIG. 2 is to be sent.
Memory configuration of the destination designating unit 16 is shown in FIG. 3(b).
Simultaneously, the instruction executing unit 12 computes [A]+[B] and outputs the operation resultant data [G] (209). The output 208 of the destination designating unit 16 and the output 209 of the instruction executing unit 12, in other words, the destination node number <2> and data [G], pass as a packet 210 through the network interface 14 and are sent again to the instruction fetching unit 15 and the matching memory 10.
By repeating the aforesaid chain of processes, the operations corresponding to all the nodes of the data flow graph shown in FIG. 2 are executed and execution of the program ends. During execution, nodes between which a relation of interdependence between data exists, for example, the nodes #0 and #2, can be processed only sequentially in that order. However, nodes having no relation of interdependence between data, for example, the nodes #0 and #1, can be processed in parallel as far as the processing resources allow.
Here, the relation of interdependence between data means that, between two nodes, the input data required for the processing at one node is supplied only once the processing at the other node has been completed.
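The circulation of a packet described above can be sketched in Python as follows. This is only an illustrative model under stated assumptions, not the Sigma-1 implementation: the class and function names are hypothetical, the memory contents cover only the nodes #0 and #2, the copy operation is omitted, and the operand order is ignored because "+" and "*" are commutative.

```python
import operator

# Assumed memory contents for nodes #0 and #2 only (FIG. 3 is not reproduced here).
INSTRUCTION_MEMORY = {0: "+", 2: "*"}   # instruction fetching unit 15
DESTINATION_MEMORY = {0: 2, 2: 7097}    # destination designating unit 16 (copy omitted)
OPERATIONS = {"+": operator.add, "*": operator.mul}


class MatchingMemory:
    """Model of matching memory 10: holds one operand until its partner arrives."""

    def __init__(self):
        self.waiting = {}  # destination node number -> waiting data

    def match(self, node, data):
        # A partner with the same destination node number is waiting: output the pair.
        if node in self.waiting:
            return (self.waiting.pop(node), data)
        # Otherwise this packet waits for its partner.
        self.waiting[node] = data
        return None


def step(matching_memory, packet):
    """Process one (destination node number, data) packet; return the result packet or None."""
    node, data = packet
    pair = matching_memory.match(node, data)
    if pair is None:
        return None
    opcode = INSTRUCTION_MEMORY[node]        # e.g. "+" for node #0
    destination = DESTINATION_MEMORY[node]   # e.g. <2> for node #0
    result = OPERATIONS[opcode](*pair)       # instruction executing unit 12
    return (destination, result)             # re-enters via network interface 14
```

For example, with [B] = 2 already waiting, step(mm, (0, 1)) for packet A returns the packet (2, 3), that is, destination node number <2> with data [G].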
The memory configurations of the instruction fetching unit 15 and the destination designating unit 16 are shown in FIG. 3(a) and FIG. 3(b), respectively. Since the bit width of each item of information is not specified in the aforesaid public documents, the bit width of the instruction code is assumed to be 4 bits and that of the destination node number to be 16 bits. Also, since the documents do not concretely describe the data copy operation, the data copy is assumed to use a widely known method.
The most significant bit "COPY" in the memory within the destination designating unit 16 is a copy bit. In the case where the operation result at a node in the data flow graph shown in FIG. 2 is sent to a plurality of nodes, the copy bit causes the data to be copied and a destination node number to be given to each copy. For example, the result of the operation at the node #2 is sent to both the nodes #7097 and #5, so that a logical "1" is stored in the copy bit at the second address in the memory corresponding to that result. In this case, the destination node number #7097 is read, concatenated to the operation result [I] and outputted, and thereafter the next address (the third address) is read in succession, and the destination node number #5 is concatenated to the same operation result [I] and outputted.
The aforesaid conventional data flow processor has a problem in that the size of the program memory is large.
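The copy operation assumed above can be sketched in Python as follows. The memory contents are assumptions that follow the example in the text only for the second and third addresses, and the function names are hypothetical.

```python
# Assumed contents of the destination memory:
# address -> (copy bit, destination node number).
DESTINATION_MEMORY = {
    0: (0, 2),      # node #0 sends its result to node #2 only
    2: (1, 7097),   # copy bit "1": another destination follows at the next address
    3: (0, 5),      # copy bit "0": last destination for the result of node #2
}


def destinations(memory, node):
    """Read destination node numbers from the node's address while the copy bit is 1."""
    address = node
    found = []
    while True:
        copy_bit, destination = memory[address]
        found.append(destination)
        if not copy_bit:
            return found
        address += 1  # continue with the next address


def emit_packets(memory, node, data):
    """Concatenate each destination node number to the same operation result."""
    return [(destination, data) for destination in destinations(memory, node)]
```

With these contents, emit_packets(DESTINATION_MEMORY, 2, "I") yields one packet for the node #7097 and one for the node #5, both carrying the same data.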
A von Neumann type computer, now in wide use, carries out processing sequentially in the order described in the program, so that a single register, the so-called program counter, manages the execution address in a unified manner. Accordingly, the address of the next instruction to be executed (jump address) need not be explicitly designated except in a branch instruction, and a program of the same content can be stored in a smaller program memory than in the data flow processor. In contrast, in a data flow processor the respective instructions are processed in parallel, so that the execution address cannot be managed in a unified manner. The data flow processor is therefore inherently required to designate the address of the instruction to be executed next (destination node number) for every instruction, which enlarges the program memory. In the aforesaid conventional example, 16 bits of the 21-bit width per instruction (4-bit instruction code + 17-bit destination field, that is, the 16-bit destination node number plus the copy bit) are occupied by the destination node number.
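The bit-width figures of the conventional example can be checked with the short Python calculation below. It rests on the bit widths assumed in the description (4-bit instruction code, 16-bit destination node number) and on the reading that the remaining bit of the 21-bit width is the copy bit; the function name is hypothetical.

```python
INSTRUCTION_CODE_BITS = 4    # assumed in the description
DESTINATION_NODE_BITS = 16   # assumed in the description
COPY_BITS = 1                # assumption: the "COPY" bit accounts for the 21st bit

WORD_BITS = INSTRUCTION_CODE_BITS + COPY_BITS + DESTINATION_NODE_BITS

# Fraction of each instruction word spent on the destination node number.
DESTINATION_SHARE = DESTINATION_NODE_BITS / WORD_BITS


def program_memory_bits(instruction_count):
    """Total program memory, in bits, for the given number of instructions."""
    return instruction_count * WORD_BITS
```

Under these assumptions the word is 21 bits wide and the destination node number takes 16/21, or roughly three quarters, of the program memory.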