1. Field of the Invention
The present invention relates to a data driven type information processing apparatus and to a method of controlling execution thereof. More specifically, the present invention relates to a data driven type information processing apparatus and the method of controlling execution thereof, in which transfer rate of a self-synchronous control circuit in a router as a relay apparatus on a communication network is made different from the rate of the data driven type information processing apparatus.
2. Description of the Background Art
In a data driven type information processing apparatus (hereinafter referred to as a data driven type processor), a process proceeds in accordance with the rule that when input data necessary for executing a certain process are all prepared, and resources including an arithmetic processor necessary for that process are allocated, the process is executed. A data processing apparatus including information processing operation of the data driven type uses a data transmitting apparatus employing an asynchronous handshake method. In such a data transmitting apparatus, a plurality of data transmission paths are connected, and the data transmission paths exchange data transmission request signals (hereinafter referred to as SEND signals) and transfer acknowledge signals (hereinafter referred to as ACK signals) indicating whether data transfer is permitted or not, whereby autonomous data transfer is performed.
FIG. 6 represents a data packet format applied to the prior art and to the present invention. Referring to FIG. 6, a data packet includes a destination node number field F1 storing a destination node number ND#; a generation number field F2 storing a generation number GN#; an instruction code field F3 storing an instruction code OPC; and a data field F4 storing data DATA. The generation number is a number for distinguishing data groups to be processed in parallel from each other. The destination node number is a number for distinguishing input data of the same generation from each other. The instruction code is for executing an instruction stored in an instruction decoder.
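As an illustration of the packet format of FIG. 6, the four fields can be modeled as a small record. This is a minimal sketch only: the class name `DataPacket`, the helper `same_generation`, the Python types and the sample values are assumptions made for illustration, and the field widths are not specified here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataPacket:
    """Illustrative model of the FIG. 6 packet; names and types are assumed."""
    nd: int                 # field F1: destination node number ND#
    gn: int                 # field F2: generation number GN#
    opc: str                # field F3: instruction code OPC
    data: Tuple[int, ...]   # field F4: data DATA


def same_generation(a: DataPacket, b: DataPacket) -> bool:
    """Packets with equal GN# belong to the same parallel data group."""
    return a.gn == b.gn
```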
FIG. 7 is a block diagram showing a configuration of the data transmission path. The data transmission path includes a self-synchronous type transfer control circuit (hereinafter referred to as a C element) 1a, and a data holding circuit (hereinafter referred to as a pipeline register) 1b including a D type flip-flop. The C element 1a has a pulse input terminal CI receiving a pulse; a transfer acknowledge output terminal RO outputting a transfer acknowledge signal indicating permission or inhibition of transfer; a pulse output terminal CO outputting a pulse; a transfer acknowledge input terminal RI receiving the transfer acknowledge signal indicating permission or inhibition of transfer; and a pulse output terminal CP for providing a clock pulse controlling data holding operation of pipeline register 1b. 
FIGS. 8A to 8E are timing charts representing the operation of the C element shown in FIG. 7. The C element 1a receives a pulse shown in FIG. 8A from terminal CI, and when the input transfer acknowledging signal such as shown in FIG. 8E provided from terminal RI represents a transfer permitted state, it outputs a pulse shown in FIG. 8D from terminal CO, and outputs a pulse shown in FIG. 8C to pipeline register 1b. In response to the pulse applied from C element 1a, pipeline register 1b holds the applied input packet data, or provides the held data as an output packet data.
FIG. 9 is a block diagram showing the data transmission path shown in FIG. 7 connected sequentially through a prescribed logic circuit. Referring to FIG. 9, an input packet data is transferred in the order of pipeline registers 3a→3b→3c, while sequentially processed by logic circuits 3d and 3e. When pipeline register 3a is in a data holding state, for example, and the succeeding pipeline register 3b is in the data holding state, data is not transmitted from pipeline register 3a to pipeline register 3b.
When the succeeding pipeline register 3b is in a state not holding data, or when it enters a state not holding data, the data is transmitted from pipeline register 3a, processed by logic circuit 3d and fed to pipeline register 3b with at least a preset delay time. Such a control in which data is transferred asynchronously with at least a preset delay time, in accordance with the SEND signal input/output at CI and CO terminals and ACK signals input/output at RI and RO terminals between adjacent connected pipeline registers is referred to as a self-synchronous transfer control, and a circuit controlling such a data transfer is referred to as a self-synchronous transfer control circuit.
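The transfer rule described above, in which data advances only into a stage that is not holding data, can be sketched behaviorally as follows. This is a simplified model under stated assumptions: each pipeline register is a slot that is either empty (`None`) or holding a packet, delays and the logic circuits between registers are omitted, and the function name `step` and the `sink_ready` parameter are hypothetical.

```python
from typing import List, Optional, Tuple


def step(stages: List[Optional[str]],
         sink_ready: bool = True) -> Tuple[List[Optional[str]], Optional[str]]:
    """Advance the pipeline by one handshake round.

    A stage passes its data on only when the succeeding stage is empty
    (ACK permits transfer). Stages are visited from the output side so
    that a stage freed in this round can accept data in the same round.
    """
    nxt = list(stages)
    out = None
    # The final stage hands its data to the consumer if the consumer permits it.
    if sink_ready and nxt[-1] is not None:
        out, nxt[-1] = nxt[-1], None
    for i in range(len(nxt) - 2, -1, -1):
        if nxt[i] is not None and nxt[i + 1] is None:
            nxt[i + 1], nxt[i] = nxt[i], None
    return nxt, out
```

With all three registers holding data and the consumer not ready, `step(["p1", "p2", "p3"], sink_ready=False)` leaves every packet in place, which corresponds to the case in which register 3a cannot transmit because register 3b is holding data.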
FIG. 10 is a specific circuit diagram of the C element shown in FIG. 7. The C element is described, for example, in U.S. Pat. No. 5,373,204. Referring to FIG. 10, pulse input terminal CI receives a pulse-shaped SEND signal (transfer request signal) from a preceding stage, and a transfer acknowledge output terminal RO provides the ACK signal (transfer acknowledge signal) to the preceding stage. Pulse output terminal CO provides the pulse-shaped SEND signal to a succeeding stage, and the transfer acknowledge input terminal RI receives the ACK signal from the succeeding stage.
A master reset input terminal MR receives a master reset signal. When a pulse at the “H” (high) level is applied to master reset input terminal MR, it is inverted by an inverter 4e, flip-flops 4a and 4b are reset, and the C element is initialized. Pulse output terminal CO and transfer acknowledge output terminal RO both output the “H” level signals as the initial state. That the output of transfer acknowledge output terminal RO is at the “H” level indicates the transfer permitted state, whereas the output being at the “L” level indicates a transfer inhibited state. The output of pulse output terminal CO being at the “H” level represents a state in which data transfer to the succeeding stage is not requested, while the output being at the “L” level represents a state in which data transfer is requested or data is being transferred to the succeeding stage.
When the “L” level signal is input to pulse input terminal CI, that is, when a data transfer request is issued from the preceding stage, flip-flop 4a is set, and provides the “H” level signal at its output Q. The “H” level signal is inverted by inverter 4d, whereby the “L” level signal is output from transfer acknowledge output terminal RO, inhibiting further data transfer.
After a prescribed time period, the “H” level signal is input to pulse input terminal CI, and data set from the preceding stage to the C element is completed. When, in this state, the “H” level signal is input from transfer acknowledge input terminal RI, that is, data transfer is permitted by the succeeding stage, and in addition, the “H” level signal is output from pulse output terminal CO, that is, when data is not being transferred to the succeeding stage (data transfer request is not issued to the succeeding stage), then NAND gate 4c is rendered active, providing the “L” level signal.
As a result, flip-flop 4b is reset, and flip-flop 4b provides the “H” level signal from pulse output terminal CP to the pipeline register through a delay element 4g, and provides the SEND signal at the “L” level from pulse output terminal CO to the C element of the succeeding stage through a delay element 4f. More specifically, data transfer request is issued to the succeeding stage. The C element of the succeeding stage, receiving the SEND signal at the “L” level, outputs the ACK signal set to the “L” level, representing transfer inhibition, from the RO terminal, so as to prevent further data transfer to the C element. The C element receives the ACK signal at the “L” level from the transfer acknowledge input terminal RI, and by this signal, flip-flop 4b is reset. As a result, the “L” level signal is output from pulse output terminal CP to the pipeline register through delay element 4g, and the SEND signal at the “H” level is output from the pulse output terminal CO to the succeeding stage through delay element 4f, and thus data transfer is completed.
FIG. 11 is a schematic block diagram of a conventional data driven type information processing apparatus implemented including the data transfer path shown in FIG. 9. Referring to FIG. 11, the data driven type information processing apparatus PE includes a junction unit JNC, a firing control unit FC, a processing unit FP, a program storing unit PS, a branching unit BRN, a plurality of pipeline registers 3a to 3c and a plurality of C elements 2a to 2c. Respective C elements 2a to 2c control packet transfer with the corresponding processing units (FC, FP, PS) by exchanging packet transfer pulses (signals at CI, CO, RI and RO) between the C elements of the preceding and succeeding stages. Respective pipeline registers 3a to 3c take in and hold data input from the processing unit of the preceding stage in response to the pulse inputs from corresponding C elements 2a to 2c, feed the data to the output stage, and hold the data until the next pulse is input.
Referring to FIG. 11, when the data packet shown in FIG. 6 is input to the processor PE, the input packet is first passed through junction unit JNC, transmitted to firing control unit FC, and a data pair is formed between packets having the same destination node number and the same generation number. More specifically, two different data packets having identical destination node numbers and identical generation numbers are detected, the data of one of these two data packets is additionally stored in the data field F4 (FIG. 6) of the other data packet, and the resulting data packet is output.
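The pairing performed by firing control unit FC can be sketched as a lookup keyed by destination node number and generation number. This is an illustrative model only: the function name `fire`, the dictionary representation of a packet, the `waiting` store, and the choice that the first packet to arrive supplies the first operand are all assumptions.

```python
from typing import Dict, Optional, Tuple


def fire(waiting: Dict[Tuple[int, int], dict], packet: dict) -> Optional[dict]:
    """Pair packets that share ND# and GN#.

    Returns None while the packet waits for its partner, or a single
    packet whose data field holds both operands once the pair is complete.
    """
    key = (packet["nd"], packet["gn"])
    partner = waiting.pop(key, None)
    if partner is None:
        waiting[key] = packet      # first of the pair: hold until the partner arrives
        return None
    # Assumption: the earlier packet supplies the first operand.
    return {**partner, "data": (partner["data"], packet["data"])}
```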
The data packet storing the data pair (a set of data) in the data field F4 is then transmitted to processing unit FP. The processing unit FP receives the transmitted data packet as an input, performs a prescribed operation on the contents of the input packet based on the instruction code OPC of the input packet, and stores the result of operation in the data field F4 of the input packet. Thereafter, the input packet is transmitted to program storing unit PS.
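The operation step can be illustrated by dispatching on the instruction code. The instruction names `ADD` and `MUL`, the table `OPERATIONS` and the function `execute` are hypothetical; the actual instruction set is not specified here.

```python
import operator

# Hypothetical instruction set, for illustration only.
OPERATIONS = {"ADD": operator.add, "MUL": operator.mul}


def execute(packet: dict) -> dict:
    """Apply the operation named by OPC to the data pair held in field F4."""
    left, right = packet["data"]
    result = OPERATIONS[packet["opc"]](left, right)
    # The result replaces the data pair in the data field of the same packet.
    return {**packet, "data": result}
```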
The program storing unit receives as an input the transmitted data packet, and reads, based on the destination node number ND# of the input packet, the node information (node number ND#) to which the packet should go, instruction information (instruction code OPC) to be executed next, and a copy flag CPY, from the program memory of the program storing unit PS. The read destination node number ND# and the instruction code OPC are stored in the destination node number field F1 and the instruction code field F3 of the input packet, respectively.
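The program fetch described above amounts to a table lookup addressed by the current destination node number. In the following sketch the contents of `PROGRAM_MEMORY` and the function name `fetch` are assumptions chosen for illustration; only the three items read (next ND#, next OPC and copy flag CPY) are taken from the description.

```python
from typing import Dict, Tuple

# Hypothetical program memory: ND# -> (next ND#, next OPC, copy flag CPY).
PROGRAM_MEMORY: Dict[int, Tuple[int, str, bool]] = {
    3: (4, "MUL", False),
    4: (9, "ADD", True),
}


def fetch(packet: dict, memory: Dict[int, Tuple[int, str, bool]]) -> Tuple[dict, bool]:
    """Overwrite fields F1 and F3 with the next node number and instruction code."""
    next_nd, next_opc, cpy = memory[packet["nd"]]
    return {**packet, "nd": next_nd, "opc": next_opc}, cpy
```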
A packet output from program storing unit PS is output from the processor PE or again returned to the processor PE through a router, not shown, based on the destination node number ND#. The router is used for data packet exchange between the above described data driven type processors PEs and for input control and output control of data packets to a data driven type processor PE.
FIG. 12 is a block diagram showing an example of use of the router. In the configuration shown in FIG. 12, a plurality of data driven type processors PEs shown in FIG. 11 are connected through a router 5. When none of the data driven type processors performs a process, an input data is output as it is through router 5. When a process proceeds in the order of processor PE1→PE1→PE3→PE2, the input data is first provided from router 5→5a to processor PE1, the data processed by processor PE1 is again input to processor PE1 through 5b→router 5 and again through 5a, the data processed by processor PE1 is fed to processor PE3 through 5b→router 5→5f, the data processed by processor PE3 is input to processor PE2 through 5e→router 5→5c, and the data processed by processor PE2 is output through 5d→router 5.
FIG. 13 is a block diagram of a 2×2 router used in a conventional data driven type processor. Referring to FIG. 13, the router is a 2-input, 2-output router including two branching units 6a and 6b and two junction units 6c and 6d. In the router, switching of data packets takes place, in which there are a total of four paths in the 2×2 router. Namely, the data packet input to IN1 may be output from OUT1 or OUT2, and the data packet input to IN2 may be output from OUT1 or OUT2. Not only this router but also other routers described in the present invention do not guarantee that two or more inputs input simultaneously are all output simultaneously from the same output.
More specifically, in the example of FIG. 13, such an event is not guaranteed in that the data packets input simultaneously to IN1 and IN2 are both output from OUT1 or both output from OUT2.
Referring to FIG. 13, when a data packet input through IN1 is routed to OUT2 and the data packet input through IN2 is routed to OUT1, the data packet input through IN1 passes from branching unit 6a through a path 6e, is transferred to junction unit 6d and is output from OUT2. The data packet input through IN2 passes from branching unit 6b through a path 6f, is transferred to junction unit 6c and is output from OUT1.
FIG. 14 is a circuit diagram showing an example of the branching unit shown in FIG. 13, and FIG. 15 is a circuit diagram showing an example of the branching unit shown in FIG. 2.
In FIG. 14, the branching unit is configured to have one input and two outputs, and a data packet input to the branching unit is branched to either one of the two outputs. Two junction units 6c and 6d are connected in the succeeding stage as shown in FIG. 13. Handshaking with the junction unit 6c is performed at COa and RIa, and handshaking with the junction unit 6d is performed at COb and RIb, through JCTL circuit 8, which is a control circuit controlling junction as shown in FIG. 16. Whether a data packet is to be transferred to junction unit 6c or 6d is switched by a branch permitting signal BE. As will be described with reference to FIG. 16 later, the junction unit also includes a C element.
In the branching unit shown in FIG. 14, one of the counterpart C elements (C elements in the junction units 6c and 6d of the succeeding stage shown in FIG. 13) for handshaking is selected by the branch permitting signal BE. Namely, the branch destination of the data packet input to the branching unit is determined. When the branch permitting signal BE is at the “L” level, NAND gate 7c becomes active, the output of pulse output terminal CO of C element 7a is output to the terminal CIa on the side of junction unit 6c, and the data packet in a pipeline register 7b is output to the pipeline register on the side of the junction unit 6c in the succeeding stage.
On the contrary, when the branch permitting signal BE is at the “H” level, NAND gate 7d becomes active, the output of the pulse output terminal CO of C element 7a is output to the terminal CIb on the side of junction unit 6d, and the data packet in pipeline register 7b is output to the pipeline register on the side of the junction unit 6d in the succeeding stage. Transfer acknowledge signals RIa and RIb from two C elements of the succeeding stage are input to AND gate 7e, and the output thereof is input to RI of C element 7a.
FIG. 15 is a circuit diagram representing an example of the branching unit having one input and four outputs, used for forming a router. Referring to FIG. 15, at this branching unit, branch destination of a data packet is determined by branch permitting signals BEa and BEb. More specifically, when branch permitting signals BEa and BEb are both at the “L” level, NAND gate 7f becomes active, an output of pulse output terminal CO of C element 7a is output from COa, and the data packet in pipeline register 7b is output to the pipeline register on the side of COa and RIa of the junction unit 6c in the succeeding stage.
Similarly, when branch permitting signal BEa is at the “H” level and the branch permitting signal BEb is at the “L” level, the data packet is output to COb of the junction unit of the succeeding stage; when branch permitting signal BEa is at the “L” level and the branch permitting signal BEb is at the “H” level, the data packet is output to COc of the junction unit in the succeeding stage; and when branch permitting signals BEa and BEb are both at the “H” level, the CO output of C element 7a is output to COd of the succeeding stage, and, in the similar manner as described above, the data packet is transferred to one of the junction units.
The transfer acknowledge signals RIa, RIb, RIc and RId from the four C elements in the succeeding stage are input to AND gate 7j, and an output thereof is input to RI of C element 7a.
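The selection performed by the one-input, four-output branching unit can be summarized as a two-bit decode, with the acknowledge returned to C element 7a formed as the logical AND of the four downstream acknowledges. In this sketch `True` stands for the “H” level and `False` for the “L” level; the function names `select_output` and `upstream_ack` are hypothetical, and gate-level timing is not modeled.

```python
OUTPUTS = ("COa", "COb", "COc", "COd")


def select_output(bea: bool, beb: bool) -> str:
    """Decode BEa/BEb: (L,L)->COa, (H,L)->COb, (L,H)->COc, (H,H)->COd."""
    return OUTPUTS[int(bea) + 2 * int(beb)]


def upstream_ack(ria: bool, rib: bool, ric: bool, rid: bool) -> bool:
    """RI of the branching C element is 'H' only if every downstream RI is 'H'."""
    return ria and rib and ric and rid
```

The one-input, two-output unit of FIG. 14 is the corresponding single-bit case, with BE selecting between the two outputs and the two acknowledges RIa and RIb combined by AND gate 7e.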
FIG. 16 is a circuit diagram representing an example of the junction unit shown in FIG. 13. The junction unit shown in FIG. 16 is configured to have two inputs and one output and includes a JCTL circuit 8, which is a control circuit controlling junction such that simultaneous output is prevented when there are two simultaneous inputs. JCTL circuit 8 controls such that a data packet from either one of pipeline registers 8a and 8b is output. More specifically, when the pulse output terminal CPa to pipeline register 8a of JCTL circuit 8 is at the “H” level, the select signal AEB of selector 8e attains to the “L” level, and the data packet in pipeline register 8a is output from selector 8e through pipeline register 8d.
Further, when the pulse output terminal CPb to pipeline register 8b of JCTL circuit 8 controlling junction is at the “H” level, select signal AEB of selector 8e attains to the “H” level, and the data packet in pipeline register 8b is output from selector 8e through pipeline register 8d. The control of pipeline register 8d is performed by C element 8c.
FIG. 17 is a circuit diagram of the JCTL circuit shown in FIG. 16. In FIG. 17, JCTL circuit 8 controls pulses output to pulse output terminals CPa and CPb to pipeline registers 8a and 8b corresponding to C elements 81a and 81b. More specifically, when the pulse output terminal CPa of C element 81a is at the “H” level, the output AEB of a flip-flop 81c, that is, the select signal of selector 8e shown in FIG. 16, attains to the “L” level. When the pulse output terminal CPb of C element 81b is at the “H” level, the output AEB of flip-flop 81c, that is, the select signal of selector 8e, attains to the “H” level.
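The relationship between pulses CPa and CPb, select signal AEB and the selector output can be sketched as a small state holder. This is a behavioral illustration only: the class name `JunctionSelect`, the initial value of AEB, and the treatment of simultaneous pulses as an error (JCTL circuit 8 is described as preventing simultaneous output) are assumptions. `True` stands for the “H” level.

```python
class JunctionSelect:
    """Behavioral sketch of flip-flop 81c driving selector 8e."""

    def __init__(self) -> None:
        self.aeb = False  # assumed initial state: "L", selecting register 8a

    def pulse(self, cpa: bool, cpb: bool) -> bool:
        """Update AEB from the CP pulses; AEB is held when neither is 'H'."""
        if cpa and cpb:
            raise ValueError("JCTL is expected to prevent simultaneous pulses")
        if cpa:
            self.aeb = False   # CPa at "H" drives AEB to "L"
        elif cpb:
            self.aeb = True    # CPb at "H" drives AEB to "H"
        return self.aeb

    def select(self, reg_8a: str, reg_8b: str) -> str:
        """Selector 8e passes register 8a when AEB is 'L', 8b when 'H'."""
        return reg_8b if self.aeb else reg_8a
```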
The conventional router is formed to have such a structure as the 2×2 example shown in FIG. 13. When the number of data driven type processors to be connected increases, in image processing, for example, the processes become complicated. Further, as the speed of processing increases, a router having multiple inputs and multiple outputs is desirable. As an example of the router to meet such a demand, FIG. 18 shows a 4×4 router. In FIG. 18, the router includes four branching units 9a to 9d, junction units 10a to 10h joining outputs from the branching units 9a to 9d, and junction units 10i to 10l for further joining outputs of junction units 10a to 10h. As compared with the 2×2 router shown in FIG. 13, the circuit scale is clearly enlarged. As the number of inputs and outputs of the router increases, the circuit scale of router 5 increases explosively. Thus, a router that can cope with the demand of multi-inputs and multi-outputs and having a small circuit scale has become necessary.
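The growth in circuit scale can be made concrete by counting units. The sketch below assumes that an N×N router of the conventional type uses N branching units and joins the N candidates for each output with two-input junction units, which needs N−1 junction units per output; this generalization and the function name `conventional_units` are assumptions, checked here only against the 2×2 and 4×4 examples of FIG. 13 and FIG. 18.

```python
from typing import Tuple


def conventional_units(n: int) -> Tuple[int, int]:
    """Return (branching units, two-input junction units) for an n x n router."""
    branching = n
    junctions = n * (n - 1)   # n outputs, each merging n candidates pairwise
    return branching, junctions
```

Under this assumption the junction count grows quadratically with the number of ports, whereas the configuration of FIG. 19 uses one junction unit and one branching unit.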
FIG. 19 is a block diagram showing a 2×2 router with a small circuit scale. Referring to FIG. 19, the router is formed by one of the branching units shown in FIG. 14 and one of the junction units shown in FIG. 16, and there is one path 11c from junction unit 11a to branching unit 11b. Here, at the one path 11c, the data input from IN1 and IN2 at the maximum transfer rate are joined. As the transfer rate of the path 11c is the same maximum transfer rate, when the data input at the maximum transfer rate are joined, the processing capacity is overloaded. As a result, in the configuration of the router shown in FIG. 19, input is possible only at transfer rates such that the sum of the transfer rates of the inputs from IN1 and IN2 is equal to or lower than the maximum transfer rate.
If inputs are provided at such a rate that is lower than the maximum transfer rate, the transfer rate of the output from OUT1 and OUT2 would be also lower than the maximum transfer rate. Conventionally, the configuration of the 2×2 router such as shown in FIG. 13 has been inevitable to enable routing at the maximum transfer rate without such restriction, though the circuit scale has been undesirably large.
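The restriction on the single-path configuration of FIG. 19 is a simple sum condition, which can be checked as follows. Rates are expressed in units of the maximum transfer rate, and the function name `path_overloaded` is hypothetical.

```python
from typing import Sequence


def path_overloaded(input_rates: Sequence[float], path_rate: float) -> bool:
    """True when the joined inputs exceed what the single path can carry."""
    return sum(input_rates) > path_rate
```

For example, two inputs each at the maximum rate give `path_overloaded([1, 1], 1)` equal to `True`, while two inputs each at half the maximum rate are admissible.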
In the future, however, a high speed transfer router that can maintain the maximum transfer rate at the junction path, namely, that can perform handshaking at a high speed, with the configuration shown in FIG. 19 suitable for a multiple-input, multiple-output router, will be required. Thus, it is necessary to increase the speed of operation of the C elements for handshaking at the branching unit of the conventional router shown in FIG. 14 and the junction unit of FIG. 16.
Conventionally, the C element used has the same configuration as the C element used in the data driven type processor PE. The reason for this is that, to date, a 2×2 router has been sufficient, and that, as the data driven type information processing apparatus of such a type is generally designed by a CAD, it is efficient and reliable to use the same macro cell or an IP core, with the C element or a peripheral circuitry including the C element being registered as a macro cell or an IP core.
As the C element of identical configuration has been used, the following problem is experienced on the side of the data driven type processor when the speed of operation of the C element is to be increased. More specifically, when the transfer rate of the C element is increased excessively, the amount that can be processed by one stage of the pipelines shown in FIG. 9, that is, from one pipeline register to a pipeline register of the succeeding stage, decreases, and therefore the process must be divided into pieces. For example, the amount to be processed by logic circuit 3d between pipeline registers 3a and 3b, or the amount to be processed by logic circuit 3e between pipeline registers 3b and 3c, must be reduced. As a result, the number of stages of the pipelines increases while the amount to be processed is the same, and the extra pipeline stages increase the circuit scale. To avoid this problem, a high speed C element has been intentionally avoided in the data driven type processor.
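The trade-off between transfer rate and pipeline depth can be illustrated with a simple count. The sketch assumes that the total logic delay can be divided freely among stages and ignores register overhead; the delay figures and the function name `stages_needed` are hypothetical.

```python
def stages_needed(total_logic_delay_ns: int, stage_budget_ns: int) -> int:
    """Smallest number of stages whose per-stage time budget covers the logic."""
    # Ceiling division on integers: -(-a // b).
    return -(-total_logic_delay_ns // stage_budget_ns)
```

With a hypothetical 40 ns of logic, a 10 ns stage budget needs 4 stages, whereas halving the budget to 5 ns (doubling the transfer rate) needs 8 stages for the same amount of processing.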
Therefore, an object of the present invention is to provide a method and apparatus for controlling execution of a data driven type information processing apparatus in which increase in router circuit scale is suppressed without reducing an amount to be processed per one stage of pipelines, and in which transfer is possible without lowering the transfer rate of C element in the router unit from the maximum transfer rate.
Briefly stated, the present invention provides a data driven type information processing apparatus including: a router including an M-input, 1-output junction unit and a 1-input, N-output branching unit, controlling input/output of a data packet including at least a destination node number, an instruction code and data; and a self-synchronous type transfer control circuit generating a transfer request signal and a transfer acknowledge signal controlling transfer and operating processes of the data packet; wherein transfer rate used by the self-synchronous transfer control circuit of the router is different from the transfer rate used in the system.
In the conventional data driven type information processing apparatus, the speed of operation of the C element has been intentionally made slow. The router, however, is just a path not including an operator or a memory between the stages, unlike the pipelines. Therefore, it is unnecessary to intentionally suppress the transfer rate. Therefore, the C element of double rate, quadruple rate or any rate may be used. In the conventional router, the transfer rate at the junction was the same as the transfer rate before junction, and therefore it has been necessary to lower the rate of input to the junction unit to be lower than the maximum transfer rate. In the present invention, the transfer rate at the junction unit is doubled, and therefore, even by the router having only one path, input to the junction unit at the maximum transfer rate is possible, enabling output at the maximum transfer rate.
According to another aspect, the present invention provides a data driven type information processing apparatus including: a router including an M-input, 1-output junction unit and a 1-input, N-output branching unit, controlling input/output of a data packet including at least a destination node number, an instruction code and data; and a self-synchronous transfer control circuit generating a transfer request signal and a transfer acknowledge signal controlling transfer and operating processes of said data packet; in which transfer rate used in the self-synchronous control circuit in the router is different from the transfer rate used in the system.
In a preferred embodiment, in the router, the transfer rate used in the self-synchronous transfer control circuit of the router is multiple times the transfer rate used in the system.
In a preferred embodiment, the transfer rate used in the self-synchronous transfer control circuit of the router is a total sum of the transfer rates of the inputs to the router.
In a preferred embodiment, the transfer rate used in the self-synchronous transfer control circuit of the router is a total sum of the transfer rates of the outputs from the router.
In a more preferred embodiment, the transfer rate used in the self-synchronous transfer control circuit of the router is the larger one of the total sum of the transfer rates of the inputs to the router and the total sum of the transfer rates of the outputs from the router.
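The rate selection of the preferred embodiments above can be written as a one-line rule. In this sketch rates are expressed in units of the transfer rate used in the system, and the function name `required_router_rate` is hypothetical.

```python
from typing import Sequence


def required_router_rate(input_rates: Sequence[int],
                         output_rates: Sequence[int]) -> int:
    """Larger of the summed input rates and the summed output rates."""
    return max(sum(input_rates), sum(output_rates))
```

For a 2-input, 2-output router whose ports all run at the system rate, `required_router_rate([1, 1], [1, 1])` gives 2, corresponding to a C element operating at double rate in the router.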
In a more preferred embodiment, a plurality of such routers are combined.
The foregoing and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of the present invention when taken in conjunction with the accompanying drawings.