1. Field of the Invention
The present invention relates to data driven information processing apparatuses. More particularly, the present invention relates to a data driven information processing apparatus improved in program executing efficiency by multiple output instructions employing a self-synchronous transfer control circuit that enables transfer of a plurality of pulses from one pulse.
2. Description of the Background Art
With recent advances in multimedia, image processing, for example, requires a considerable amount of computation. As an apparatus for performing such a large amount of computation at high speed, a data driven information processing apparatus (hereinafter referred to as a “data driven processor”) has been proposed. In the data driven processor, processing proceeds according to the rule that a given operation is performed once all the input data necessary for it is ready and the resources it needs, such as an operation device, have been assigned. As a device for data processing including an information processing operation of the data driven type, a data transmission device employing asynchronous handshaking is utilized. In such a data transmission device, a plurality of data transmission paths are connected with each other and exchange a data transfer request signal (hereinafter also referred to as a “SEND” signal) and a transfer enabling signal (hereinafter also referred to as an “ACK” signal) indicating whether the data transfer is permitted, thereby realizing autonomous data transfer.
FIG. 8 is a diagram showing the format of a data packet to which both a conventional technique and the present invention are applied. Referring to FIG. 8, the data packet includes: a destination node number field F1 for storing a destination node number ND#; a generation number field F2 for storing a generation number GN#; an instruction code field F3 for storing an instruction code OPC; and a data field F4 for storing data DATA. Herein, the generation number is a number used to distinguish sets of data to be processed in parallel from each other. The destination node number is a number used to distinguish input data within the same generation from each other. The instruction code is used to cause an instruction stored in an instruction decoder to be carried out.
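By way of illustration only, the packet format of FIG. 8 may be sketched in Python as follows. The class name `DataPacket`, the attribute names, the use of a mnemonic string for OPC, and the helper `match_key` are assumptions made for this sketch; the field widths are not specified above and are therefore not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataPacket:
    """Illustrative model of the four fields F1-F4 of FIG. 8."""
    nd: int                 # F1: destination node number ND#
    gn: int                 # F2: generation number GN#
    opc: str                # F3: instruction code OPC (string mnemonic is an assumption)
    data: Tuple[int, ...]   # F4: data DATA (one or more operands)

    def match_key(self) -> Tuple[int, int]:
        # Hypothetical helper: packets sharing ND# and GN# belong together.
        return (self.nd, self.gn)


a = DataPacket(nd=3, gn=0, opc="ADD", data=(5,))
b = DataPacket(nd=3, gn=1, opc="ADD", data=(7,))
# Same destination node, different generations: the keys differ.
print(a.match_key(), b.match_key())
```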
FIG. 9 is a block diagram showing an example of a conventional data transmission device employing handshaking. Referring to FIG. 9, packet data input is sequentially processed by a logic circuit 9c while being transferred in sequence from a pipeline register 9a to another pipeline register 9b controlled by C elements 1a and 1b, respectively. In FIG. 9, at a time when pipeline register 9a is in a data retaining state, if the subsequent pipeline register 9b is also in the data retaining state, then data is not transferred from pipeline register 9a to pipeline register 9b. 
On the other hand, if the subsequent pipeline register 9b is not in the data retaining state or if it enters the state not retaining data, then data is transmitted from pipeline register 9a to logic circuit 9c and processed there before being transmitted to pipeline register 9b, taking at least a prescribed delay time. This kind of control, in which data is transmitted asynchronously between neighboring pipeline registers, according to SEND signals input/output via their CI and CO terminals and ACK signals input/output via their RI and RO terminals and taking at least a prescribed delay time, is called a self-synchronous transfer control, and a circuit controlling such data transfer is called a self-synchronous transfer control circuit.
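The transfer rule described above may be sketched, under simplifying assumptions, as the following Python model. The function name `step` and the representation of a pipeline as a list (with `None` meaning "not retaining data") are hypothetical, and the sketch abstracts away the SEND/ACK signal levels and the prescribed delay time; it only shows that data advances when, and only when, the subsequent register is free.

```python
from typing import Callable, List, Optional


def step(stages: List[Optional[int]],
         logic: Callable[[int], int]) -> List[Optional[int]]:
    """Advance the pipeline by one handshake round.

    A stage hands its data on only if the next stage is not retaining data.
    """
    nxt = list(stages)
    # Walk from the output end so that a stage freed in this round
    # can accept data from its predecessor in the same round.
    for i in range(len(nxt) - 2, -1, -1):
        if nxt[i] is not None and nxt[i + 1] is None:
            nxt[i + 1] = logic(nxt[i])   # processed by the logic circuit
            nxt[i] = None                # the sending register is now free
    return nxt


def increment(x: int) -> int:
    # Stand-in for logic circuit 9c (an assumption for this sketch).
    return x + 1


print(step([5, 9], increment))      # register 9b occupied: no transfer
print(step([5, None], increment))   # register 9b free: data moves on
```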
FIGS. 10A–10E are timing charts illustrating operations of the C elements shown in FIG. 9. C element 1a receives, from the terminal CI, a pulse of an L level shown in FIG. 10A. If the transfer enabling signal (ACK) received at the terminal RI is in a state permitting the data transfer as shown in FIG. 10E, C element 1a outputs a pulse shown in FIG. 10D from the terminal CO and also outputs a pulse shown in FIG. 10C to pipeline register 9a. In response to the pulse applied from C element 1a, pipeline register 9a retains the received input packet data, and outputs the retained data as output packet data. Further, C element 1a outputs a pulse shown in FIG. 10B to its preceding stage.
FIG. 11 is a circuit diagram specifically showing a self-synchronous transfer control circuit. This kind of self-synchronous transfer control circuit is described, e.g., in Japanese Patent Laying-Open No. 6-83731. Referring to FIG. 11, a pulse input terminal CI receives, from its preceding unit, a SEND signal (transfer request signal) in a pulse form. A transfer enabling output terminal RO outputs an ACK signal (transfer enabling signal) to the preceding unit. A pulse output terminal CO outputs the SEND signal in a pulse form to its subsequent unit, and a transfer enabling input terminal RI receives the ACK signal from the subsequent unit.
A master reset input terminal MR receives a master reset signal. When a pulse of an H level is applied to master reset input terminal MR (not shown in FIG. 9), it is inverted by an inverter 40e, and the inverted signal resets flip-flops 40a and 40b to initialize the C element. Pulse output terminal CO and transfer enabling output terminal RO each output a signal of an H level as its initialized state. The output of transfer enabling output terminal RO at an H level means that it permits data transfer, while the output at an L level means that it prohibits the data transfer. Further, the output of pulse output terminal CO at an H level means that it is not requesting data transfer to the subsequent stage, whereas the output at an L level means that it requests the data transfer, or data is now being transferred, to the subsequent stage.
When a signal of an L level is input to pulse input terminal CI, i.e., when the preceding stage requests the data transfer, flip-flop 40a is set and outputs a signal of an H level from its output Q. This H-level signal is inverted by inverter 40d, and a signal of an L level is output from transfer enabling output terminal RO, thereby prohibiting further data transfer. After a certain period of time, a signal of an H level is input to pulse input terminal CI, and setting of data from the preceding unit to the relevant C element is terminated. In this state, if a signal of an H level is being input from transfer enabling input terminal RI, indicating that the subsequent unit is permitting the data transfer, and also if a signal of an H level is being output from pulse output terminal CO, indicating that data is not being transferred to the subsequent unit (or it is not requesting data transfer to the subsequent unit), then a NAND gate 40c becomes active and outputs a signal of an L level.
As a result, flip-flops 40a and 40b are both reset. Flip-flop 40b outputs a signal of an H level, via a delay element 40e, from a pulse output terminal CP to the corresponding pipeline register. It also outputs the SEND signal of an L level, via a delay element 40j, from pulse output terminal CO to the subsequent C element, requesting data transfer to the subsequent unit. The subsequent C element, in response to reception of this SEND signal of the L level, outputs from its RO terminal the ACK signal of an L level to prohibit further data transfer thereto. The C element receives this ACK signal of the L level from transfer enabling input terminal RI. This signal causes flip-flop 40b to be set. As a result, a signal of an L level is output, via delay element 40e, from pulse output terminal CP to the pipeline register, and the SEND signal of an H level is output, via delay element 40j, from pulse output terminal CO to the subsequent unit. Thus, the data transfer is terminated.
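The terminal-level behavior of the C element described above may be approximated by the following Python sketch. It is a behavioral model only: the class `CElement`, its method names, and the flag `holding` are hypothetical, the flip-flops, NAND gate and delay elements are not modeled individually, and the idle level of CP (taken here as L) is an assumption inferred from the description.

```python
H, L = 1, 0


class CElement:
    """Behavioral sketch of the handshake at terminals CI, RO, CO, RI and CP."""

    def __init__(self) -> None:
        self.ci = H   # input: SEND from the preceding unit (idle = H)
        self.ri = H   # input: ACK from the subsequent unit (H = permitted)
        self.master_reset()

    def master_reset(self) -> None:
        self.ro = H           # permits transfer from the preceding unit
        self.co = H           # not requesting transfer to the subsequent unit
        self.cp = L           # assumed idle level of the register pulse
        self.holding = False  # True once a request has been accepted

    def set_ci(self, level: int) -> None:
        self.ci = level
        if level == L:        # preceding unit requests a transfer
            self.holding = True
            self.ro = L       # prohibit further transfer
        self._try_fire()

    def set_ri(self, level: int) -> None:
        self.ri = level
        if level == L and self.co == L:
            # The subsequent unit has accepted: end the outgoing pulses.
            self.cp = L
            self.co = H
        self._try_fire()

    def _try_fire(self) -> None:
        # Condition under which the NAND gate becomes active.
        if self.holding and self.ci == H and self.ri == H and self.co == H:
            self.holding = False
            self.ro = H       # ready to accept the next packet
            self.cp = H       # latch pulse to the pipeline register
            self.co = L       # SEND to the subsequent unit


c = CElement()
c.set_ci(L)   # request arrives
c.set_ci(H)   # request pulse ends; transfer to the next stage starts
c.set_ri(L)   # next stage acknowledges; transfer terminates
print(c.ro, c.co, c.cp)
```

If RI is at the L level when the input pulse ends, the model keeps RO at L and CO at H until RI returns to H, which corresponds to the case in which the subsequent unit is still retaining data.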
FIG. 12 is a schematic block diagram showing a conventional data driven processor incorporating the data transfer control circuit shown in FIG. 11. Referring to FIG. 12, the data driven processor Pe includes a junction unit JNC, a firing control unit FC, an operation unit FP, a program storage unit PS, a branch unit BRN, a plurality of pipeline registers 4a–4c, and a plurality of C elements 2a–2c. C elements 2a–2c each control packet transfer for its corresponding processing unit (FC, FP or PS) by sending/receiving packet transfer pulses (signals on CI, CO, RI, RO) to/from the preceding and subsequent C elements. Each pipeline register 4a–4c, in response to a pulse input from its corresponding C element 2a–2c, takes in and retains data received from the preceding processing unit, and delivers the data to its output stage. The data is retained until arrival of a next pulse.
Referring to FIG. 12, assume that processor Pe receives a data packet as shown in FIG. 8. The input packet is first passed through junction unit JNC to firing control unit FC, where packets are paired based on their destination node numbers ND# and generation numbers GN#. More specifically, two data packets with the same node number ND# and generation number GN# but having different data are detected, the data in one of the two data packets is additionally stored in the data field F4 (FIG. 8) of the other data packet, and the resultant other data packet is output. This data packet having the paired data (a set of data) stored in its data field F4 is then transmitted to operation unit FP. Operation unit FP receives the transmitted data packet, performs a prescribed operation on the content of the packet based on the instruction code OPC within the packet, and stores the operation result in data field F4 of the packet. The packet is then transmitted to program storage unit PS.
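The pairing performed in firing control unit FC may be sketched as follows. The function name `fire`, the dictionary-based packet representation, and the `waiting` table are assumptions for illustration; in particular, which of the two packets survives is not specified above, and the sketch arbitrarily keeps the one that arrived first.

```python
from typing import Dict, Optional, Tuple

Packet = Dict[str, object]


def fire(waiting: Dict[Tuple[int, int], Packet],
         packet: Packet) -> Optional[Packet]:
    """Pair packets that share ND# and GN#.

    Returns the merged packet when a partner is already waiting,
    otherwise stores the packet and returns None.
    """
    key = (packet["nd"], packet["gn"])
    partner = waiting.pop(key, None)
    if partner is None:
        waiting[key] = packet   # wait for the matching packet
        return None
    # Store the data of the new packet in field F4 of the waiting packet.
    return {**partner, "data": partner["data"] + packet["data"]}


waiting: Dict[Tuple[int, int], Packet] = {}
left = {"nd": 3, "gn": 0, "opc": "ADD", "data": (5,)}
right = {"nd": 3, "gn": 0, "opc": "ADD", "data": (7,)}
print(fire(waiting, left))    # no partner yet
print(fire(waiting, right))   # partner found: paired data is output
```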
Program storage unit PS receives the transmitted data packet and, based on destination node number ND# of the input packet, reads out a higher level destination node number ND#, a higher level instruction code OPC and a copy flag CPY from a program memory provided within program storage unit PS. The destination node number ND# and instruction code OPC thus read out are stored respectively in destination node number field F1 and instruction code field F3 of the input packet. Further, if the copy flag CPY read out is “1”, it is determined that a higher level address within the program memory is also valid, and thus, a packet storing destination node number ND# and instruction code OPC stored in the higher level address is also generated.
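The fetch operation of program storage unit PS may be sketched as follows. The function name `fetch` and the memory layout are hypothetical; in particular, the sketch assumes that the "higher level address" is the address immediately following the one selected by ND#, which is not stated above.

```python
from typing import Dict, List, Tuple

Packet = Dict[str, object]
# Each entry: (destination node number ND#, instruction code OPC, copy flag CPY)
Program = Dict[int, Tuple[int, str, int]]


def fetch(program: Program, packet: Packet) -> List[Packet]:
    """Rewrite fields F1 and F3 of the packet; emit a copy when CPY is 1."""
    addr = packet["nd"]
    nd, opc, cpy = program[addr]
    out = [{**packet, "nd": nd, "opc": opc}]
    if cpy == 1:
        # Assumption: the additional valid entry sits at addr + 1.
        nd2, opc2, _ = program[addr + 1]
        out.append({**packet, "nd": nd2, "opc": opc2})
    return out


program: Program = {
    0: (4, "ADD", 1),   # CPY = 1: the entry at address 1 is also valid
    1: (7, "MUL", 0),
    2: (9, "SUB", 0),
}
result = fetch(program, {"nd": 0, "gn": 0, "opc": "ADD", "data": (12,)})
print(result)   # two packets carrying the same data
```

Under this model a single pass through program storage unit PS yields at most two packets carrying the same data.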
The packet output from program storage unit PS is transmitted to branch unit BRN, from which it is either output based on its destination node number ND# or returned into the processor again. If three copies of the identical data are required, the packet returned into the processor is subjected to a copying process. That is, to make several copies of the identical data, the packet must be returned into the processor several times to repeat the copying process.
In the above-described data driven information processing apparatus, when a data flow program does not work as expected due to some failure, a debug device is desired to efficiently discover the cause of the error, which may lie at the hardware level of the apparatus itself or at the software level of the data flow program being executed thereon.
Such a debug device for a data driven information processing apparatus is disclosed, e.g., in Japanese Patent Laying-Open No. 5-151370 entitled “Data Driven Type Computer”. The debug device disclosed therein is capable of locally suspending and then resuming execution of a data flow program at a node corresponding to a designated line of a source program in order to understand the execution state of the program.
This debug device utilizes program converting means for adding a NOP instruction (an instruction that circulates a packet without performing any operation) to a node corresponding to an end point of each line of the source program. When a line at which processing of the source program is to be stopped is designated, the program converting means outputs the data flow program with the corresponding NOP instruction changed to an output instruction. The debug device is provided with a breaking function to temporarily stop the program at a designated point of the program, perform a certain operation (e.g., reading the content of the memory) on the data driven type computer, and restart the program from the stopped point. Thus, it is capable of acquiring state information within the data driven type computer (e.g., the content of the memory) at the stop of the program and resuming the execution of the program from the same state as at the time of the stop.
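The program conversion performed by such a debug device may be sketched as follows. The function name `instrument`, the mnemonics `NOP` and `OUTPUT`, and the representation of a data flow program as a flat list of instructions per source line are assumptions for illustration; an actual data flow program is a graph of nodes rather than a list.

```python
from typing import List, Optional


def instrument(lines: List[List[str]],
               break_line: Optional[int] = None) -> List[str]:
    """Append one NOP per source line; turn it into OUTPUT at the break line."""
    program: List[str] = []
    for lineno, instructions in enumerate(lines, start=1):
        program.extend(instructions)
        # The node added at the end point of each line of the source program.
        program.append("OUTPUT" if lineno == break_line else "NOP")
    return program


source = [["ADD"], ["MUL", "SUB"]]
print(instrument(source))                 # every line gains a NOP
print(instrument(source, break_line=2))   # line 2 stops at an output instruction
```

Note that the converted program contains one additional instruction for every source line, whether or not a stop is designated at that line.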
To implement the debug function as described above, however, NOP instructions must be added to the respective end points of all the lines within the source program. Consequently, a number of wasteful cyclic packets are generated corresponding to the number of NOP instructions added, which decreases the processing speed of the program. Furthermore, the addition of nodes corresponding to the NOP instructions changes, over time, both the order of arrival of the data packets at a queue space for firing control unit FC and the degree of congestion of the data packets on a circular pipeline, thereby hindering reproduction of the expected operation of the source program.