1. Field of the Invention
The present invention relates to an inter-device coupler which arbitrates data transfer between devices operating at different speeds, and more particularly relates to a coupler of an arithmetic logic unit (ALU) and a memory for use where the memory is closely coupled with the ALU.
2. Description of the Related Art
In a processor in which an ALU and a memory are closely coupled through a coupler, the write and read speed of the memory often becomes a bottleneck.
Therefore, in the related art, the practice has been to make the operating speed of the ALU twice the memory access speed, provide a coupler between the ALU and memory, perform serial-parallel conversion and parallel-serial conversion, and prevent a fall in the bandwidth even if the access speed of the memory is low.
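For illustration only, the serial-parallel and parallel-serial conversion described above can be sketched in software as follows. The function names and the round-robin lane assignment are assumptions made for this sketch and are not taken from the related-art circuit; the sketch also assumes the stream length is a multiple of the number of lanes.

```python
def serial_to_parallel(stream, n=2):
    """Split one fast serial stream into n slower lanes, round-robin.

    Each lane carries 1/n of the words, so a memory running at 1/n of
    the ALU speed can keep up with its lane.
    """
    return [stream[i::n] for i in range(n)]


def parallel_to_serial(lanes):
    """Interleave the lanes back into a single fast stream."""
    out = []
    for group in zip(*lanes):  # one word from every lane per slow cycle
        out.extend(group)
    return out


alu_out = ["n0", "n1", "n2", "n3", "n4", "n5"]
lanes = serial_to_parallel(alu_out)     # two half-speed streams
restored = parallel_to_serial(lanes)    # original order recovered
```

With two lanes, each lane handles one word per memory cycle while the ALU handles two words in the same time, so the aggregate bandwidth is preserved in this model.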
FIG. 1 is a circuit diagram of an example of the configuration of a coupler of an ALU and a memory of the related art.
The coupler 10 comprises, as shown in FIG. 1, positive-edge D-type flip-flops 11, 12, 13, 14, and 19, memories 15 and 16, negative-edge D-type flip-flops 17 and 18, and a two-input one-output selector 20. Reference number 21 indicates the ALU.
In the coupler 10, a clock signal CK1 is supplied to the flip-flops 11, 12, and 19, while a clock signal CK2 is supplied to the flip-flops 13, 14, 17, and 18.
An output signal ALUOT of the ALU 21 is supplied to an input D of the flip-flop 11. An output signal OTD1 from the output Q of the flip-flop 11 is supplied to inputs of the flip-flops 12 and 13, respectively, while an output signal OTD0 from an output Q of the flip-flop 12 is supplied to an input D of the flip-flop 14.
The output signals OT1 and OT0 from the outputs Q of the flip-flops 13 and 14 are respectively supplied to write ports of the memories 15 and 16, while read signals IN1 and IN0 from read ports of the memories 15 and 16 are supplied to inputs D of the flip-flops 17 and 18.
An output signal IND0 from an output Q of the flip-flop 18 is supplied to a port A of the selector 20, an output signal IND1 from an output Q of the flip-flop 17 is supplied to an input D of the flip-flop 19, and an output signal IND2 from an output Q of the flip-flop 19 is supplied to a port B of the selector 20. An output signal from the selector 20 becomes an input signal ALUIN of the ALU 21. The selection signal of the selector 20 is designated OSEL.
Assuming that the clock signal CK1 is the normal clock signal, the clock signal CK2 is a clock obtained by frequency division of the clock signal CK1.
The memories 15 and 16 are written into at the rising edge of the clock signal CK2 and read from at the trailing edge of the clock signal CK2. Due to the nature of the memories, it takes three cycles of the clock signal CK1 from writing to reading.
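The relationship between the two clocks can be sketched as below. This is a simplified model in units of CK1 cycles, assuming a division ratio of two (consistent with the ALU operating at twice the memory access speed) and an arbitrary phase; the actual phase relationship is the one given in FIGS. 2A and 2B, which this sketch does not reproduce. The names `ck2_level` and `edges` are hypothetical.

```python
def ck2_level(t, ratio=2):
    """Level of CK2 during CK1 cycle t (assumed high for the first half period)."""
    return 1 if (t % ratio) < ratio // 2 else 0


def edges(levels):
    """Return the CK1 cycles at which the waveform rises and falls."""
    rising = [t for t in range(1, len(levels))
              if levels[t - 1] == 0 and levels[t] == 1]
    falling = [t for t in range(1, len(levels))
               if levels[t - 1] == 1 and levels[t] == 0]
    return rising, falling


levels = [ck2_level(t) for t in range(8)]
rising, falling = edges(levels)      # write edges and read edges of CK2

WRITE_TO_READ = 3                    # CK1 cycles, as stated in the text
read_times = [w + WRITE_TO_READ for w in rising]
```

In this model a write at a rising edge of CK2 is followed three CK1 cycles later by a trailing edge of CK2, which agrees with the stated write-to-read interval.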
Next, an operation of the coupler 10 of an ALU and a memory of the related art will be explained with reference to timing charts of FIGS. 2A to 2N.
FIGS. 2A to 2N are timing charts for the case of immediately reading data written in a memory and transferring it to the ALU.
First, as shown in FIG. 2C, data streams n0, n1, n2, n3 . . . are output as a signal ALUOT from the ALU 21.
As shown in FIGS. 2D and 2E, the data n0, n1, n2, n3 . . . are output from the flip-flops 11 and 12 delayed by one cycle and two cycles of the clock signal CK1, respectively.
At this time, since the phase relationship of the clock signals CK1 and CK2 is set as shown in FIGS. 2A and 2B, outputs from the flip-flops 13 and 14 are delayed by four cycles from the input of n0 to the flip-flop 11.
The data is written in the memories 15 and 16 and sent to the flip-flops 17 and 18 after three cycles.
An output of the flip-flop 17 is delayed exactly by one cycle in the flip-flop 19 and output to the port B of the selector 20.
By changing the selection signal OSEL of the selector 20 at the timing shown in FIG. 2M, the data n0, n1, n2 . . . is output from the selector 20 and input to the ALU 21.
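The read-side reassembly by the selector 20 can be sketched at the level of CK1 samples as follows. This is an assumed idealization: the alternating OSEL pattern stands in for the timing of FIG. 2M, the flip-flop 19 is modeled as a plain one-cycle delay, and the helper names `hold` and `delay_one` are hypothetical.

```python
def hold(words, ratio=2):
    """Expand a CK2-rate stream into CK1-rate samples (each word held `ratio` cycles)."""
    return [w for w in words for _ in range(ratio)]


def delay_one(samples, fill=None):
    """Model of flip-flop 19: delay a CK1-rate stream by one CK1 cycle."""
    return [fill] + samples[:-1]


ind0 = hold(["n0", "n2"])        # output of flip-flop 18, to port A
ind1 = hold(["n1", "n3"])        # output of flip-flop 17
ind2 = delay_one(ind1)           # output of flip-flop 19, to port B

osel = ["A", "B", "A", "B"]      # assumed alternation, one choice per CK1 cycle
aluin = [a if s == "A" else b for s, a, b in zip(osel, ind0, ind2)]
```

Under these assumptions the selector output restores the original order n0, n1, n2, n3 at the CK1 rate.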
Summarizing the problem to be solved by the invention, in the coupler 10 of the related art, a delay of 7 cycles was required between writing data of the ALU 21 in a memory and reading it again from the memory.
Accordingly, in the coupler 10 of an ALU and memory of the related art, the delay becomes long when temporarily writing output data from the ALU 21 and using the same immediately after the writing. There is a period when no computations are possible until the read data becomes usable in the ALU 21.
The reason why the delay becomes 7 cycles is that two cycles are needed for switching the clock from CK1 to CK2, three cycles for writing to and reading from the memory, and two cycles for switching the clock from CK2 to CK1.
One method of solving this problem is to provide more registers inside the ALU, but then the connections to the registers and the control circuit both become complex, and furthermore the power consumption of the clock system increases because the registers must always be operated by the clock signal CK1.
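The breakdown of the 7-cycle delay can be written out as a small budget. The cycle counts are those stated in the text; the dictionary keys and function names are labels chosen for this sketch only.

```python
# Cycle counts as stated in the text, in cycles of the clock signal CK1.
RELATED_ART_DELAY = {
    "switch clock CK1 -> CK2": 2,
    "memory write then read": 3,
    "switch clock CK2 -> CK1": 2,
}


def total_delay(budget):
    """Sum the per-stage delays of a write-then-read round trip."""
    return sum(budget.values())


def ready_cycle(output_cycle, budget=RELATED_ART_DELAY):
    """CK1 cycle at which data output by the ALU at output_cycle is usable again."""
    return output_cycle + total_delay(budget)
```

For example, under this budget data output by the ALU at CK1 cycle 10 would not be available as ALU input before cycle 17.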
An object of the present invention is to provide an inter-device coupler capable of giving any delay to output data of an ALU and outputting the result as input data of the ALU with a simple configuration and without increasing the power consumption.
To attain the above object, according to a first aspect of the present invention, there is provided an inter-device coupler for arbitrating data transfer between an ALU and a memory operating at different speeds, comprising a first input circuit operating at the same speed as the ALU and receiving as input and outputting output data of the ALU; a second input circuit operating at the same speed as the memory and receiving as input and outputting output data of the first input circuit to the memory; a first output circuit operating at the same speed as the memory and receiving as input and outputting read data of the memory; a second output circuit operating at the same speed as the ALU and outputting input data to the ALU; and a path selector for inputting at least one of the output data of the first input circuit or the output data of the first output circuit to the second output circuit in accordance with a value of a path selection signal.
Preferably, the path selector inputs the output data of the first input circuit, the output data of the second input circuit, or the output data of the first output circuit to the second output circuit in accordance with the value of the path selection signal.
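As a behavioral sketch only, the path selector of the first aspect can be thought of as a multiplexer over the three candidate data sources. The encoding of the path selection signal below (the `Path` enumeration and its values) is a hypothetical choice for illustration; the text does not specify any particular encoding.

```python
from enum import Enum


class Path(Enum):
    """Hypothetical values of the path selection signal."""
    FIRST_INPUT = 0    # output data of the first input circuit
    SECOND_INPUT = 1   # output data of the second input circuit
    FIRST_OUTPUT = 2   # output data of the first output circuit (read data)


def path_selector(psel, first_in, second_in, first_out):
    """Return the data routed to the second output circuit for a given selection."""
    sources = {
        Path.FIRST_INPUT: first_in,
        Path.SECOND_INPUT: second_in,
        Path.FIRST_OUTPUT: first_out,
    }
    return sources[psel]
```

Selecting `Path.FIRST_INPUT` or `Path.SECOND_INPUT` in this model corresponds to returning ALU output data toward the ALU input without waiting for the memory read, while `Path.FIRST_OUTPUT` corresponds to the ordinary path through the memory.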
According to a second aspect of the present invention, there is provided an inter-device coupler for arbitrating data transfer between an ALU and a memory operating at different speeds, comprising a first input circuit operating at the same speed as the ALU and receiving as input and outputting output data of the ALU; a second input circuit operating at the same speed as the memory and receiving as input and outputting output data of the first input circuit to the memory; a first output circuit operating at the same speed as the memory and receiving as input and outputting read data of the memory; a selection circuit operating at the same speed as the memory, comprising a first input and a second input, and selecting an input signal for the first input or an input signal for the second input and outputting the same to the ALU in accordance with an output selection signal; a second output circuit operating at the same speed as the ALU and outputting input data to the ALU; and a path selector for inputting at least one of the output data of the first input circuit or the output data of the first output circuit to the first input of the selection circuit or the second output circuit in accordance with a value of a path selection signal.
Preferably, the path selector inputs the output data of the first input circuit, the output data of the second input circuit, or the output data of the first output circuit to the first input of the selection circuit or the second output circuit in accordance with the value of the path selection signal.
According to a third aspect of the present invention, there is provided an inter-device coupler for arbitrating data transfer between apparatuses operating at different speeds, comprising n number (n is an integer of 2 or more) of memories; an ALU operating at a speed n times that of the memories; n number of first input circuits operating at the same speed as the ALU, having cascade connected inputs and outputs, and receiving as input and successively transferring the output data of the ALU; n number of second input circuits operating at the same speed as the memories, provided corresponding to the n number of first input circuits, receiving as input output data of corresponding first input circuits, and outputting the same to corresponding memories; n number of first output circuits operating at the same speed as the memories and receiving as input and outputting read data of corresponding memories; a second output circuit operating at the same speed as the ALU and outputting input data to the ALU; and a path selector for inputting at least one of the output data of the initial said first input circuit or the output data of the n number of first output circuits to the second output circuit in accordance with a value of a path selection signal.
Preferably, the path selector inputs any one of the output data of the initial said first input circuit among the n number of first input circuits, the output data of the n number of second input circuits, or the output data of the n number of first output circuits to the second output circuit in accordance with a value of the path selection signal.
According to a fourth aspect of the present invention, there is provided an inter-device coupler for arbitrating data transfer between apparatuses operating at different speeds, comprising n number (n is an integer of 2 or more) of memories; an ALU operating at a speed of n times that of the memories; n number of first input circuits operating at the same speed as the ALU, having cascade-connected inputs and outputs, and receiving as input and successively transferring the output data of the ALU; n number of second input circuits operating at the same speed as the memories, provided corresponding to the n number of first input circuits, receiving as input output data of corresponding first input circuits, and outputting the same to the corresponding memories; n number of first output circuits operating at the same speed as the memories and receiving as input and outputting read data of corresponding memories; a selection circuit operating at the same speed as the ALU, comprising a first input and a second input, and selecting an input signal for the first input or an input signal for the second input and outputting the same to the ALU in accordance with an output selection signal; a second output circuit operating at the same speed as the ALU for outputting input data to the second input of the selection circuit; and a path selector for inputting at least one of the output data of an initial said first input circuit or the output data of the n number of first output circuits to the first input of the selection circuit or the second output circuit in accordance with a value of a path selection signal.
Preferably, the path selector inputs any one of the output data of the initial said first input circuit, the output data of the n number of second input circuits, or the output data of the n number of first output circuits to the first input of the selection circuit or the second output circuit in accordance with a value of a path selection signal.
Alternatively, the path selector gives a delay of a predetermined number of cycles of the memory to the input data and outputs the result to the first input of the selection circuit and the second output circuit in accordance with a value of the path selection signal.
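The delay of a predetermined number of memory cycles can be modeled, purely as an illustration, by a fixed-length delay line that is stepped once per memory cycle. The class name `DelayLine` and the use of `None` for not-yet-valid outputs are assumptions of this sketch.

```python
from collections import deque


class DelayLine:
    """Delay a data stream by a fixed number of memory cycles."""

    def __init__(self, cycles, fill=None):
        self._cycles = cycles
        self._buf = deque([fill] * cycles)

    def step(self, value):
        """Advance one memory cycle: accept `value`, return the delayed word."""
        if self._cycles == 0:
            return value              # zero delay: data passes as it is
        self._buf.append(value)
        return self._buf.popleft()


line = DelayLine(cycles=2)
delayed = [line.step(w) for w in ["n0", "n1", "n2", "n3"]]
```

Here `delayed` contains two invalid entries followed by n0 and n1, that is, the input shifted by two memory cycles; a delay of zero reproduces the input unchanged.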
According to the present invention, the output data of an ALU is input to a first input circuit operating at the same speed as the ALU and is output both to a second input circuit operating at the same speed as the memory and to a path selector.
The second input circuit fetches the output data of the first input circuit and outputs it to, for example, the memory and the path selector.
As a result, the data is stored in the memory. Then, the data stored in the memory is read at a predetermined cycle and output to the first output circuit operating at the same speed as the memory.
The first output circuit fetches the data read from the memory and outputs it to the path selector.
The path selector is supplied with a path selection signal and selectively connects one path of data among the output data of the first input circuit, the output data of the second input circuit, and the output data of the first output circuit to a path to an input to the second output circuit in accordance with the value of the signal.
The selected data is input, either as it is or delayed by a predetermined number of cycles of the memory, to the second output circuit operating at the same speed as the ALU and is output to the ALU.
Alternatively, when there is a selection circuit, the path selector is supplied with a path selection signal and selectively connects one data path from among the output data of the first input circuit, the output data of the second input circuit, and the output data of the first output circuit to a path to the first input of the selection circuit or the input of the second output circuit in accordance with the value of the signal.
Then, the selected data is input, either as it is or delayed by a predetermined number of cycles of the memory, to the selection circuit, directly or via the second output circuit operating at the same speed as the ALU, and is output to the ALU.
Alternatively, when the ALU operates at a speed n times that of the memory, the output data of the ALU is input to an initial first input circuit operating at the same speed as the ALU and successively output to later first input circuits.
The input data is output from the initial first input circuit both to the corresponding second input circuit, which operates at the same speed as the corresponding memory, and to the path selector.
Also, the output data of each first input circuit from the second one onward is output to the corresponding second input circuit operating at the same speed as the memories.
The n number of second input circuits fetch the output data of the corresponding first input circuits and output the same to, for example, the memories and the path selector.
As a result, the data is stored in the memories. Then, the data stored in the n number of memories is read at predetermined cycles and output to the first output circuits operating at the same speed as the memories.
The n number of first output circuits fetch data read from the memories and output the same to the path selector.
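The cascade of n first input circuits and the parallel capture by the n second input circuits can be sketched as a shift register sampled once every n ALU cycles. The function name `cascade_capture`, the capture phase (every n-th ALU cycle), and the use of `None` for empty stages are assumptions of this sketch rather than details given in the text.

```python
def cascade_capture(stream, n=2):
    """Shift ALU output through n cascaded stages at ALU speed and latch
    all stages in parallel once per memory cycle.

    Returns one tuple per memory cycle; element i of a tuple is the word
    captured from the i-th first input circuit (index 0 is the initial one).
    """
    stages = [None] * n
    captured = []
    for cycle, word in enumerate(stream, start=1):
        stages = [word] + stages[:-1]      # cascade shift at ALU speed
        if cycle % n == 0:                 # memory-speed capture (phase assumed)
            captured.append(tuple(stages))
    return captured


words_n2 = cascade_capture(["n0", "n1", "n2", "n3"], n=2)
words_n3 = cascade_capture(["n0", "n1", "n2", "n3", "n4", "n5"], n=3)
```

In this model the initial first input circuit always holds the most recent ALU output at capture time, so each memory receives every n-th word of the ALU output stream.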
The path selector is supplied with a path selection signal and selectively connects one data path among the output data of the initial first input circuit, the output data of n number of second input circuits, and the output data of n number of first output circuits to a path to the input of the second output circuit in accordance with a value of the signal.
Then, the selected data is input delayed by a predetermined number of cycles of the memory to the second output circuit operating at the same speed as the ALU and output to the ALU.
Alternatively, when there is a selection circuit, the path selector is supplied with a path selection signal and selectively connects one data path among the output data of the initial first input circuit, the output data of n number of second input circuits, and the output data of n number of first output circuits to a path to the first input of the selection circuit or the input of the second output circuit in accordance with the value of the path selection signal.
Then, the selected data is input, delayed by a predetermined number of cycles of the memory, to the selection circuit, directly or via the second output circuit operating at the same speed as the ALU, and is output to the ALU.