1. Field of the Invention
The present invention relates to a pipeline stop circuit for an external memory access, and in particular to an improved pipeline stop circuit for an external memory access which is capable of more effectively performing a pipeline operation when accessing an external memory or a slow internal memory.
2. Description of the Background Art
Generally, the pipeline in a processor is a bus operation process during the execution of an instruction. For example, the pipeline is implemented in the following four steps: "Instruction fetch → Decode → Operand fetch → Execution."
As shown in FIGS. 1A-1D, after a first instruction INT1 is fetched, a new second instruction INT2 is fetched while the first instruction INT1 is being decoded, and the second instruction INT2 is decoded while the first instruction INT1 is operand-fetched (control signal generation). When the first instruction INT1 is executed, the second instruction INT2 is operand-fetched. At this time, a third instruction INT3 is decoded, and a new fourth instruction INT4 is fetched. Therefore, once the pipeline is full, one instruction is executed in each clock cycle.
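The overlap described above can be illustrated with a minimal sketch, assuming an ideal four-step pipeline in which every step takes exactly one clock cycle and no stall occurs. The names `STAGES`, `pipeline_schedule`, and `total_cycles` are hypothetical and are used for illustration only; they do not appear in the drawings.

```python
# Illustrative model only: an ideal four-step pipeline with no stalls.
# All names here are hypothetical and are not taken from the figures.
STAGES = ["Instruction fetch", "Decode", "Operand fetch", "Execution"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return {cycle: {stage: instruction number}} for a stall-free pipeline."""
    schedule = {}
    for instr in range(1, num_instructions + 1):
        for offset, stage in enumerate(stages):
            # Instruction k occupies stage index s during cycle k + s.
            cycle = instr + offset
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles needed to finish all instructions when one issues per cycle."""
    return num_instructions + num_stages - 1


if __name__ == "__main__":
    schedule = pipeline_schedule(4)
    for cycle in sorted(schedule):
        print(cycle, schedule[cycle])
```

Under these assumptions, the fourth cycle holds INT1 in Execution, INT2 in Operand fetch, INT3 in Decode, and INT4 in Instruction fetch, which corresponds to the state described above.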
As shown in FIG. 2, the processor performing the four-step pipeline includes an instruction decoding unit 10 for decoding a program from a ROM 13 and outputting a control signal for a pipeline operation, a program counter 11 for accessing the ROM 13 in accordance with a control of the instruction decoding unit 10, a memory access unit 12 for receiving a ROM address ROM_ADD from the program counter 11 and accessing the RAM 14 in accordance with a data memory address from the memory access unit 12 and a control from the instruction decoding unit 10, and an arithmetic operation unit 15 for computing output data from the RAM 14 based on the program from the ROM 13 in accordance with a control of the instruction decoding unit 10.
The operation of the conventional processor will now be explained with reference to the accompanying drawings FIGS. 3A-3G.
The program counter 11 generates a ROM address ROM_ADD, as shown in FIGS. 3A and 3B, at every cycle of an external clock signal CLK, and the ROM 13 loads the program shown in FIG. 3C onto the program bus PBUS in accordance with the ROM address ROM_ADD.
The memory access unit 12 accesses the data of the RAM 14 in accordance with the ROM address ROM_ADD from the program counter 11 and the data memory address from the memory access unit 12, and outputs the data to the arithmetic operation unit 15. As a result, the arithmetic operation unit 15 computes the output data from the RAM 14 based on the program outputted from the ROM 13 in accordance with a control of the instruction decoding unit 10.
Namely, the instruction decoding unit 10 receives the program from the program bus PBUS as shown by FIGS. 3C and 3D during a first cycle t1 of the clock signal CLK and fetches the first instruction. In addition, the instruction decoding unit 10 synchronizes the fetched first instruction as shown by FIGS. 3D and 3E to the clock signal CLK, thus decoding the same and fetching the second instruction during a second cycle t2 of the clock signal CLK.
In addition, during a third cycle t3 of the clock signal CLK, the instruction decoding unit 10 decodes the decoding signal of the first instruction and outputs a control signal. The fetched second instruction is decoded, and the third instruction is fetched.
The instruction decoding unit 10 generates a control signal with respect to the second instruction during a fourth cycle t4 of the clock signal CLK, decodes the fetched third instruction, and fetches the fourth instruction. The arithmetic operation unit 15 executes a computation process for computing output data from the RAM 14 based on the program of the ROM 13 in accordance with the control signal with respect to the first instruction from the instruction decoding unit 10. At this time, the ROM 13 and RAM 14 of the processor operate fast enough that each access is completed within its step of the pipeline.
However, an external memory cannot be accessed as fast as the internal ROM 13 and RAM 14, and an access to it cannot be completed within each step of the pipeline due to the delay time between the processor and the external memory. Even when an internal memory is used, if its operating speed is slow, the access time is increased.
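The effect of a slow memory on the pipeline can be estimated with a minimal sketch, assuming a simplified in-order model in which each step has a fixed latency in clock cycles and an instruction may enter a step only after it has left the previous step and the preceding instruction has left the same step. The function name `completion_cycles` and the example latencies are hypothetical values chosen for illustration; they are not measurements of any actual memory.

```python
# Illustrative model only: fixed per-step latencies, in-order flow.
# The latencies below are assumed values, not measured ones.
def completion_cycles(num_instructions, stage_latencies):
    """Return the cycle at which the last instruction leaves the last step."""
    # prev_finish[s] is when the preceding instruction left step s.
    prev_finish = [0] * len(stage_latencies)
    for _ in range(num_instructions):
        finish = []
        ready = 0  # time at which this instruction left the previous step
        for s, latency in enumerate(stage_latencies):
            # Wait for both the previous step and the preceding instruction.
            start = max(ready, prev_finish[s])
            ready = start + latency
            finish.append(ready)
        prev_finish = finish
    return prev_finish[-1]


if __name__ == "__main__":
    fast = completion_cycles(4, [1, 1, 1, 1])  # fast internal ROM and RAM
    slow = completion_cycles(4, [3, 1, 1, 1])  # assumed 3-cycle fetch
    print(fast, slow)
```

In this model, four instructions finish in 7 cycles when every step takes one cycle, but in 15 cycles when the instruction fetch alone is assumed to take three cycles, since the slow step limits the rate at which instructions can enter the pipeline.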
In addition, when the processor simultaneously reads or writes both an external program memory and an external data memory through one port, the program memory and the data memory cannot output the program and the data at the same time. Namely, the program and data are outputted to the processor in sequence, either program memory → data memory or data memory → program memory, so that the access time is increased.
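The increase in access time caused by the single port can be expressed with a minimal sketch, assuming that accesses through a shared port are strictly serialized and that accesses through separate ports fully overlap. The function name `access_cycles` and the cycle counts used are hypothetical and serve only as an illustration.

```python
# Illustrative model only: shared-port accesses are assumed to be strictly
# serialized, and separate-port accesses are assumed to fully overlap.
def access_cycles(program_cycles, data_cycles, shared_port):
    """Return the cycles needed to obtain both the program and the data."""
    if shared_port:
        # One port: the two memories respond one after the other.
        return program_cycles + data_cycles
    # Separate ports: the slower of the two accesses sets the time.
    return max(program_cycles, data_cycles)


if __name__ == "__main__":
    print(access_cycles(2, 2, shared_port=False))
    print(access_cycles(2, 2, shared_port=True))
```

Under these assumptions the order of the two accesses does not change the total; only the fact that they cannot overlap does.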
Therefore, the conventional processor has a disadvantage in that the pipeline operation is not properly executed due to the delayed access time when an external memory or a slow internal memory is used.