A central processing unit (CPU) that performs pipeline processing is known (for example, COMPUTER ORGANIZATION & DESIGN, 1999, pp 430 to 433, by David A. Patterson; John L. Hennessy). In this CPU, the function of reading instructions from a memory and executing them is divided into a plurality of stages. Operations of different stages are executed in parallel, so that a plurality of instruction processing cycles are partially executed with overlapping timing.
FIG. 5A is a schematic illustration of a computer system 101 comprising a CPU 102 which is connected to a ROM 104, a RAM (not illustrated), and other peripheral devices via an address bus 106 and a data bus 107.
Here, the CPU 102 will be explained as an example of a CPU which can execute ordinary pipeline processes of five stages. In this CPU 102, a process function is divided into five stages of IF (instruction fetching operation: fetch operation), DEC (instruction decoding), EXE (instruction executing), MA (memory accessing), and WB (writing back) and the pipeline process is executed in such a parallel relationship as illustrated in FIG. 5B.
Namely, in the IF stage, an instruction is read from a memory (mainly, the ROM) in which programs are stored in advance, while in the DEC stage, the instruction fetched in the IF stage is decoded. In the EXE stage, arithmetic operations on registers are executed based on the contents decoded in the DEC stage. Moreover, in the MA stage, a memory such as the ROM or RAM is accessed (read or written) based on the contents decoded in the DEC stage, using the result of the arithmetic operation in the EXE stage as the address. In the WB stage, data is written into registers based on the contents decoded in the DEC stage.
High speed instruction processing is realized by performing, in parallel, operations of different stages of a plurality of instruction process cycles. Specifically, while the n-th instruction, which was read from the memory in the IF stage of the n-th instruction process cycle, is decoded in the DEC stage of the n-th instruction process cycle, the (n+1)-th instruction is read from the memory in the IF stage of the (n+1)-th instruction process cycle. Moreover, while the arithmetic operation based on the n-th instruction is executed in the EXE stage of the n-th instruction process cycle, the (n+1)-th instruction is decoded in the DEC stage of the (n+1)-th instruction process cycle and the (n+2)-th instruction is read from the memory in the IF stage of the (n+2)-th instruction process cycle.
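The overlap described above can be sketched as a small schedule table. This is only an illustrative model, not part of the described CPU: the function name pipeline_schedule and the zero-based slot numbering are assumptions introduced here, and every stage is assumed to take exactly one time slot.

```python
# Illustrative model of an ideal five-stage pipeline (names are hypothetical).
STAGES = ["IF", "DEC", "EXE", "MA", "WB"]


def pipeline_schedule(num_instructions):
    """Map each time slot to {instruction index: stage active in that slot}.

    Instruction k enters the IF stage in slot k and advances one stage per
    slot, so different stages of consecutive instructions overlap.
    """
    schedule = {}
    for instr in range(num_instructions):
        for offset, stage in enumerate(STAGES):
            schedule.setdefault(instr + offset, {})[instr] = stage
    return schedule


if __name__ == "__main__":
    for slot, active in sorted(pipeline_schedule(3).items()):
        print(slot, active)
```

In slot 2 of this model, instruction 0 is in the EXE stage while instruction 1 is in the DEC stage and instruction 2 is in the IF stage, which corresponds to the parallel relationship described for the n-th, (n+1)-th, and (n+2)-th instructions.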
In addition, all stage operations are executed synchronously with the system clock of the CPU (operation clock of the CPU) and one stage operation is basically completed within one period T of the system clock.
Meanwhile, in the CPU 102, the length of the IF stage (namely, the execution period of the IF stage) is determined by the response speed of the memory (ROM 104) in which programs are stored. Namely, in order to read instructions from the memory accurately, the execution time T of the IF stage must be equal to or longer than the time from a change of the address on the address bus until the data on the data bus is established (the time required for memory access).
In addition, in order to execute all stages in parallel in the pipeline process, the operation time of every stage must be set equal, and this operation time is determined by the stage which requires the longest process time. Accordingly, the CPU 102 has a problem in that it cannot attain a processing speed (i.e. an execution interval of the EXE stage) exceeding the response speed of the ROM 104. Namely, even when the CPU 102 itself is capable of operating faster than the ROM 104 responds, the processing speed of a system which includes the ROM 104 cannot exceed the response speed of the ROM 104.
On the other hand, as in the case of a computer system 101a illustrated in FIG. 6A, higher speed processing may be contemplated in which a CPU 102a has a processing speed higher than the response speed of the ROM 104, and a wait control circuit 105 is provided to absorb the difference between the processing speeds of the CPU 102a and the ROM 104.
Namely, the wait control circuit 105 monitors the address bus 106 and control signals (not illustrated) for read and write operations and, upon detection of access to the ROM 104, makes the wait signal W active for a predetermined number of clock cycles. The CPU 102a does not advance the stages while the wait signal W is in the active state.
When instructions are read continuously from the ROM 104 in the IF stage, the method described above offers no gain in speed, because the wait signal W is generated in every stage. In practice, however, the method is useful when a branching instruction is executed or when an interrupt process routine is activated in response to an interrupt request, because the reading of instructions from the ROM 104 is suspended for a certain period.
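The constraint that the slowest stage fixes the common stage time can be illustrated with a short calculation. The latency values below are purely hypothetical figures in units of T chosen for illustration (an IF stage bounded by a ROM response time of T, and other stages that could finish in half that time); they are not taken from the description.

```python
# Hypothetical per-stage minimum latencies, in units of T (illustrative only).
latencies = {"IF": 1.0, "DEC": 0.5, "EXE": 0.5, "MA": 0.5, "WB": 0.5}


def common_stage_time(stage_latencies):
    """Without wait control, every stage runs for the same period,
    so that period must cover the slowest stage."""
    return max(stage_latencies.values())


def instructions_per_unit_time(stage_latencies):
    """Steady-state throughput of the pipeline: one instruction per stage period."""
    return 1.0 / common_stage_time(stage_latencies)


if __name__ == "__main__":
    print(common_stage_time(latencies))
    print(instructions_per_unit_time(latencies))
```

Under these assumed figures the common stage time is T even though four of the five stages could run in T/2, which is the limitation the paragraph above describes.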
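The behaviour of the wait control circuit can be sketched as a clock-by-clock expansion of stage slots. This is a simplified software model under stated assumptions, not the circuit itself: the function name wait_timeline is hypothetical, ROM access is given as one boolean per stage slot, and the wait is assumed to be asserted at the start of the slot in which the access is detected.

```python
def wait_timeline(rom_access_per_slot, wait_cycles=1):
    """Expand stage slots into CPU clock cycles.

    Each element of the result is (slot index, wait signal W active?).
    A slot that accesses the ROM is held for `wait_cycles` extra clock
    cycles with W active; the stages advance only on a cycle with W inactive.
    """
    timeline = []
    for slot, accesses_rom in enumerate(rom_access_per_slot):
        if accesses_rom:
            timeline.extend((slot, True) for _ in range(wait_cycles))
        timeline.append((slot, False))  # cycle on which the stages advance
    return timeline


if __name__ == "__main__":
    # Slot 0 accesses the ROM, slot 1 does not.
    print(wait_timeline([True, False]))
```

With one wait cycle, a slot that accesses the ROM occupies two clock cycles and any other slot occupies one.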
A practical example of such high speed operation will be described below.
FIG. 6B is a timing diagram illustrating the operations to execute the branching process in the n-th instruction process cycle. The stage execution time of the CPU 102a is expressed by (1/2)×T, the response time of the ROM 104 by T, and the time wherein the wait signal W is in the active state by (1/2)×T.
As illustrated in FIG. 6B, when the n-th instruction is read from the ROM 104 in the IF stage of the n-th instruction process cycle, since access to the ROM 104 is detected with the wait control circuit 105, the wait signal W becomes active during the period of (1/2)×T and the CPU 102a executes the IF stage for the period of T. Namely, since the response time of the ROM 104 becomes equal to the execution time in the IF stage, the CPU 102a can read the instructions from the ROM 104 without fail.
Next, in the DEC stage of the n-th instruction process cycle, the CPU 102a detects that the instruction read from the ROM 104 is a branching instruction. The CPU 102a therefore performs pipeline control to stop execution of the IF stages of all instruction process cycles (the (n+1)-th and (n+2)-th) that precede the IF stage of the instruction process cycle which processes the instruction at the branching destination (the (n+3)-th). However, since the IF stage of the (n+1)-th instruction process cycle has already been started, the pipeline control stops only the execution of the IF stage of the (n+2)-th instruction process cycle. Because the wait process is executed in the IF stage of the (n+1)-th instruction process cycle in the same manner as in the IF stage of the n-th instruction process cycle, the (n+1)-th instruction is read accurately from the ROM 104, but it is cancelled because it is not an instruction to be executed.
In the subsequent EXE stage of the n-th instruction process cycle, an address calculation is executed to obtain the address of the branching destination. At this timing, the DEC stage of the (n+1)-th instruction process cycle and the IF stage of the (n+2)-th instruction process cycle are not executed. Consequently, since the ROM 104 is not accessed, the wait process is not executed and the stage is processed within the ordinary stage process time (1/2)×T.
In the subsequent MA stage of the n-th instruction process cycle, the value of the program counter, which indicates the instruction reading address, is updated to the address of the branching destination calculated in the EXE stage, so that the instruction at the branching destination is read from the ROM 104 in the IF stage of the (n+3)-th instruction process cycle. The wait process is executed in the IF stage of the (n+3)-th instruction process cycle as in the IF stage of the n-th instruction process cycle, and the (n+3)-th instruction is therefore read accurately from the ROM 104. Consequently, the time from the start of the reading operation of the n-th instruction to the completion of the reading operation of the (n+3)-th instruction is shortened by (1/2)×T in comparison with the case of FIG. 5B, in which every stage takes the period T.
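The saving of (1/2)×T in the branching case can be checked with a short calculation. The sketch below rests on an assumption about the slot layout, namely that, as in an ordinary five-stage pipeline, the IF stage of the (n+3)-th cycle shares a time slot with the MA stage of the n-th cycle; the function name elapsed and the list branch_slots are hypothetical.

```python
from fractions import Fraction


def elapsed(rom_access_slots, stage_time, wait_time):
    """Total time of consecutive slots; a slot with ROM access is extended by wait_time."""
    return sum(stage_time + (wait_time if rom else 0) for rom in rom_access_slots)


T = Fraction(1)  # ROM response time, used as the time unit

# Does any stage in the slot access the ROM? (assumed layout)
# slot 0: IF(n)               -> ROM access
# slot 1: DEC(n), IF(n+1)     -> ROM access (instruction later cancelled)
# slot 2: EXE(n)              -> no ROM access (IF(n+2) is stopped)
# slot 3: MA(n), IF(n+3)      -> ROM access (branch destination fetch)
branch_slots = [True, True, False, True]

with_wait = elapsed(branch_slots, T / 2, T / 2)  # stage time T/2 plus wait T/2
conventional = elapsed(branch_slots, T, 0)       # every stage takes T
saving = conventional - with_wait

if __name__ == "__main__":
    print(with_wait, conventional, saving)
```

Only the single slot without ROM access runs at the shorter stage time, which is why the saving equals one half-period.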
FIG. 7 is a timing diagram illustrating operations to accept an interrupt request in the DEC stage of the n-th instruction process cycle. It is assumed here that the ROM 104 stores an interrupt process routine programmed for every interrupt factor and an interrupt address table which is a collection of the head addresses of such interrupt process routines, and that the interrupt address table is searched in accordance with an interrupt vector number read from an interrupt controller (not illustrated). Moreover, the interrupt vector number indicates an offset from the head address of the interrupt address table and is assigned in advance to each interrupt factor. That is, the head address of the interrupt process routine corresponding to the interrupt request generated can be acquired from the table entry located at the address obtained by adding the interrupt vector number to the head address of the interrupt address table.
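The table lookup can be sketched as follows. Everything concrete here is hypothetical: the ROM is modelled as a Python dict, and the addresses, the table head address, and the vector numbers are invented for illustration. The sketch assumes the sum of the table head address and the vector number is the address of a table entry, and that the entry holds the head address of the routine.

```python
# Hypothetical ROM image: address -> stored word (all values invented).
TABLE_HEAD = 0x0100
rom = {
    0x0100: 0x2000,  # entry for vector number 0
    0x0101: 0x2400,  # entry for vector number 1
    0x0102: 0x2800,  # entry for vector number 2
}


def routine_head_address(memory, table_head, vector_number):
    """Return the head address of the interrupt process routine.

    The vector number is an offset from the head of the interrupt address
    table; the entry at that address holds the routine's head address.
    """
    entry_address = table_head + vector_number
    return memory[entry_address]


if __name__ == "__main__":
    print(hex(routine_head_address(rom, TABLE_HEAD, 1)))
```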
As illustrated in FIG. 7, when the interrupt request is accepted in the DEC stage of the n-th instruction process cycle, the instruction read in the IF stage of the n-th instruction process cycle is cancelled and the process related to the activation of the interrupt process routine is executed in the DEC and subsequent stages of the n-th instruction process cycle.
First, in the DEC stage of the n-th instruction process cycle, address information required to read the interrupt vector number from the interrupt controller (not illustrated) is prepared in place of the data that would be prepared in accordance with the instruction read in the IF stage of the n-th instruction process cycle. Here, the IF stage of the (n+1)-th instruction process cycle is an inherently unnecessary stage. However, since this IF stage has already been started when the interrupt request is accepted, the instruction is read from the ROM 104 and then cancelled without being executed. In this case, the wait signal W is activated by the wait control circuit 105 and the stage is processed for the period T. On the other hand, since the IF stages of the (n+2)-th to (n+6)-th instruction process cycles have not yet been started when the interrupt request is accepted, the instruction reading operations from the ROM 104 in these stages are suppressed through pipeline control.
Next, in the EXE stage of the n-th instruction process cycle, arithmetic operations are executed, when required, based on the address information prepared in the DEC stage of the n-th instruction process cycle to generate the address for reading the interrupt vector number from the interrupt controller (not illustrated).
Next, in the MA stage of the n-th instruction process cycle, the interrupt vector number is read, via the data bus 107, from the interrupt controller (not illustrated) in accordance with the address generated in the EXE stage of the n-th instruction process cycle. This interrupt vector number is then transferred to the DEC stage of the (n+2)-th instruction process cycle. Simultaneously, in the DEC stage of the (n+2)-th instruction process cycle, head address information of the interrupt address table is prepared.
Next, in the EXE stage of the (n+2)-th instruction process cycle, arithmetic operations of address for referring to the interrupt address table are executed based on the interrupt vector number transferred from the MA stage of the n-th instruction process cycle and the head address information of the predetermined interrupt address table.
Next, in the MA stage of the (n+2)-th instruction process cycle, data, namely the interrupt process routine start address corresponding to the interrupt request generated, is read from the ROM 104 at the address calculated in the EXE stage of the (n+2)-th instruction process cycle, and this start address is transferred to the DEC stage of the (n+4)-th instruction process cycle. Since the data is read from the ROM 104 in the MA stage of the (n+2)-th instruction process cycle, the wait signal W is activated by the wait control circuit 105 and the stage is processed for the period T.
Next, in the EXE stage of the (n+4)-th instruction process cycle, the data transferred from the MA stage of the (n+2)-th instruction process cycle, namely the interrupt process routine start address corresponding to the interrupt request generated, is transferred to the MA stage of the (n+4)-th instruction process cycle.
As the next step, in the MA stage of the (n+4)-th instruction process cycle, the data transferred from the EXE stage of the (n+4)-th instruction process cycle, namely the interrupt process routine start address corresponding to the interrupt request generated, is set in the program counter, and the IF stage of the (n+7)-th instruction process cycle is started, in which the first instruction of the interrupt process routine is read from the ROM 104. The interrupt process routine corresponding to the interrupt request generated is then executed in the subsequent stages.
Since the reading operation from the ROM 104 is executed in the IF stage of the (n+7)-th instruction process cycle, the wait signal W is activated by the wait control circuit 105 and the stage is processed for the period T. That is, in comparison with the case wherein all stages are processed within the period T, the time from the start of the reading operation of the n-th instruction to the completion of the reading operation of the (n+7)-th instruction can be reduced by 2×T (=(1/2)×T×4).
The advantages of the present invention described above are particularly pronounced in a microcomputer for controlling a built-in apparatus, which has a large speed difference between the CPU and the ROM, comprises many peripheral circuits that frequently generate interrupt requests, and executes programs that change their processing in accordance with the situation or the like (namely, programs including many branching instructions). However, the demand for still higher speed operation continues to increase.
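The figure of 2×T can be reproduced by tallying which time slots access the ROM. The event list below is a reading of the sequence described for FIG. 7, with the slot numbers assumed from an ordinary five-stage layout (instruction k enters the IF stage in slot k); the names EVENTS and total_time are hypothetical.

```python
from fractions import Fraction

# (slot, operation, accesses ROM?) as read from the description of FIG. 7.
EVENTS = [
    (0, "IF(n)", True),
    (1, "DEC(n): interrupt accepted", False),
    (1, "IF(n+1): read, then cancelled", True),
    (2, "EXE(n): address of interrupt controller", False),
    (3, "MA(n): read vector number from interrupt controller", False),
    (4, "EXE(n+2): table entry address", False),
    (5, "MA(n+2): read routine start address", True),
    (6, "EXE(n+4): pass start address on", False),
    (7, "MA(n+4) / IF(n+7): fetch first routine instruction", True),
]


def total_time(events, stage_time, wait_time):
    """Sum slot durations; a slot is extended by wait_time if any event in it reads the ROM."""
    rom_in_slot = {}
    for slot, _operation, rom in events:
        rom_in_slot[slot] = rom_in_slot.get(slot, False) or rom
    return sum(stage_time + (wait_time if rom else 0) for rom in rom_in_slot.values())


T = Fraction(1)
with_wait = total_time(EVENTS, T / 2, T / 2)  # four slots of T, four slots of T/2
conventional = total_time(EVENTS, T, 0)       # eight slots of T

if __name__ == "__main__":
    print(conventional - with_wait)
```

Four of the eight slots involve no ROM access, and each of them is shortened by (1/2)×T, which gives the total of 2×T.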