1. Field of the Invention
The present invention relates generally to improvements in pipelining in a microcomputer system and more specifically to a pipelined arrangement, and a method, for increasing the data processing time available in certain stages of a pipeline which is provided with instruction interrupt functions.
2. Description of the Prior Art
Pipelining is a hardware technique for achieving higher performance by breaking a complex, time-consuming function into a series of simpler, shorter operations, each of which can then be executed in an assembly-line fashion with simultaneous computations on different sets of data.
The time available for data processing in each stage of a pipelined data processing system is limited by the delay induced by the pipeline register in that stage. The pipeline register is configured so as to temporarily hold an instruction and/or data in response to an interrupt request, which is issued in the case of a resource or data competition or conflict.
Before turning to the present invention it is deemed preferable to briefly discuss a known pipelined arrangement with reference to FIGS. 1-6.
FIG. 1 is a block diagram of a pipelined data processing system which includes five stages 1-5 in this particular case and to which the present invention is applicable. However, it should be noted that the present invention is in no way limited to such an arrangement.
Throughout the instant disclosure, hardware arrangements which are deemed irrelevant to the present invention will be omitted merely for the sake of brevity.
As shown in FIG. 1, the stage 1 includes a selector 10, a pipeline register 12, an instruction memory 14, and an adder 16. The stage 1 serves to successively read out instructions previously stored in the instruction memory 14. The selector 10 is supplied with three inputs 18a-18c and selects one among them under the control of selector control signals RESET and INT. The control signal RESET is used to initially reset the selector 10, while the other control signal INT is applied to the selector 10 from an instruction decoder 20 of the stage 2 if an instruction interrupt occurs, as will be discussed below.
Prior to initiating the operation of the pipelined arrangement of FIG. 1, an instruction set is retrieved from a main memory by way of an instruction bus (neither of which is shown) and is then stored in the memory 14. In the case where an instruction interrupt does not occur, the selector 10 selects the input 18b (viz., the output of the adder 16). A loop consisting of the blocks 10, 12 and 16 forms an address counter.
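By way of illustration only, the address counter loop formed by the blocks 10, 12 and 16 may be sketched behaviourally as follows. The function name next_address, the reset address of zero, the increment of one, and the priority of RESET over INT are assumptions introduced for this sketch; the use of the input 18a as the reset address is likewise assumed, since the text identifies only the inputs 18b (adder output) and 18c (register output).

```python
def next_address(reset: bool, interrupt: bool, current: int,
                 reset_address: int = 0, step: int = 1) -> int:
    """Value the selector 10 passes to the pipeline register 12 for the next time slot.

    `current` models the present output of the pipeline register 12.
    """
    if reset:
        # Input 18a (assumed here to carry the initial address).
        return reset_address
    if interrupt:
        # Input 18c: the register output is fed back, so the address is held.
        return current
    # Input 18b: output of the adder 16, i.e. the next sequential address.
    return current + step


# Reset, advance twice, hold for one slot, then advance again.
address = next_address(reset=True, interrupt=False, current=99)
trace = [address]
for int_signal in (False, False, True, False):
    address = next_address(reset=False, interrupt=int_signal, current=address)
    trace.append(address)
```

In this sketch the held address repeats once in the trace, which corresponds to the stage 1 presenting the same address to the instruction memory 14 for two consecutive time slots.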
The stage 2 is provided to decode the instructions retrieved from the instruction memory 14 and includes a selector 22, a pipeline register 24, and a register file 26 in addition to the above mentioned instruction decoder 20. The decoder 20 receives the output of the pipeline register 24 and successively applies the decoded instructions to the stage 3. If an instruction interrupt does not occur, the selector 22 selects the output (viz., instructions) retrieved from the memory 14. As is known in the art, the register file 26 is used to temporarily store instructions and/or data for future use.
The stage 3 is a stage for implementing arithmetic operations on the data derived from the register file 26. The stage 3 includes two selectors 28 and 30, two pipeline registers 32 and 34, and an arithmetic logic 36. If the instruction interrupt INT is not issued from the decoder 20, the selectors 28 and 30 select respectively the outputs of the decoder 20 and the register file 26.
The stage 4 includes two selectors 38 and 40, two pipeline registers 42 and 44, and a data memory 46. The stage 4 is a stage for accessing the data memory 46 in order to write data thereinto and read data therefrom. As in the above, if the instruction interrupt INT is not issued from the decoder 20, the selectors 38 and 40 select respectively the outputs of the blocks 36 and 34 of the preceding stage (viz., the stage 3).
Lastly, the stage 5 includes three selectors 48, 50 and 52, two pipeline registers 54 and 56, and a register file 26'. The selector 48 selects either of the outputs of the blocks 42 and 46 of the stage 4 in response to an instruction applied thereto from the pipeline register 44 of the stage 4. The register file 26' is usually configured in the same unit as the above mentioned register file 26 of the stage 2 and as such, the register file of the stage 5 is depicted by the same numeral as the register file of the stage 2 with a prime. If the decoder 20 does not issue the instruction interrupt INT, the selectors 50 and 52 respectively pick up the outputs of the preceding selector 48 and the pipeline register 44.
Reference is made to FIG. 2, wherein a timing chart is shown for discussing the operation of the pipelined arrangement of FIG. 1 when the instruction decoder 20 issues an instruction interrupt.
It is assumed that the instruction decoder 20 detects, at time slot (n+6), that an interrupt for one time slot (viz., freezing of instruction execution during the next time slot (n+7)) is necessary. Such an interrupt may be induced if, for example, a resource conflict occurs with an instruction running in another pipelined arrangement. The instruction decoder 20 issues an interrupt signal INT, during time slot (n+6), which is applied to the selectors of the stage 1 to the stage 5 (viz., selectors 10, 22, 28-30, 38-40, and 50-52).
The selector 10 is responsive to the interrupt signal INT issued at time slot (n+6) and selects the output of the pipeline register 12 (viz., input signal 18c) at the next time slot (n+7). Thus, the stage 1 holds the instruction No. 6 at time slot (n+7) in the instant case. In a similar manner, the selector 22 of the stage 2, in response to the interrupt signal INT, selects the output of the pipeline register 24 and thus, the stage 2 holds the instruction No. 5 at time slot (n+7).
In the stage 3, the selector 28 responds to the interrupt signal INT and selects the output of the pipeline register 32. This implies that the pipeline register 32 retains, at time slot (n+7), the same content as that at the previous time slot (n+6). Further, in the stage 3, the selector 30, in response to the interrupt signal INT, selects the output of the pipeline register 34 and accordingly, the instruction No. 4 which has been applied thereto at time slot (n+6) is retained at the following time slot (n+7). It is understood from the foregoing that each of the stages 4 and 5 likewise holds its preceding state at time slot (n+7) as shown in FIG. 2.
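The one-slot freeze described above may be sketched, purely as a behavioural model, as follows. The function name simulate and the convention that the stage 1 holds the instruction No. k at time slot (n+k) in the absence of an interrupt are assumptions of this sketch, chosen to agree with the instruction numbers stated for the stages 1 to 3; the values shown for the stages 4 and 5 follow from that convention rather than from the text.

```python
def simulate(num_slots: int, int_slots: set) -> list:
    """Return, per time slot, the instruction numbers held by the stages 1-5.

    `int_slots` lists the slots (relative to n) during which the decoder
    issues INT; every stage then holds its content during the following slot.
    """
    stages = [None] * 5      # contents of the stages 1-5 (None = empty)
    fetched = 0              # number of the most recently fetched instruction
    hold = False             # True when INT was issued in the previous slot
    history = []
    for slot in range(1, num_slots + 1):
        if not hold:
            # Normal operation: every instruction advances by one stage.
            fetched += 1
            stages = [fetched] + stages[:4]
        # When hold is True the selectors feed each register its own output,
        # so `stages` is left unchanged for this slot.
        history.append(list(stages))
        hold = slot in int_slots
    return history


# INT issued during slot (n+6): slot (n+7) repeats the state of slot (n+6).
history = simulate(num_slots=8, int_slots={6})
```

Here history[5] and history[6], corresponding to time slots (n+6) and (n+7), are identical, and normal advancement resumes at time slot (n+8).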
FIG. 3 is a block diagram showing in detail the arrangement of the selector 10 and the pipeline register 12. As shown, the selector 10 is comprised of a plurality of selector logics 10(1)-10(n) (n is 34, for example), each of which is configured in exactly the same manner as the others and selects a one-bit signal from three one-bit signals applied thereto.
The pipeline register 12 of FIG. 3 includes a plurality of flip-flops 12(1)-12(n) which are configured in exactly the same manner as one another and which are respectively coupled to the corresponding selector logics 10(1)-10(n). Each of the selector logics 10(1)-10(n) normally (viz., when no interrupt signal is applied) selects a one-bit signal forming part of the signal 18b issued from the adder 16. However, if the instruction decoder 20 (stage 2) issues the interrupt signal INT, each of the selector logics 10(1)-10(n) selects the output of the associated flip-flop via a feedback loop (no numeral).
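The bit-sliced organisation of FIG. 3 may be illustrated by the following sketch, in which each list element stands for one selector logic and flip-flop pair. The names select_bit and clock_register, the four-bit width used in the example, and the priority of RESET over INT are assumptions made for illustration only.

```python
def select_bit(bit_a: int, bit_b: int, bit_c: int,
               reset: bool, interrupt: bool) -> int:
    """One selector logic 10(i): pick one bit out of three."""
    if reset:
        return bit_a   # bit of the input 18a (assumed reset value)
    if interrupt:
        return bit_c   # feedback from the associated flip-flop 12(i)
    return bit_b       # bit of the signal 18b from the adder 16


def clock_register(word_a: list, word_b: list, held: list,
                   reset: bool, interrupt: bool) -> list:
    """New content of the flip-flops 12(1)-12(n) after one time slot."""
    return [select_bit(a, b, c, reset, interrupt)
            for a, b, c in zip(word_a, word_b, held)]


# Four-bit example (the text mentions n = 34 as one possible width).
initial = [0, 0, 0, 0]
adder_out = [0, 1, 1, 0]
held = [1, 0, 1, 1]
frozen = clock_register(initial, adder_out, held, reset=False, interrupt=True)
advanced = clock_register(initial, adder_out, held, reset=False, interrupt=False)
```

Because every slice applies the same selection, the word as a whole is either replaced by the adder output or held unchanged, never a mixture of the two.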
FIG. 4 is a block diagram showing in detail the flip-flop 12(1) together with the associated selector 10(1).
As shown in FIG. 4, the flip-flop 12(1) is comprised of four switches 60a-60b and 62a-62b, and four inverters 64a-64d. Each of the switches 60a-60b closes in response to a high logic level (for example) and opens in response to the reverse (viz., low) logic level. Conversely, each of the switches 62a-62b closes when a low logic level is applied thereto and opens when a high logic level is applied thereto.
That is, the switches 60a-60b and 62a-62b operate in a complementary fashion during each cycle of the clock. A one-bit signal which has been selected by the selector 10(1) is acquired via the switch 60a while that switch is closed. Following this, when the switches 60a and 62b are respectively rendered open and closed during the next half cycle, the acquired bit is retained in a loop denoted by L1. At the same time, the bit signal held in the loop L1 appears at the output of the flip-flop 12(1). Subsequently, when the next bit signal is acquired through the switch 60a during the first half of the next clock cycle, the bit signal already acquired is retained in a loop L2. These operations are well known in the art. Each of the other flip-flops 12(2)-12(n) of the pipeline register 12 is constructed in exactly the same manner as the above mentioned flip-flop 12(1).
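The two-phase operation just described may be modelled, at a behavioural level only, as two cascaded latches. The class name MasterSlaveFlipFlop and the method half_cycle are hypothetical, the inverters 64a-64d are ignored (so no signal inversion is modelled), and the exact assignment of each switch to each loop is an assumption, since the text specifies only that the pairs 60a-60b and 62a-62b operate in a complementary fashion.

```python
class MasterSlaveFlipFlop:
    """Behavioural model of one flip-flop such as 12(1)."""

    def __init__(self) -> None:
        self.loop_l1 = 0   # bit retained in the loop L1 (first latch)
        self.loop_l2 = 0   # bit retained in the loop L2 (second latch)

    def half_cycle(self, clock_high: bool, data_in: int) -> int:
        """Advance by one half clock cycle and return the output bit."""
        if clock_high:
            # Switches 60a-60b closed: a new bit is acquired through 60a,
            # while the previously acquired bit stays in the loop L2.
            self.loop_l1 = data_in
        else:
            # Switches 62a-62b closed: the acquired bit is retained in the
            # loop L1 and passed on, so it appears at the output.
            self.loop_l2 = self.loop_l1
        return self.loop_l2


ff = MasterSlaveFlipFlop()
outputs = [
    ff.half_cycle(clock_high=True, data_in=1),    # acquire 1, output unchanged
    ff.half_cycle(clock_high=False, data_in=0),   # 1 appears at the output
    ff.half_cycle(clock_high=True, data_in=0),    # acquire 0, output still 1
    ff.half_cycle(clock_high=False, data_in=1),   # 0 appears at the output
]
```

The sketch shows why a bit must pass through two latches in series before it reaches the output, which is the source of the register delay discussed in connection with the prior art.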
FIG. 5 is a block diagram showing in detail the arrangement of the selector 22 and the pipeline register 24 (stage 2). As illustrated, the selector 22 is comprised of a plurality of selector logics 22(1)-22(n), each of which is configured in exactly the same manner as the others and selects a one-bit signal from two one-bit signals applied thereto. It will readily be appreciated that the selector 22 and the pipeline register 24 of FIG. 5 are respectively configured in the same manner as the counterparts 10 and 12 shown in FIG. 3, apart from the number of selector inputs. Accordingly, further descriptions thereof will be omitted for brevity.
Each of the other pairs of the selector and the pipeline register such as depicted by 28-32, 30-34 (both the stage 3), 38-42 and 40-44 (the stage 4), and 50-54 and 52-56 (the stage 5), is arranged as shown in FIG. 5.
FIG. 6 is a block diagram showing in detail the flip-flop 24(1) together with the associated selector 22(1). The flip-flop 24(1) is the same as the flip-flop 12(1) shown in FIG. 4 and thus, each of the circuit elements of the flip-flop 24(1) is labelled with a like numeral plus a prime. Further discussion of FIG. 6 is deemed redundant and accordingly will not be given here, for the sake of simplifying the disclosure.
The prior art shown in FIGS. 1-6 has encountered the problem that each of the pipeline registers in the stages 1-5 exhibits a relatively large amount of delay. The reason for this is that the pipeline register in question takes the form of a flip-flop so that it can retain an instruction or data for one or more time slots in the event of an interrupt request. Therefore, if the time delay due to the pipeline register can be reduced at some stages, the time thus saved can be used effectively for data processing. By way of example, let it be assumed that one time slot is 5 ns and the delay is 1 ns. In this case, the time duration allocated to data processing is 4 ns. Therefore, if the delay can be shortened to 0.3 ns (for example), the data processing time is extended to 4.7 ns.
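The arithmetic of the above example may be expressed as the following short sketch. The function name processing_time_ns is hypothetical, and the model assumes, as the example does, that the data processing time is simply the time slot less the delay of the pipeline register.

```python
def processing_time_ns(time_slot_ns: float, register_delay_ns: float) -> float:
    """Time left for data processing within one time slot, in nanoseconds."""
    if not 0 <= register_delay_ns <= time_slot_ns:
        raise ValueError("delay must lie between 0 and the time slot length")
    return time_slot_ns - register_delay_ns


before = processing_time_ns(5.0, 1.0)   # prior-art delay of 1 ns
after = processing_time_ns(5.0, 0.3)    # shortened delay of 0.3 ns
gain = after - before                   # extra processing time per slot
```

With the figures of the example, the shortened delay yields about 0.7 ns of additional processing time per time slot, which is roughly 17.5 percent more than the original 4 ns.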