1. Field of the Invention
The invention generally relates to microprocessors and, in particular, relates to a micro-code sequencer for use in a microprocessor.
2. Description of Related Art
Many complex instruction set computers (CISC) employ micro-code to facilitate the implementation of complex instruction sets. In a computer or microprocessor employing micro-code, most instructions are encoded within a micro-code ROM wherein each of the instructions is represented by a sequence of micro-vectors, each micro-vector having a unique address within the micro-code ROM. Typically, each instruction is represented by four to six micro-vectors, although longer micro-vector sequences are sometimes employed to represent highly complex instructions. In use, the micro-vectors for a corresponding instruction are fetched from the micro-code ROM, then executed by a data path logic unit. Hence, for each micro-vector, two steps are required: a fetch step and an execution step. With a single instruction requiring several micro-vectors and with each micro-vector requiring two clock cycles for fetch and execution, quite a number of clock cycles may be required to execute a single instruction. To improve overall throughput, the fetching and execution of micro-vectors are pipelined, whereby one micro-vector is executed while a second micro-vector is fetched. To this end, a latch is connected between the micro-code ROM and the data path logic unit to allow execution of one micro-vector while a subsequent micro-vector is fetched, thereby achieving two-stage pipe-lining.
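By way of illustration only, the cycle counts implied by the foregoing may be sketched as follows. The function names are hypothetical, and the sketch assumes one clock cycle per fetch, one clock cycle per execution, and no branch conditions.

```python
def cycles_unpipelined(num_vectors: int) -> int:
    """Clock cycles when each micro-vector is fetched, then executed, one after another."""
    # Assumption: one cycle to fetch plus one cycle to execute per micro-vector.
    return 2 * num_vectors


def cycles_pipelined(num_vectors: int) -> int:
    """Clock cycles with two-stage pipe-lining and no branch conditions."""
    if num_vectors == 0:
        return 0
    # The first cycle only fetches; thereafter one micro-vector executes per
    # cycle while the next micro-vector is fetched.
    return num_vectors + 1


if __name__ == "__main__":
    for n in (4, 6):
        print(n, cycles_unpipelined(n), cycles_pipelined(n))
```

Under these assumptions, an instruction of four micro-vectors would take eight clock cycles without pipe-lining and five with it, and an instruction of six micro-vectors would take twelve and seven, respectively.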
The pipelined execution of micro-code is controlled by a micro-code sequencer, a conventional example of which is illustrated in FIG. 1. Conventional micro-code sequencer 10 includes a micro-code ROM 12 for storing individual micro-code vectors, each having a unique micro-address. During use, individual micro-code vectors are output from ROM 12 along a path 13 to a latch 14, then output to a data path logic unit 16 along a path 15. Data path logic unit 16 executes the micro-code vectors received from latch 14. Output, latching, and execution of the micro-code vectors are synchronized by a clock signal provided on a clock line 18.
The micro-code vectors to be output from ROM 12 at each clock cycle are identified by a micro-instruction pointer 20. The micro-code vectors corresponding to a single instruction are stored in sequence in ROM 12 with each vector having a micro-address incrementally higher than the last. With such a configuration, a sequence of micro-vectors for a single instruction is identified merely by incrementing a starting micro-address with each clock cycle. The starting micro-address is supplied to micro-instruction pointer 20 along a line 21 from an instruction decoder (not shown) via a multiplexer (MUX) 24 connected to an input of micro-instruction pointer 20. An adder 22 is also connected to micro-instruction pointer 20 through MUX 24 for incrementing the micro-address stored within micro-instruction pointer 20 on each clock cycle, after the starting address has been received and processed.
However, many instructions include branch micro-vectors which define branch conditions. Depending on the result of the branch condition, the micro-code vectors to be fetched and executed may not have sequential micro-code addresses. To facilitate processing of branch conditions, MUX 24 additionally receives a branch condition and a branch target address along lines 26 and 28. The branch condition and the branch target address are both specified within the branch micro-vector. The branch condition specifies certain flags, registers, or flip-flops (not shown) within data path logic unit 16 that are examined for branch resolution. The branch target address specifies the micro-address of the micro-vector that is to be fetched and executed if the branch is taken. The branch target address is received directly from an output of latch 14. The branch condition is also output from latch 14, but is processed by a branch resolver 29 prior to transmission to MUX 24. Branch resolver 29 also receives branch status flags directly from data path logic unit 16. If a branch is taken, as indicated by a branch signal received along line 31, MUX 24 transmits the branch target address received along line 28 to micro-instruction pointer 20. If no branch is taken, MUX 24 merely transmits the incremented micro-code address received from adder 22 to micro-instruction pointer 20.
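By way of illustration only, the selection performed by MUX 24 may be modeled as a function returning the next value of micro-instruction pointer 20. The function name and its parameters are hypothetical, and giving a new starting address priority over a taken branch is an assumption of this sketch rather than a feature stated for FIG. 1.

```python
from typing import Optional


def next_micro_address(
    current: int,
    start_address: Optional[int] = None,
    branch_taken: bool = False,
    branch_target: Optional[int] = None,
) -> int:
    """Select the next micro-address in the manner of MUX 24 (hypothetical model)."""
    # Assumption: a new starting address from the instruction decoder wins.
    if start_address is not None:
        return start_address
    # A taken branch selects the target carried in the branch micro-vector.
    if branch_taken:
        if branch_target is None:
            raise ValueError("a taken branch requires a branch target address")
        return branch_target
    # Otherwise the adder supplies the next sequential micro-address.
    return current + 1


if __name__ == "__main__":
    print(next_micro_address(10))
    print(next_micro_address(10, branch_taken=True, branch_target=40))
    print(next_micro_address(10, start_address=100))
```

In this sketch, the three return paths correspond to the three kinds of micro-address described: a new starting address, a branch target address, and a next sequentially incremented address.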
Hence, upon each clock cycle, micro-instruction pointer 20 receives a micro-code address which is either a new starting address, a next sequentially incremented address, or a branch target address. The micro-address is then transmitted to micro-code ROM 12 along lines 32. Micro-code ROM 12 responds by outputting, along path 13, the micro-vector corresponding to the received micro-code address. As noted above, the vector is latched by latch 14 and executed by data path logic unit 16. Execution proceeds until all vectors corresponding to the instruction are executed. Then, a new instruction is received along line 21 and the vectors for the new instruction are fetched and executed accordingly. With some implementations of micro-code, a branch condition defines a branch-taken stream of micro-vectors and a branch-not-taken stream of micro-vectors, with each stream comprising two or more micro-vectors. The branch address received by MUX 24 along line 28 defines only the first address of the branch-taken stream. However, in many implementations of micro-code, each branch-not-taken vector stream includes only one micro-vector and the corresponding branch-taken vector immediately follows the single branch-not-taken vector, i.e., each stream has a length of one micro-vector. In such an implementation, a branch target address need not be transmitted to MUX 24 from latch 14. Rather, the branch target address is always the next sequential address following the branch-not-taken vector, and that next sequential address is merely supplied by adder 22. In either case, the vector immediately following the branch must be squashed if the branch is taken.
As noted above, latch 14 is provided to achieve two-stage pipe-lining for enhancing the efficiency of the micro-code sequencer. The pipe-lining is illustrated in FIG. 2 which provides timing diagrams for the conventional micro-code sequencer of FIG. 1. In FIG. 2, sequence 40 illustrates five cycles of a clock signal. Sequence 42 illustrates micro-addresses N, N+1, N+2, and N+3. Sequence 44 illustrates micro-vectors A, B, C, and D corresponding to the micro-addresses 42. More precisely, sequence 44 illustrates the micro-vectors output from the micro-code ROM in response to micro-addresses 42. As can be seen, a new vector is latched at each clock cycle. Execution of the vectors is illustrated by sequence 46. Each vector is executed during a clock cycle immediately subsequent to the clock cycle when the vector was latched. Thus, during most clock cycles, one vector is latched while a second vector is executed.
Without branch conditions, the pipe-lining illustrated in FIG. 2 would achieve a high overall throughput, thereby lowering the overall number of clock cycles required to execute each instruction. However, branch conditions can cause clock cycles to be wasted. In the example of FIG. 2, clock cycle 4 is wasted. Micro-vector B represents a branch condition which causes either vector C or vector D to be executed next, depending upon the resolution of the branch condition. However, vector C is automatically latched while the branch condition is being evaluated. Hence, if the branch condition indicates that vector D is to be executed next, then vector C must be squashed (as shown). Vector D must then be fetched and latched before execution can continue, resulting in the loss of one clock cycle.
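By way of illustration only, the wasted clock cycle of FIG. 2 may be reproduced with a small software model of the two-stage pipeline. The names `FIG2_ROM` and `simulate` are hypothetical, micro-addresses N through N+3 are represented as 0 through 3, and the model assumes that the branch is resolved during the clock cycle in which the branch micro-vector is executed.

```python
# Hypothetical ROM contents modeled on FIG. 2: address -> (vector name, branch target).
# Vector B is the branch micro-vector; its target is the address of vector D.
FIG2_ROM = {0: ("A", None), 1: ("B", 3), 2: ("C", None), 3: ("D", None)}


def simulate(rom, start, branch_taken, max_cycles=32):
    """Return, per clock cycle, the name of the vector executed, or "-" if none."""
    pointer = start   # models the micro-instruction pointer
    latch = None      # vector fetched during the previous clock cycle
    trace = []
    for _ in range(max_cycles):
        if pointer not in rom and latch is None:
            break  # nothing left to fetch or execute
        fetched = rom.get(pointer)  # fetch stage
        next_pointer = pointer + 1  # default: sequential address from the adder
        if latch is None:
            trace.append("-")  # no vector to execute in this cycle
        else:
            name, target = latch
            trace.append(name)  # execute stage
            if target is not None and branch_taken:
                fetched = None         # squash the vector fetched behind the branch
                next_pointer = target  # redirect the fetch to the branch target
        latch = fetched
        pointer = next_pointer
    return trace


if __name__ == "__main__":
    print(simulate(FIG2_ROM, 0, branch_taken=True))
    print(simulate(FIG2_ROM, 0, branch_taken=False))
```

With the branch taken, the model executes nothing in the fourth clock cycle, since vector C has been squashed and vector D is only then being fetched. With the branch not taken, vector C executes in the fourth clock cycle; in this simplified ROM, execution then continues sequentially into vector D, which is an artifact of the model rather than a statement about FIG. 2.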
Thus, FIGS. 1 and 2 illustrate a problem occurring within conventional micro-code sequencers wherein a clock cycle may be wasted upon the execution of a branch condition within the micro-code. Heretofore, no adequate solution has been proposed for remedying this problem.