1. Field of the Invention
This invention relates in general to logic circuits and computing systems, and more particularly to a device and method for a predictive comparator following addition.
2. Description of the Prior Art
Modern computing systems are frequently required to perform a rapid comparison following an addition. One example is an Arithmetic Logic Unit (ALU), where the output of an addition is often tested to see if it equals zero or some other number. See, for example, the circuit 100 illustrated in FIG. 1. The 32-bit register values X 102 and Y 104 are added by the adder 106 using a conventional carry propagate addition. The result 108 of the carry propagate addition is compared to the 32-bit register value Z 112 by the comparator 110 to determine whether the result 108 equals the value of register Z 112. The output 114 of the comparator 110 indicates whether the two values are equal to each other. An example simulation of the circuit of FIG. 1, using a Verilog model, is shown in FIG. 3.
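The conventional data path of FIG. 1 can be illustrated with a short behavioral sketch. This is not the Verilog model of FIG. 3; it is a minimal Python model written for illustration only. The function name `add_then_compare` and the constants `WIDTH` and `MASK` are hypothetical, and the sketch assumes that the carry out of the 32-bit adder is discarded, which the description above does not state.

```python
WIDTH = 32                 # register width described for X, Y and Z
MASK = (1 << WIDTH) - 1    # keeps values within 32 bits


def add_then_compare(x: int, y: int, z: int) -> bool:
    """Behavioral model of the add-then-compare path of FIG. 1."""
    # Adder 106: full carry propagate addition producing result 108.
    # Assumption: the carry out of the top bit is dropped (wraps modulo 2**32).
    result = (x + y) & MASK
    # Comparator 110: output 114 is true when result 108 equals Z 112.
    return result == (z & MASK)


if __name__ == "__main__":
    print(add_then_compare(5, 7, 12))   # equal case
    print(add_then_compare(5, 7, 13))   # unequal case
```

In hardware the comparison cannot begin until the carry has rippled through, or been looked ahead across, all 32 bit positions of the adder, which is why the two operations in series tend to form a long path.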
Furthermore, fast branch instructions in a high speed computing architecture provide a fast test of whether the result of an arithmetic logic unit addition equals zero (e.g., branch on zero) or equals a particular number (e.g., branch if a particular number value is reached). The faster these instructions are performed, the faster the overall computing system can handle CPU intensive operations. The overall computing speed of a high speed computing system, in certain applications, may be significantly limited by how quickly the system can perform compare instructions following addition instructions.
Additionally, a key technique for improving the performance of a microprocessor, or in general any stored program machine, involves guessing the direction that a jump instruction takes, i.e., if the jump is taken or not taken. This is particularly important for pipelined computer architectures. These computing systems typically utilize fast compares following additions to predict branch addresses for jump instructions in a pipeline.
As is well understood by those of ordinary skill in the art, jump instructions constitute a significant portion, e.g., approximately 20%, of all executed instructions for a processor. If a jump instruction is taken, a processor must execute instructions from a new location in an instruction sequence. If the jump is not taken then the current flow of instructions continues.
A primary implementation technique used to achieve a high clock rate in current processors is to deeply pipeline an architecture. This technique corresponds to breaking down the work needed to execute an instruction into a large number of much smaller steps. Since each of these steps performs a much smaller task than in an unpipelined design, much higher clock rates become possible. The largest problem in such pipelined machines is the presence of jump instructions. If the jump is taken, and no attempt is made to predict it, then the pipeline must be frozen until the new instruction is available. This delay decreases the performance of the stored program machine.
Generally a jump location, as is well known by those of ordinary skill in the art, is calculated by adding a number to the current instruction pointer. This is done with an adder. Subsequently, the result of the addition must be compared with the expected result. The expected result is provided by logic that allows the machine to guess the address to which the jump instruction will transfer the program counter. If the two numbers are the same, the jump has been correctly guessed, and the machine continues with its operation. Otherwise, the machine stalls and continues from the point of the jump instruction. The calculation involving a comparison following an addition is complex and is often a critical path in the actual implementation of a design. Since the result of the operation determines the next task performed by the machine, it needs to be calculated quickly.
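The check described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a model of any particular processor: the name `jump_prediction_correct`, the 32-bit address width, and the treatment of the added number as a signed offset are all hypothetical choices made for the example.

```python
ADDRESS_BITS = 32                      # assumed address width
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def jump_prediction_correct(instruction_pointer: int,
                            offset: int,
                            predicted_target: int) -> bool:
    """Return True when the guessed jump address matches the computed one."""
    # Addition step: jump location = current instruction pointer + a number.
    # Masking also maps a negative offset onto its two's complement form.
    actual_target = (instruction_pointer + offset) & ADDRESS_MASK
    # Comparison step: a match means the jump was correctly guessed;
    # a mismatch means the machine must stall and resume from the jump.
    return actual_target == (predicted_target & ADDRESS_MASK)


if __name__ == "__main__":
    print(jump_prediction_correct(0x1000, 0x20, 0x1020))   # correct guess
    print(jump_prediction_correct(0x1000, 0x20, 0x1024))   # wrong guess
```

Written this way, the comparison depends on the complete sum, so in a conventional design the adder delay and the comparator delay add together on the path that decides what the machine does next.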
Accordingly, there exists a need for overcoming the disadvantages of the prior art as discussed above, and in particular for improving the processing speed of compare operations following addition operations in computing systems, such as those required for high speed and pipelined computing systems and for fast branch operations.
An approach to performing a fast comparison following an addition is proposed and is shown to significantly reduce delay as compared to a conventional implementation. Computer processing speed may increase by about 45%. This is a significant improvement that enhances the commercial viability of a fast computing system.
A preferred embodiment of the present invention does not require a carry propagate addition to be completed prior to a comparison being performed. The resultant novel solution has a smaller delay and requires less hardware than a conventional solution.
According to a preferred embodiment of the present invention, a full adder followed by XOR and AND logical operations replaces a conventional wide carry propagate addition followed by a compare operation. This improves computation speed by about 45%.
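One way such a full adder, XOR, and AND arrangement can decide whether X + Y equals Z without propagating a carry is sketched below. The wiring shown is an assumption, since the passage above names only the three logic stages: the sketch feeds X, Y, and the bitwise complement of Z into a row of independent full adders, shifts the carry outputs left by one position, XORs them with the sum outputs, and ANDs all of the resulting bits. It relies on the identity that X + Y + NOT(Z) is all ones exactly when X + Y equals Z modulo 2^width. The name `predictive_equal` and its `width` parameter are hypothetical.

```python
def predictive_equal(x: int, y: int, z: int, width: int = 32) -> bool:
    """Test (x + y) mod 2**width == z without a carry propagate addition."""
    mask = (1 << width) - 1
    x, y = x & mask, y & mask
    not_z = ~z & mask

    # Row of full adders, one per bit, with no carry chain between them.
    sum_bits = x ^ y ^ not_z
    carry_bits = (x & y) | (x & not_z) | (y & not_z)

    # Each carry belongs to the next higher bit; a 0 enters at bit 0 and
    # the carry out of the top bit is discarded (assumed wraparound).
    shifted_carry = (carry_bits << 1) & mask

    # XOR stage: a bit is 1 only where the sum and shifted carry differ.
    differs = sum_bits ^ shifted_carry

    # AND stage: equality holds only if every bit of the XOR output is 1.
    return differs == mask


if __name__ == "__main__":
    print(predictive_equal(5, 7, 12))   # 5 + 7 == 12
    print(predictive_equal(5, 7, 13))   # 5 + 7 != 13
```

Every stage in this sketch operates on each bit position using at most its immediate lower neighbor, so the depth of the logic does not grow with a carry chain; only the final AND across all bits scales with the word width.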
Fast branch instructions in a high speed computing architecture, according to a preferred embodiment of the present invention, provide a fast test of whether the result of an arithmetic logic unit addition equals zero (e.g., branch on zero), or equals a particular number (e.g., branch if a particular number value is reached).
Pipelined computer architectures additionally benefit from fast compare following addition operations to predict branch addresses for jump instructions in the pipeline.