Processing devices such as digital signal processors, microprocessors, microcomputers, and micro-controllers include a plurality of elements such as memory, an arithmetic logic unit (ALU), a clock, interrupt processing elements, and data buses that couple these elements together. The ALU performs arithmetic functions for the processing device. In particular, the ALU can perform addition, subtraction, multiplication, and logical operations such as AND, OR, NAND, etc. A typical ALU includes a clock reference, which is synchronized to the clock of the processing device, a control generation unit, control logic, a priority encoder, a shifting element, and an operand element. For most arithmetic functions, with the exclusion of a shift-left command, the priority encoder provides the arithmetic resultant for a particular set of opcodes. While the arithmetic function is being executed, the ALU checks for overflow conditions. As is known, an overflow condition arises when the arithmetic resultant exceeds the bit size of a resultant register.
For arithmetic shift-left functions, the shifting element shifts the numerical resultant, or operand, by a predetermined shift amount. When the predetermined shift amount requires at least one bit of significance of the operand to be lost (i.e., shifted out), an overflow condition arises. With prior art techniques, detection of this overflow condition is delayed because the output of the priority encoder (i.e., the numerical resultant) takes a full clock phase to become stable. This occurs because the priority encoder samples an input by first enabling a precharge device during the first clock phase of a clock cycle and then enabling a discharge device during the second clock phase of the clock cycle.
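By way of illustration only, the overflow condition described above can be modeled in software as a range check on the shifted value. The following is a minimal sketch, not the circuit itself: the function name `shl_overflows`, the signed two's-complement interpretation, and the 16-bit default width are all assumptions made for the example.

```python
def shl_overflows(operand: int, shift: int, width: int = 16) -> bool:
    """Return True if shifting a signed `width`-bit operand left by `shift`
    would shift out at least one significant bit.

    Hypothetical model: operands are signed two's-complement values and the
    16-bit default width is an assumption, not taken from the text.
    """
    lo = -(1 << (width - 1))        # most negative representable value
    hi = (1 << (width - 1)) - 1     # most positive representable value
    if not lo <= operand <= hi:
        raise ValueError("operand does not fit in the given width")
    if shift < 0:
        raise ValueError("shift amount must be non-negative")
    # Python integers are unbounded, so this shift is exact; overflow is
    # simply the exact result falling outside the representable range.
    shifted = operand << shift
    return not lo <= shifted <= hi
```

For example, under these assumptions `shl_overflows(0x4000, 1)` reports an overflow because the shifted value no longer fits in 16 signed bits, while `shl_overflows(0x2000, 1)` does not.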
During the second clock phase, an overflow is sensed by an adder, which sends the overflow condition to the control generation unit. The control generation unit provides an overflow signal to a saturation register, which routes a saturation value to memory. It takes at least one clock phase to process the overflow condition and store the saturation information in the memory. Typically, it takes more than one clock phase to execute these steps due to parasitics within a processing device, which is implemented on an integrated circuit. Because of this, a user of the processing device must wait at least one additional clock cycle to use the resultant of the arithmetic shift-left function, thus adding unnecessary processing steps.
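The effect of routing a saturation value on overflow can likewise be sketched in software. The example below is a behavioral model under stated assumptions, not the prior art hardware: `saturating_shl` is a hypothetical name, operands are assumed to be signed two's-complement values, the 16-bit default width is assumed, and the saturation value is assumed to be the largest positive or most negative representable value according to the sign of the operand.

```python
def saturating_shl(operand: int, shift: int, width: int = 16) -> tuple[int, bool]:
    """Shift a signed `width`-bit operand left by `shift`.

    Returns (result, overflow). On overflow the result is replaced by a
    saturation value chosen by the operand's sign (an assumed convention).
    """
    lo = -(1 << (width - 1))        # most negative representable value
    hi = (1 << (width - 1)) - 1     # most positive representable value
    if not lo <= operand <= hi:
        raise ValueError("operand does not fit in the given width")
    if shift < 0:
        raise ValueError("shift amount must be non-negative")
    shifted = operand << shift      # exact, since Python ints are unbounded
    if lo <= shifted <= hi:
        return shifted, False       # no significant bit was shifted out
    # Overflow: substitute the saturation value instead of the wrapped result.
    return (hi if operand > 0 else lo), True
```

In this model the overflow flag and the saturated result are produced together; in the hardware described above they become available only after the additional delay discussed in the preceding paragraph.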
In some applications, the arithmetic shift-left operation is rarely used. Thus, waiting an extra clock cycle every so often does not greatly affect the overall execution time of the processing device. In other applications, the shift-left operation is used extensively, producing substantial delays in execution time. One such application is the audio compression algorithms used in communication equipment. By having to wait an extra clock cycle each time the shift-left operation is used, the audio compression capabilities are limited. For example, an audio compression algorithm that digitizes audio into a 4.8 Kbit stream has more execution steps, and thus requires more execution time, than an audio compression algorithm that digitizes audio into a 64 Kbit stream. In addition to adding extra execution time, which algorithms with high audio compression cannot afford, waiting an extra clock cycle consumes additional power.
One solution to the arithmetic shift-left delay is to continuously load the output of the priority encoder such that the output becomes stable within the first clock phase. The overflow determination can then begin in the first clock phase and be completed during the second clock phase, so that the overflow condition is usable in the next clock cycle. While this technique eliminates the one clock cycle wait problem, it requires a substantial amount of current to maintain the load. For example, if the load is a self-biased sense amplifier, 0.5 to 1.0 mA/bit is consumed. This is an impractical solution for battery operated communication devices.
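To give a sense of scale, the per-bit figure above can be multiplied out over the width of the priority encoder output. The sketch below is illustrative arithmetic only: the function name `load_current_range_ma` is hypothetical, and the 16-bit width used in the usage note is an assumption, since the text does not state a word width.

```python
def load_current_range_ma(bits: int) -> tuple[float, float]:
    """Return the (minimum, maximum) continuous load current in mA for an
    output of `bits` bits, using the 0.5 to 1.0 mA/bit figure quoted above."""
    if bits <= 0:
        raise ValueError("bit count must be positive")
    per_bit_min_ma = 0.5    # lower bound per bit, from the text
    per_bit_max_ma = 1.0    # upper bound per bit, from the text
    return per_bit_min_ma * bits, per_bit_max_ma * bits
```

Under the assumed 16-bit width, `load_current_range_ma(16)` gives 8 to 16 mA drawn continuously by the load alone.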
Therefore, a need exists for a method and apparatus that eliminate the one clock cycle wait problem for overflow determinations while minimizing power consumption.