Shifting a large number of bits is typically required in data format conversions, CORDIC approximations and denormalization operations. The term "sticky bit" is commonly associated with the IEEE standard for binary floating point arithmetic (IEEE 754), where a "sticky bit" is the logical OR of all bits which are discarded as the result of a right shift operation on a data operand. Such a shift operation is commonly performed when aligning two operands for floating point addition or subtraction. Detecting whether any bit shifted out of the resulting operand had a logic one value is valuable information which can be used to improve the precision of the result when a floating point unit is instructed to add or subtract two operands in floating point format. In particular, the sticky bit is used to determine whether or not the resultant operand should be rounded up in order to retain precision. Previous floating point units have used a large bit-size data shifting circuit to perform a right shift operation on the smaller of two operands. Subsequent to the shifting, a microcode sequence is executed by the floating point unit to determine whether or not any of the bits shifted away from the smaller operand had a logic one value, thereby detecting the existence of a sticky bit. The microcode sequence is a multiple step sequence which significantly slows the floating point unit and is therefore undesirable.
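By way of illustration only, the definition of the sticky bit given above may be sketched in software as follows. The function name shift_right_with_sticky and its parameters are hypothetical and do not appear in any cited reference; the sketch assumes a non-negative integer fraction and is not a description of any particular floating point unit.

```python
from typing import Tuple


def shift_right_with_sticky(fraction: int, shift: int) -> Tuple[int, int]:
    """Right-shift a fraction and report the sticky bit.

    Hypothetical helper: the sticky bit is 1 when any bit discarded by
    the right shift had a logic one value, otherwise 0.
    """
    if fraction < 0 or shift < 0:
        raise ValueError("fraction and shift must be non-negative")
    # Mask selecting the low `shift` bits, i.e. the bits that are shifted off.
    discarded_mask = (1 << shift) - 1
    # Logical OR of the discarded bits: nonzero means at least one was set.
    sticky = 1 if (fraction & discarded_mask) else 0
    return fraction >> shift, sticky


# A logic one is discarded: the shifted result is 0b101 and sticky is 1.
print(shift_right_with_sticky(0b101100, 3))
# Only zeros are discarded: the shifted result is again 0b101 but sticky is 0.
print(shift_right_with_sticky(0b101000, 3))
```

The two calls yield the same shifted fraction but different sticky bits, which is the information a rounding step may use to decide whether the result should be rounded up.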
Shown in U.S. Pat. No. 4,864,527 issued to Peng et al. and entitled "Apparatus and Method for Using a Single Carry Chain For Leading One Detection and For 'Sticky' Bit Calculation" is a hardware implementation of detecting a sticky bit in floating point processors. Logic circuitry is associated with each operand fraction position, resulting in a significant propagation delay when large bit size operands are used. In addition, when operands of sixty-four bits or greater are used and each bit position is implemented with its own circuitry in a serial architecture, sticky bit detection requires a large amount of logic circuitry and time.
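The cost of a serial, per-bit-position arrangement may be illustrated by a loose software analogy. The sketch below is an assumption made for illustration and is not the circuit of the cited patent: the hypothetical function serial_sticky visits one bit position per step and counts the steps, so that the step count stands in for the number of serial stages a signal would traverse.

```python
from typing import Tuple


def serial_sticky(fraction: int, shift: int, width: int = 64) -> Tuple[int, int]:
    """Compute the sticky bit one bit position at a time.

    Hypothetical model: returns (sticky, stages), where `stages` is the
    number of bit positions visited in series.
    """
    if fraction < 0 or shift < 0 or width < 0:
        raise ValueError("arguments must be non-negative")
    sticky = 0
    stages = 0
    # Only the low `shift` positions (bounded by the operand width) are discarded.
    for position in range(min(shift, width)):
        bit = (fraction >> position) & 1  # bit leaving the operand at this position
        sticky |= bit                     # OR it into the running result
        stages += 1                       # one more stage in the serial chain
    return sticky, stages


# A 60-bit shift of a 64-bit operand passes through 60 serial stages.
print(serial_sticky(1 << 5, 60))
```

In this analogy the number of stages grows in direct proportion to the shift amount and operand width, which mirrors the propagation delay concern described above for operands of sixty-four bits or greater.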