Floating-point computation is important in many domains requiring a high degree of precision and dynamic range, including many embedded applications such as coefficient computation for digital subscriber line modems, graphics, and the like.
Floating-point numbers are stored in three parts: a sign, a mantissa and an exponent. A typical representation of a floating-point number is as follows: (−1)^S × 1.xxxx × 2^yyyy, where S is the sign, xxxx is the mantissa and yyyy is the exponent. The floating-point number is positive when S is 0 and negative when S is 1. The 1.xxxx is usually referred to as the “significand” of the floating-point number. The sign and significand together create a “sign-magnitude” representation. The position to the left of the binary point in the significand is called the “integer” bit. The integer bit can be either explicitly included in a floating-point format or excluded. When the integer bit is excluded, it is called a “hidden” integer bit. For example, the Institute of Electrical and Electronics Engineers (IEEE) 754 floating-point standard defines single-precision and double-precision formats having hidden integer bits. The size of the mantissa and the size of the exponent may vary depending on the type of precision used.
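The layout described above can be illustrated with a short sketch that unpacks an IEEE 754 single-precision (binary32) number into its sign, exponent and significand. The function name `decode_single` and the restriction to normal (non-zero, non-subnormal) inputs are illustrative assumptions, not part of the standard itself:

```python
import struct

def decode_single(value: float):
    """Split an IEEE 754 binary32 number into sign, true exponent, significand.

    Bit layout: [31] sign S, [30:23] biased exponent, [22:0] mantissa.
    The integer bit is hidden: for normal numbers it is implicitly 1,
    so the significand is 1.mantissa. (Sketch: normal numbers only.)
    """
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31                     # S: 0 positive, 1 negative
    biased_exp = (bits >> 23) & 0xFF      # stored exponent, bias 127
    mantissa = bits & 0x7FFFFF            # the xxxx fraction bits
    exponent = biased_exp - 127           # true exponent yyyy
    significand = 1.0 + mantissa / (1 << 23)  # restore hidden integer bit
    return sign, exponent, significand

# -6.5 = (-1)^1 * 1.625 * 2^2
sign, exponent, significand = decode_single(-6.5)
```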
In performing floating-point operations, conventional floating-point units align the two operands. During alignment, the floating-point unit compares the exponents of the two operands and increases the smaller exponent until it equals the larger exponent. To keep the value of the smaller operand unchanged, the floating-point unit also right-shifts the significand of the smaller operand by the same amount. If the least significant bits shifted out of the significand are simply discarded, information is lost. Therefore, conventional floating-point units store some of the shifted-out bits in order to maintain precision.
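The alignment step can be sketched as follows, treating each operand as an exponent plus an integer significand (hidden bit included). The helper name `align` is hypothetical; note that this simplified version discards the shifted-out bits, which is exactly the information loss the guard, round and sticky bits are introduced to avoid:

```python
def align(exp_a, sig_a, exp_b, sig_b):
    """Align two operands so both significands share the larger exponent.

    Each operand is (exponent, integer significand). The operand with
    the smaller exponent has its significand right-shifted by the
    exponent difference; its exponent is raised to match. This sketch
    simply drops the shifted-out bits, losing precision.
    """
    shift = abs(exp_a - exp_b)
    if exp_a < exp_b:
        sig_a >>= shift        # shifted-out bits are lost here
        exp_a = exp_b
    else:
        sig_b >>= shift
        exp_b = exp_a
    return exp_a, sig_a, exp_b, sig_b

# 0b1100 * 2^2 aligned against 0b1000 * 2^5: shift right by 3
aligned = align(5, 0b1000, 2, 0b1100)
```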
Typically, three of these bits are stored, and they are known as the guard, round and sticky bits. The floating-point unit right-shifts data from the significand into the guard bit and the round bit. Thus, these bits are simply the two most recently shifted-out bits. The sticky bit is the logical OR of all the bits that are less significant than the round bit.
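A minimal sketch of this bookkeeping follows, under the assumption that the significand is held as an integer and the shift amount is known; the function name `shift_with_grs` is illustrative. The guard bit is the most significant shifted-out bit, the round bit is the next one, and the sticky bit ORs everything below the round bit:

```python
def shift_with_grs(sig, shift):
    """Right-shift an integer significand, capturing guard/round/sticky.

    guard  = most recently shifted-out bit (just below the result's LSB)
    round  = next most recently shifted-out bit
    sticky = OR of all shifted-out bits less significant than the round bit
    """
    if shift == 0:
        return sig, 0, 0, 0
    shifted_out = sig & ((1 << shift) - 1)          # bits that fall off
    result = sig >> shift
    guard = (shifted_out >> (shift - 1)) & 1
    round_bit = (shifted_out >> (shift - 2)) & 1 if shift >= 2 else 0
    sticky = 1 if shift >= 3 and (shifted_out & ((1 << (shift - 2)) - 1)) else 0
    return result, guard, round_bit, sticky

# Shifting 0b10110111 right by 4: shifted-out bits are 0111, so
# guard = 0, round = 1, and sticky ORs the remaining bits 11 -> 1.
result, g, r, s = shift_with_grs(0b10110111, 4)
```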
Many techniques have been developed to calculate the sticky bit. For example, conventional techniques include building a mask, building a selector over differently sized results, and building a trailing-zero counter whose output is compared with the alignment shift count. However, because the goal is generally to calculate the guard, round and sticky bits as quickly as possible regardless of implementation cost, conventional techniques fail to balance the speed and implementation complexity of the calculation. Such a balance is especially important when adding floating-point support to an integer processor pipeline and in other similar applications.