1. Field of the Invention
The present invention relates to floating-point arithmetic, and more specifically to a floating-point unit that implements Newton-Raphson convergent algorithms for determining reciprocals and reciprocal square roots.
2. Related Art
The reciprocal of a number (N) is defined as 1 divided by N. The reciprocal square root of N is defined as 1 divided by the square root of N.
In digital processing systems, numerical data is typically expressed using an integer or a floating-point representation. A floating-point representation is preferred in many applications because of its ability to express a wide range of values and its ease of manipulation for some specified operations. A floating-point representation includes three components: a sign bit (sign), a mantissa (M) and an exponent (exp). The floating-point number represented is (−1)^sign * M * 2^exp.
A standard code for representing floating-point numbers is the “IEEE Standard for Binary Floating-Point Arithmetic,” which is referred to herein as the IEEE-754 standard (or simply the IEEE standard) and incorporated herein by reference. In the IEEE standard, the exponent consists of 8 bits for single precision floating-point numbers and 11 bits for double precision floating-point numbers, and the mantissa consists of 23 bits for single precision and 52 bits for double precision. Additionally, for both single precision and double precision floating-point numbers, there is a single bit that represents the sign of the number.
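As an illustration of the single precision layout described above, the following Python sketch splits a value into its sign, exponent, and mantissa fields. The function names decode_single and value_of_normal are hypothetical. The sketch also relies on two details of the IEEE standard that the text above does not state: the stored exponent is biased by 127, and normal numbers carry an implicit leading 1 ahead of the 23 stored mantissa bits.

```python
import struct


def decode_single(x):
    """Round x to IEEE-754 single precision and return its three bit fields."""
    # Pack as a 32-bit float, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    biased_exp = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 stored mantissa bits
    return sign, biased_exp, fraction


def value_of_normal(sign, biased_exp, fraction):
    """Rebuild (-1)^sign * M * 2^exp for a normal number.

    Assumption: biased_exp is in 1..254. Zero, subnormals, infinities
    and NaNs use other encodings and are not handled here.
    """
    mantissa = 1 + fraction / 2**23   # implicit leading 1
    return (-1) ** sign * mantissa * 2.0 ** (biased_exp - 127)
```

For example, decode_single(-6.5) returns sign 1, stored exponent 129 and stored mantissa 0x500000, since −6.5 = (−1)^1 * 1.625 * 2^2.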
Many operations can be performed on floating-point numbers, including arithmetic operations, such as addition, subtraction, multiplication, division, and square roots. Because arithmetic operations with floating-point numbers require a great deal of computing power, many microprocessors come with a specialized integrated circuit, called a floating-point unit (FPU), for performing floating-point arithmetic. Floating-point units are also called math coprocessors, numeric coprocessors, coprocessors, or simply processors.
There are two major schemes utilized by an FPU in performing division and square root functions. The first scheme is “digit-by-digit”, and the second scheme is “convergent approximation”. In general, digit-by-digit schemes offer a short delay per iteration, but they also require a large number of iterations or clock cycles. Additionally, digit-by-digit schemes produce a final remainder, which can be used to determine whether a result is exact or inexact.
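To make the digit-by-digit behavior concrete, the following Python sketch uses binary restoring division on unsigned integers. Restoring division is only one example of a digit-by-digit scheme and is chosen here for illustration; the function name restoring_divide and its parameters are hypothetical. It produces one quotient bit per iteration and leaves a final remainder that indicates whether the quotient is exact.

```python
def restoring_divide(dividend, divisor, bits):
    """Binary restoring division: one quotient bit per loop iteration.

    Returns (quotient, remainder). A remainder of 0 means the quotient is exact.
    """
    if divisor <= 0 or dividend < 0 or dividend >= (1 << bits):
        raise ValueError("operands out of range for this sketch")
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        # Shift in the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Keep the subtraction only if it does not go negative.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder
```

With 8-bit operands the loop always runs 8 times: dividing 20 by 4 leaves a remainder of 0 (exact), while dividing 22 by 4 leaves a remainder of 2 (inexact).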
Convergent approximations offer a faster convergence rate (lower number of iterations) than the digit-by-digit scheme, but they do not produce a final remainder; thus, one cannot determine whether a result produced by a convergent approximation is exact or inexact. Nevertheless, convergent approximations provide approximate results that are adequate for most applications.
Convergent approximation algorithms for approximating reciprocals and reciprocal square roots include the Newton-Raphson algorithm. The Newton-Raphson algorithm for approximating the reciprocal of a number (N) is expressed as: X_{i+1} = X_i * (2 − N * X_i), where X_i is an approximation of the reciprocal of N at the ith iteration, i is greater than or equal to 1, and X_{i+1} is a more accurate approximation. For example, if X_i provides 14 bits of accuracy, X_{i+1} provides an accuracy of 28 bits.
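The doubling of accuracy follows from the iteration itself: the error term 1 − N * X_{i+1} equals (1 − N * X_i)^2, so the error is squared at every step. The following Python sketch shows the reciprocal iteration in ordinary double precision arithmetic. The function names and the way the initial estimate x0 is supplied are assumptions made for illustration; the text above does not specify how the first approximation is obtained.

```python
def reciprocal_step(n, x):
    """One Newton-Raphson step toward 1/n: x * (2 - n * x)."""
    return x * (2.0 - n * x)


def reciprocal(n, x0, iterations):
    """Refine the initial estimate x0 of 1/n for a fixed number of steps.

    Assumption: x0 is already close enough to 1/n that |1 - n * x0| < 1,
    otherwise the iteration does not converge.
    """
    x = x0
    for _ in range(iterations):
        x = reciprocal_step(n, x)
    return x
```

For N = 3 and an initial estimate of 0.3, the error 1 − N * X is approximately 0.1, then 0.01, then 0.0001 on successive iterations, matching the squaring behavior.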
The Newton-Raphson algorithm for approximating the reciprocal square root of N is expressed as: X_{i+1} = (3 − N * X_i * X_i) * X_i / 2, where X_i is an approximation of the reciprocal square root of N at the ith iteration, i is greater than or equal to 1, and X_{i+1} is a more accurate approximation.
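The reciprocal square root iteration can be sketched in the same way. As before, the function names and the externally supplied initial estimate x0 are hypothetical, and the sketch uses ordinary double precision arithmetic rather than any particular hardware data path.

```python
def rsqrt_step(n, x):
    """One Newton-Raphson step toward 1/sqrt(n): (3 - n * x * x) * x / 2."""
    return (3.0 - n * x * x) * x / 2.0


def rsqrt(n, x0, iterations):
    """Refine the initial estimate x0 of 1/sqrt(n) for a fixed number of steps.

    Assumption: n > 0 and x0 is a positive estimate reasonably close to
    1/sqrt(n); a poor starting value may fail to converge.
    """
    x = x0
    for _ in range(iterations):
        x = rsqrt_step(n, x)
    return x
```

For N = 4 and an initial estimate of 0.4, the first step yields 0.472 and later steps approach the true value of 0.5.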
What is desired is a method for implementing the Newton-Raphson approximations that increases the accuracy of the approximations without negatively affecting the performance of the system implementing the method.