This invention relates generally to microprocessing, and more particularly to providing methods to improve fixed point arithmetic operations. Fixed point arithmetic may also be referred to as integer arithmetic, in that all operands and results are integers.
Divide instructions in general require many cycles to achieve the desired output precision specified by a computing architecture. As such, many different algorithms have emerged to take advantage of different dataflow architectures in order to increase performance and throughput of these “slow” instructions.
Most implementations of fixed point, i.e., integer, divide are based on the SRT (Sweeney Robertson Tocher) divide algorithm, which resembles the repeated-subtraction method of “long division” done by hand. This method produces a fixed number of quotient bits each cycle, usually 1 or 2.
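The repeated-subtraction idea can be illustrated with a minimal sketch. The code below is plain restoring (shift-and-subtract) division for unsigned operands, not the SRT algorithm itself; it is shown only because it also produces exactly one quotient bit per step. The function name and the `width` parameter are assumptions made for this illustration.

```python
from typing import Tuple


def restoring_divide(dividend: int, divisor: int, width: int = 8) -> Tuple[int, int]:
    """Unsigned shift-and-subtract division, one quotient bit per step.

    Illustrative only: this is restoring division, not SRT.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if dividend < 0 or dividend >= (1 << width):
        raise ValueError("dividend must fit in 'width' unsigned bits")

    remainder = 0
    quotient = 0
    # Walk the dividend from its most significant bit to its least.
    for i in range(width - 1, -1, -1):
        # Bring down the next dividend bit, as in long division by hand.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # Trial subtraction: it succeeds only if the divisor fits.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << i  # this step's quotient bit is 1
    return quotient, remainder


print(restoring_divide(599, 300, width=16))
```

Each pass through the loop yields one exact quotient bit and an exact partial remainder, so the loop runs a fixed `width` number of times regardless of the operand values.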
Some other implementations of integer divide are referred to as “iterative” algorithms. These may develop an early low precision estimate of the quotient and then iterate on that estimate until the required precision is achieved. In these schemes, each successive iteration results in a new value of the quotient with about double the precision of the previous iteration. Well-known iterative algorithms are the Newton-Raphson and Goldschmidt algorithms.
Since intermediate results are inexact and require fractional values, iterative divide algorithms require that the data be represented as floating point numbers. Consequently, use of iterative algorithms for integer divide requires that the operands be converted to floating point values. When sufficient precision of the quotient is achieved, it is converted to the largest integer whose magnitude is less than or equal to that of the exact ratio of the dividend and divisor. This exact ratio may be referred to as the infinitely precise quotient.
Because the intermediate quotient is inexact, it requires greater precision than that of the final result. For example, using decimal arithmetic, suppose that 600 is divided by 300, resulting in the value 2. Suppose also that after one iteration, the intermediate value is known only to lie between 1.95 and 2.05. It might be tempting to just round this to 2. But suppose instead that 599 is divided by 300, or that 601 is divided by 301. In either case, the ratio is approximately 1.9967. Since the result must not be greater than this ratio, it must be rounded down to 1. Although the final quotient requires only one digit, the intermediate quotient needs at least four accurate digits in order to distinguish between the correct results for each case, which is even greater than the number of digits in the dividend.
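The doubling of precision per iteration can be seen in a minimal sketch of a Newton-Raphson reciprocal iteration. The function name, the use of Python double precision floats, and the linear first guess for the reciprocal are all assumptions of this illustration and are not taken from any particular hardware implementation.

```python
import math
from typing import List


def newton_quotient_estimates(dividend: int, divisor: int, iterations: int = 4) -> List[float]:
    """Return the quotient estimate after each Newton-Raphson reciprocal step."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("this sketch handles a non-negative dividend and a positive divisor")

    # Scale the divisor so that divisor == m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(float(divisor))
    # Assumed linear first guess for 1/m; its relative error is at most 1/17.
    x = 48.0 / 17.0 - (32.0 / 17.0) * m

    estimates = []
    for _ in range(iterations):
        # One Newton-Raphson step: the relative error is roughly squared.
        x = x * (2.0 - m * x)
        # Undo the scaling to obtain an estimate of dividend / divisor.
        estimates.append(math.ldexp(dividend * x, -e))
    return estimates


for estimate in newton_quotient_estimates(599, 300):
    print(estimate)
```

Under these assumptions, the first estimate for 600 divided by 300 falls slightly below 2, so truncating it at that point would yield 1 rather than the correct result of 2. This is the same hazard as in the decimal example: the estimate must be accurate enough to separate an exact integer ratio from a ratio just below it before it can safely be converted to an integer.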
Variations of the SRT algorithms, on the other hand, may finish after obtaining just the required number of bits or digits of the quotient result. That is because the quotient digits that are developed at each step are exact, with a corresponding exact remainder.
Integer divide instructions must be capable of handling high precision operands and results, but “normal” workloads often work with operands of much less precision, thus requiring less precision in the result. For iterative algorithms, as seen in the example above, the intermediate result needs greater precision than that of the dividend. This required precision does not depend on the precision of the divisor.
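The relationship between dividend precision and iteration count can be sketched as follows. This is a rough model only: it assumes that the initial estimate has `initial_bits` accurate bits, that each iteration exactly doubles the number of accurate bits, and that one bit more than the bit length of the dividend is sufficient. The function name, the default of 4 initial bits, and the single guard bit are hypothetical, and the sketch ignores rounding in intermediate steps and any final correction of the quotient.

```python
def iterations_needed(dividend: int, initial_bits: int = 4) -> int:
    """Count precision-doubling iterations for a dividend of a given size.

    Assumption: the estimate needs one more accurate bit than the
    dividend has significant bits; the divisor does not enter into it.
    """
    if dividend < 0 or initial_bits <= 0:
        raise ValueError("dividend must be non-negative and initial_bits positive")

    required_bits = dividend.bit_length() + 1  # one assumed guard bit
    accurate_bits = initial_bits
    iterations = 0
    while accurate_bits < required_bits:
        accurate_bits *= 2  # each iteration roughly doubles the precision
        iterations += 1
    return iterations


print(iterations_needed(599))
print(iterations_needed(2**63 - 1))
```

In this model a 10-bit dividend such as 599 needs fewer iterations than a 63-bit dividend, which is the sense in which operands of lower precision leave room for finishing sooner.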
In such instances, it would be advantageous to end the divide operation early when the required precision of the intermediate result is achieved. Early completion requires fewer iterations and would thus improve performance.