1. Field of the Invention
The present invention relates generally to calculators, computers, and arithmetic errors, and more particularly to apparatus useful for reducing errors in devices that perform floating-point arithmetic.
2. Description of Related Art
Due to the finite size of storage locations and registers, computers either truncate or round each real number to a floating-point number having a mantissa length fixed by the computer. Subsequent manipulations of floating-point numbers often involve further truncation or rounding. For example, a computer using two digit mantissas can store 1.2 and 3.4 exactly, but the same computer must generally truncate or round the product, which has three digits (i.e., $1.2 \times 3.4 = 4.08$), to a value of either 4.0 or 4.1. The reduction of errors arising from floating-point arithmetic is an important consideration in computer design and operation.
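The two-digit example above can be reproduced with a short sketch. It assumes that Python's standard `decimal` module is an acceptable stand-in for a machine with a two digit decimal mantissa; the helper name `multiply_two_digit` is hypothetical and is not part of the original disclosure.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_UP


def multiply_two_digit(a, b, rounding):
    """Multiply two decimal strings on a simulated two-digit-mantissa machine.

    Assumption: a Context with prec=2 models the fixed mantissa length.
    """
    ctx = Context(prec=2, rounding=rounding)
    # The operands are converted exactly; only the product is reduced to two digits.
    return ctx.multiply(Decimal(a), Decimal(b))


truncated = multiply_two_digit("1.2", "3.4", ROUND_DOWN)    # truncation
rounded = multiply_two_digit("1.2", "3.4", ROUND_HALF_UP)   # rounding
print(truncated, rounded)
```

Neither stored value equals the exact product 4.08, which is the loss the paragraph describes.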
The relative error is generally related to the precision of the floating-point number. If $c_e$ is the error resulting from representing $c$ by a floating-point number $\mathrm{fl}(c)$, then $c_e = c - \mathrm{fl}(c)$. The relative error associated with $c$ is defined by:
$$\mathrm{error}_R(c) = \left| \frac{c_e}{c} \right| = \frac{|c - \mathrm{fl}(c)|}{|c|}.$$
If $\mathrm{fl}(c)$ has $b$ binary digits, $\mathrm{error}_R(c)$ is generally of the order of $2^{-b-1}$.
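As an illustration of the definition, the following sketch computes the relative error exactly with Python's `fractions` module. The function name `relative_error` is a hypothetical helper, and the inputs reuse the two-digit product from the earlier example (4.08 stored as 4.1); this is one possible way to evaluate the formula, not a method taken from the source.

```python
from fractions import Fraction


def relative_error(c, fl_c):
    """Return |c - fl(c)| / |c| as an exact fraction.

    Assumption: c is nonzero and both values are given as decimal strings.
    """
    c = Fraction(c)
    fl_c = Fraction(fl_c)
    return abs(c - fl_c) / abs(c)


err = relative_error("4.08", "4.1")
print(err, float(err))
```

Exact rational arithmetic is used so that the measurement of the error is not itself disturbed by floating-point rounding.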
The computer evaluation of a polynomial $P_m(x)$ can propagate floating-point errors. For example, precision may be lost because a large number of factors appear in a term of $P_m(x)$. This loss of precision becomes more important as the degree $m$ of $P_m(x)$ increases.
Even the computer evaluation of simple polynomials can result in floating-point errors. For example, consider $P_2(x) \equiv 1.00 + x + x^2$. For $x = 0.011$, the exact result is $P_2(0.011) = 1.100001$ (the values in this example are binary numerals; in decimal notation, $x = 0.375$ and $P_2(x) = 1.515625$). On a computer using three digit mantissas, an accurate evaluation should give the value $\mathrm{fl}(1.100001) = 1.10$. A prior art computer, with an arithmetic logic unit that rounds to three digits, performs a sequence of steps to evaluate $P_2(0.011)$. First, the computer adds 1.00 to $x$. Since the arithmetic logic unit manipulates three digit mantissas, the addition normally starts by rounding 1.00 and $x$ to the three most significant digits. After rounding, 1.00 and $x$ remain unchanged. The adder sums 1.00 and 0.011 to obtain $1 + x = 1.011$ and rounds to three digits to obtain 1.10. Next, the computer stores $x^2$ as $\mathrm{fl}(x^2) = 0.00101$. Then, the computer adds $\mathrm{fl}(x^2)$ to the previous sum and rounds the result to three digits again, giving:
$$\mathrm{fl}(1.10 + 0.00101) = 1.11.$$
Consequently, the computer result is $P_2(0.011) = 1.11$, which is the wrong result. The fact that the computer evaluation of such a simple polynomial can lead to an incorrect result underlines the gravity of the problem of errors in floating-point arithmetic. As discussed above, the loss of precision due to floating-point error is potentially even more serious in polynomials of higher degree.
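The step-by-step evaluation above can be checked with a small simulation. The sketch below makes two assumptions that the text does not state explicitly: the numerals 0.011, 1.10, and 1.11 are read as binary, and ties are rounded up (which is what reproduces the stored value 0.00101 for the square). The helper `fl` borrows the text's notation, handles positive values only, and is purely illustrative.

```python
import math
from fractions import Fraction


def fl(value, digits=3):
    """Round a positive value to `digits` significant binary digits, ties up."""
    value = Fraction(value)
    if value == 0:
        return value
    # Find the exponent e with 2**e <= value < 2**(e + 1).
    exponent = 0
    while Fraction(2) ** (exponent + 1) <= value:
        exponent += 1
    while Fraction(2) ** exponent > value:
        exponent -= 1
    # Spacing between neighboring three-digit numbers at this magnitude.
    ulp = Fraction(2) ** (exponent - digits + 1)
    return math.floor(value / ulp + Fraction(1, 2)) * ulp


x = Fraction(3, 8)                    # 0.011 in binary
partial_sum = fl(fl(1) + fl(x))       # 1.011 rounds to 1.10 (3/2)
square = fl(x * x)                    # 0.001001 rounds to 0.00101 (5/32)
stepwise = fl(partial_sum + square)   # 1.10101 rounds to 1.11 (7/4)
accurate = fl(1 + x + x * x)          # 1.100001 rounds to 1.10 (3/2)
print(stepwise, accurate)
```

The stepwise value and the correctly rounded value differ in the last retained digit, matching the discrepancy between 1.11 and 1.10 described above.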
One prior art method to reduce errors associated with floating-point arithmetic involves employing multiple precision calculations. For example, double or quadruple precision eliminates the error associated with the evaluation of $1.2 \times 1.4$ on computers employing two decimal digit mantissas. Nevertheless, multiple precision does not provide a complete solution to the occurrence of errors in floating-point arithmetic. For example, the use of double precision does not generally eliminate errors when the result is needed to double precision. Furthermore, multiple precision calculations generally require more computer resources and time than single precision calculations. Thus, alternative methods for reducing arithmetic errors are desirable.
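The effect of doubling the precision can be sketched as follows, again assuming that Python's `decimal` module models a decimal machine and that "double precision" means four digits on a two-digit machine. The helper `multiply` and the four-digit operands 1.234 and 5.678 are hypothetical values chosen for illustration only.

```python
from decimal import Context, Decimal, ROUND_HALF_UP


def multiply(a, b, digits):
    """Multiply on a simulated machine keeping `digits` significant decimal digits."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_UP)
    return ctx.multiply(Decimal(a), Decimal(b))


single = multiply("1.2", "1.4", 2)   # 1.68 does not fit in two digits
double = multiply("1.2", "1.4", 4)   # 1.68 fits in four digits

# Operands that already occupy four digits still produce a rounded product.
wide = multiply("1.234", "5.678", 4)
wide_exact = Decimal("1.234") * Decimal("5.678")  # default 28-digit context
print(single, double, wide, wide_exact)
```

Doubling the precision removes the error for the two-digit operands, but once the operands themselves use the doubled precision, the product is rounded again.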
The present invention is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.