A number s has an arctangent denoted by .theta.=tan.sup.-1 (s) which is a real number in the interval [-.pi./2, .pi./2] satisfying the equation tan(.theta.)=s. The value .theta. is the angle between the x-axis and a line with slope s. A pair of numbers (x,y) define a point in the x-y plane. Associated with the pair of numbers is an arctangent denoted by .theta.=tan.sup.-1 (y,x) which is a real number in the interval (-.pi.,.pi.] satisfying the equations ##EQU1## whenever x and y are not both 0. Although there are older and more inefficient methods for finding an arctangent .theta.=tan.sup.1- (s) or an arctangent .theta.=tan.sup.-1 (y,x), the most recent and relevant prior art for finding the arctangent of a number comes from Shmuel Gal and Boris Bachelis, An Accurate Elementary Mathematical Library for the IEEE Floating Point Standard, ACM Transactions on Mathematical Software, 17, 1991, 26-45.
In order to understand the following methods and systems, a more detailed explanation of the prior art methods is needed. FIG. 1 (Prior Art) is a unit circle showing the different regions which are considered for numerical arguments in the teachings of Gal and Bachelis.
For calculating .theta.=tan.sup.-1 (s), different cases are used depending on whether the argument s lies in the region a, b, c or d in FIG. 1. Region "a" includes only arguments s which satisfy .vertline.s.vertline.&lt;2.sup.-4. In this region, the arctangent is computed using a polynomial approximation: EQU tan.sup.-1 (s)=s+c.sub.3 .multidot.s.sup.3 +c.sub.5 .multidot.s.sup.5 +. . . +c.sub.13 .multidot.s.sup.13. (1)
The relative size of the remaining terms to the first term is less than (2.sup.-4).sup.2, which sets the scale for the maximum error in Units in the Last Place ("ulp") of about 2.sup.-8 ulp.
Region "b" includes arguments s which satisfy 2.sup.-4 .ltoreq..vertline.s.vertline..ltoreq.1. The argument s is used to compute an index i=.left brkt-bot.2.sup.8 .multidot..vertline.s.vertline..right brkt-bot., where the brackets indicate truncation to an integer. The index "i" is used to select a reference value s.sub.i and a set of polynomial coefficients P.sub.j.sup.i and the arctangent is computed as: EQU tan.sup.-1 (s)=sign(s)(P.sub.0.sup.i +P.sub.1.sup.i .multidot.(.vertline.s.vertline.-s.sub.i)+. . . +P.sub.5.sup.i .multidot.(.vertline.s.vertline.-s.sub.i).sup.5). (2)
The value 2.sup.8 is chosen so that the relative size of the remaining terms to the first term is about 2.sup.-8 setting the scale for the maximum error of about 2.sup.-8 ulp. The reference values s.sub.i are chosen so that the leading term P.sub.0.sup.i is almost exactly the arctangent of s.sub.i.
Region "c" includes arguments for which 1&lt;.vertline.s.vertline..ltoreq.2.sup.4. In this region, a division is first used to compute 1/.vertline.s.vertline., which lies in the interval [2.sup.-4, 1) and the reults is computed using the mathmatical identity: ##EQU2## The arctangent on the right-hand side of EQ. (3) is computed using the method appropriate for region b but for argument 1/.vertline.s.vertline. instead of s. Some extra care is needed to preserve precision in the division and subtraction operations.
Region "d" includes arguments for which 2.sup.4 &lt;s. In this region, a division is first used to compute 1/.vertline.s.vertline., which lies in the interval [0,2.sup.-4) and the result is computed using the mathematical identity of EQ. (3). In this region, the arctangent on the right-hand side of EQ. (3) is computed using the method appropriate for region a with argument 1/.vertline.s.vertline. instead of s. Some extra care is needed to preserve precision in the division and subtraction operations.
There are disadvantages with Gal and Bachelis's method for computing the arctangent .theta.=tan.sup.-1 (s):
First, a lot of polynomials are needed in region "b". The reason for this is that the number of polynomials must be about 2.sup.8 in order that the scale for the maximum error should be about 2.sup.-8 ulp. This uses up a large amount of memory for a complete set of tables. FIG. 2 (Prior Art) is a unit circle illustrating the density of polynomials needed for this method. For each line extending from the center to the edge of the unit circle there are actually three different sets of polynomial coefficients required. Using this many coefficients is likely to severely impact performance of any cached memory system from which these coefficients are obtained.
Second, a division operation is needed in region "c". This is very disadvantageous because for many floating-point processors division is very much slower than multiplication or addition and may also prevent execution of other floating-point operations until the division is completed. There are probably two reasons for using the division. First, the factor of 2.sup.8 to obtain an index would require a prohibitive number of polynomials to achieve the desired level of accuracy. Second, it is not apparent that a polynomial expansion in an argument larger than 1 can converge rapidly enough to be useful.
For calculating .theta.=tan-1(y,x), different cases are used depending on whether the argument s lies in the region a, a', b, b', c or d in FIG. 1.
In region "a", the computation is performed using the relationship: ##EQU3## which requires a division. The value s on the right-hand side of EQ. (4) is the floating point value of y/x computed using by a division operation. In addition, an extra division operation is performed to compute a correction term .delta. which very nearly equals the difference between s and y/x. Without this correction term, the accuracy would be limited to about 1/2 ulp before rounding. The tan.sup.-1 (s+.delta.) appearing on the right-hand side is computed using EQ. (1). Only the linear term in .delta. from EQ. (1) is needed, but it must be added at the right time to avoid loss of significance.
In region "b", the computation is performed using the relationship: ##EQU4## which requires a division. The value s on the right-hand side of EQ. (5) is the floating point value of .vertline.y/x.vertline. computed using by a division operation. In addition, an extra division operation is performed to compute a correction term .delta. which very nearly equals the difference between s and .vertline.y/x.vertline.. Without this correction term, the accuracy would be limited to about 1/2 ulp before rounding. The tan.sup.-1 (s+.delta.) appearing on the right-hand side is computed using EQ. (2). Only the linear term in .delta. from EQ. (2) is needed, but it must be added at the right time to avoid loss of significance.
In region "c", the computation is performed using the relationship: ##EQU5## which requires a division. The value s on the right-hand side of EQ. (6) is the floating point value of .vertline.x/y.vertline. computed using by a division operation. In addition, an extra division operation is performed to compute a correction term .delta. which very nearly equals the difference between s and .vertline.x/y.vertline.. Without this correction term, the accuracy would be limited to about 1/2 ulp before rounding. The tan.sup.-1 (s+.delta.) appearing on the right-hand side is computed using EQ. (2). Only the linear term in .delta. from EQ. (2) is needed, but it must be added at the right time to avoid loss of significance. In addition, since the exact value of .pi./2 is not exactly representable as a single floating point number, it is approximated as the sum of two floating point numbers. Care is taken so that the additions are done in an order which preserves the accuracy of the result to about 2.sup.-8 ulp before the final rounding.
In region "d", the computation is performed using the relationship of EQ. (6) which requires a division. The value s on the right-hand side of EQ. (6) is the floating point value of .vertline.x/y.vertline. computed using by a division operation. In addition, an extra division operation is performed to compute a correction term .delta. which very nearly equals the difference between s and .vertline.x/y.vertline.. Without this correction term, the accuracy would be limited to about 1/2 ulp before rounding. The tan.sup.-1 (s+.delta.) appearing on the right-hand side is computed using EQ. (1). Only the linear term in .delta. from EQ. (1) is needed, but it must be added at the right time to avoid loss of significance. In addition, since the exact value of .pi./2 is not exactly representable as a single floating point number, it is approximated as the sum of two floating point numbers. Care is taken so that the additions are done in an order which preserves the accuracy of the result to about 2.sup.-8 ulp before the final rounding.
In region a', the computation is performed using the relationship: ##EQU6## which requires a division. The value s on the right-hand side of EQ. (7) is the floating point value of .vertline.y/x.vertline. computed using by a division operation. In addition, an extra division operation is performed to compute a correction term .delta. which very nearly equals the difference between s and .vertline.y/x.vertline.. Without this correction term, the accuracy would be limited to about 1/2 ulp before rounding. The tan.sup.-1 (s+.delta.) appearing on the right-hand side is computed using EQ. (1). Only the linear term in .delta. from EQ. (1) is needed, but it must be added at the right time to avoid loss of significance. In addition, since the exact value of .pi. is not exactly representable as a single floating point number, it is approximated as the sum of two floating point numbers. Care is taken so that the additions are done in an order which preserves the accuracy of the result to about 2.sup.-8 ulp before the final rounding.
In region b', the computation is performed using the relationship of EQ. (7) which requires a division. The value s on the right-hand side of EQ. (7) is the floating point value of .vertline.x/y.vertline. computed using by a division operation. In addition, an extra division operation is performed to compute a correction term .delta. which very nearly equals the difference between s and .vertline.x/y.vertline.. Without this correction term, the accuracy would be limited to about 1/2 ulp before rounding. The tan.sup.-1 (s+.delta.) appearing on the right-hand side is computed using EQ. (2). Only the linear term in .delta. from EQ. (2) is needed, but it must be added at the right time to avoid loss of significance. In addition, since the exact value of .pi. is not exactly representable as a single floating point number, it is approximated as the sum of two floating point numbers. Care is taken so that the additions are done in an order which preserves the accuracy of the result to about 2.sup.-8 ulp before the final rounding.
There are disadvantages with Gal and Bachelis's method for computing the arctangent .theta.=tan.sup.-1 (y, x):
First, a lot of polynomials are needed in region "b". The reason for this is that the number of polynomials must be about 2.sup.8 in order that the scale for the maximum error should be about 2.sup.-8 ulp. This uses up a large amount of memory for a complete set of tables. FIG. 2 illustrates the density of polynomials needed for this method. For each line from the center to the edge of the unit circle there are approximately four different sets of polynomial coefficients which are needed. Using this many coefficients is likely to severely impact performance of a cached memory system from which these coefficients are obtained.
Second, two division operations are used in each case. The first division is used to compute an approximation to either x/y or y/x. The second is used in computing the correction .delta.. This is disadvantageous because for many floating point processors division is very much slower than multiplication or addition and may also prevent execution of other floating point operations until the division is completed.