The present invention relates to floating-point arithmetic processing units such as are employed in digital computers, as well as in various specialized processors which require high speed arithmetic operations.
In digital computers and the like, where a large dynamic range and high precision are required, numbers are usually expressed in floating point (F.P.) representation. Numbers expressed in floating point include a mantissa and an exponent, and normally a sign. Thus, in a floating point environment, a real number X can be approximated by
$$X = m_x r^{e} \tag{1}$$
where r is the radix, m is the signed (M+1)-bit mantissa, and e is the signed (E+1)-bit exponent. Digital computers normally operate on binary numbers, so the radix r=2. With a real number X expressed as above, the precision is on the order of $2^{-M}$, and the largest number is $r^{p}$, where $p = 2^{E}$.
As noted above, floating point representation allows a large dynamic range and high precision, and thus is well suited to scientific calculations. There are of course a great many other applications, for example the processing of complex signal waveforms such as by Fourier analysis. There are, however, disadvantages. A particular disadvantage is that floating point arithmetic operations are relatively slow compared to their fixed point counterparts. Moreover, floating point addition can cause a special problem, especially in real-time control applications where complex signal waveforms must be processed.
In particular, the time it takes to perform a floating point addition or subtraction can vary markedly depending upon the relative values of the data to be added or subtracted. Addition and subtraction conventionally require that the exponents be matched prior to the arithmetic process. Not only is this time consuming, but the time required for this matching process is, as noted just above, data dependent.
In addition, all four arithmetic operations require that the result be normalized by shifting the resultant mantissa until the radix point is properly positioned, and correspondingly adjusting the exponent. This, too, consumes time, which time is data dependent.
More specifically, floating point arithmetic is performed in stages. Multiplication (or division) is a three-step process which proceeds generally as follows:
(1) Multiply (or divide) the mantissas, and add (or subtract) the exponents.
(2) Post normalize the resultant mantissa, and then round.
(3) If required, adjust the exponent depending upon the normalization.
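The three multiplication steps above can be sketched in Python on a toy format in which a number has the value m * 2**e. The mantissa width M = 8, the function name fp_mul, and the use of truncation in place of rounding are all assumptions made for illustration; none of them comes from any particular machine.

```python
M = 8  # hypothetical mantissa magnitude width in bits (not from the text)


def fp_mul(mx, ex, my, ey):
    """Multiply two toy floating point numbers, each with value m * 2**e.

    Operands are assumed normalized: 2**(M-1) <= |m| < 2**M, or m == 0.
    """
    # Step 1: multiply the mantissas and add the exponents.
    sign = -1 if (mx < 0) != (my < 0) else 1
    m = abs(mx) * abs(my)
    e = ex + ey
    if m == 0:
        return 0, 0

    # Step 2: post-normalize the double-width product back to M bits.
    # Rounding is simplified to truncation in this sketch.
    while m >= 1 << M:
        m >>= 1
        e += 1  # Step 3: adjust the exponent for each normalization shift.
    return sign * m, e
```

For example, fp_mul(128, 0, 128, 0) yields (128, 7), which has the value 128 * 2**7 = 16384.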
Floating point addition and subtraction are even more complex. Addition (or subtraction) is a four-step process generally as follows:
(1) Align the exponents, and shift mantissas accordingly.
(2) Add (or subtract) the mantissas.
(3) Post normalize the resultant mantissa, and round the result.
(4) If required, adjust the exponent depending upon the normalization required.
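The four addition steps above, and the data dependence of the alignment and normalization shifts, can be illustrated with a small Python sketch. It uses a toy format in which a number has the value m * 2**e; the width M = 8, the helper names, and the returned shift counts are hypothetical, and rounding is simplified to truncation.

```python
M = 8  # hypothetical mantissa magnitude width in bits (not from the text)


def shift_right_trunc(m, n):
    """Shift a signed integer right by n bits, truncating toward zero."""
    return m >> n if m >= 0 else -((-m) >> n)


def fp_add(mx, ex, my, ey):
    """Add two toy floating point numbers, each with value m * 2**e.

    Returns (m, e, align_shifts, norm_shifts); the two shift counts
    stand in for the data-dependent time discussed in the text.
    """
    # Step 1: align exponents by shifting the mantissa of the operand
    # with the smaller exponent to the right.
    if ex < ey:
        mx, ex, my, ey = my, ey, mx, ex
    align_shifts = ex - ey
    my = shift_right_trunc(my, align_shifts)

    # Step 2: add the aligned mantissas.
    m, e = mx + my, ex
    if m == 0:
        return 0, 0, align_shifts, 0

    # Steps 3 and 4: post-normalize so that 2**(M-1) <= |m| < 2**M,
    # adjusting the exponent once per shift (truncation, no rounding).
    norm_shifts = 0
    while abs(m) >= 1 << M:
        m = shift_right_trunc(m, 1)
        e += 1
        norm_shifts += 1
    while abs(m) < 1 << (M - 1):
        m <<= 1
        e -= 1
        norm_shifts += 1
    return m, e, align_shifts, norm_shifts
```

In this sketch, fp_add(128, 5, 128, 0) needs five alignment shifts and no normalization shifts, whereas fp_add(128, 0, -127, 0) needs no alignment but seven normalization shifts, so the amount of shifting depends on the operand values.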
In present commercial floating point adder/subtractor units, up to one third of an arithmetic cycle can be consumed in an exponent alignment process. In addition, the length of time required to complete an exponent alignment is data dependent and, therefore, variable. In one particular machine (a Digital Equipment Corp. LSI-11), exponent alignment can take up to 17 microseconds, with a basic mantissa add time of 42 microseconds.
Thus, the development of efficient algorithms for processing floating point numbers is an area of continued interest.
Another disadvantage with floating point arithmetic operations is that the data flow paths through the arithmetic unit for multiplication/division and addition/subtraction are sufficiently different so as to demand (in most commercial realizations) two separate hardware units. As a result, the utilization rate of a hardware floating point unit can be as low as 50%.
As an alternative to floating point representation and arithmetic, the logarithmic number system (LNS) has been proposed and implemented to some extent. For example, the logarithmic number system and techniques for performing LNS arithmetic are described in E. E. Swartzlander, Jr. and A. G. Alexopoulos, "The Sign/Logarithm Number System", IEEE Transactions On Computers, Vol. C-24, December 1975, pages 1238-1242; and S. C. Lee and A. D. Edgar, "The Focus Number System", IEEE Transactions On Computers, Volume C-26, No. 11, November 1977, pages 1167-1170. Related to the above two articles are E. E. Swartzlander, Jr., "Comment On 'The Focus Number System'", IEEE Transactions On Computers, Volume C-28, No. 9, September 1979, page 693; and S. C. Lee and A. D. Edgar, "Addendum to 'The Focus Number Systems'", IEEE Transactions On Computers, Volume C-28, No. 9, September 1979, page 693.
In LNS, the mantissa is assigned a value of unity, and the exponent is given a fractional binary representation. For a given real X, the LNS representation is given by
$$X = \pm r^{e} \tag{2}$$
where e is a signed fractional number and r is, again, the radix. In a system having a known and unvarying radix r (e.g. where r=2), the exponent e alone completely represents the number X. In the nomenclature employed herein, for an input operand X the exponent representing the operand in LNS format is $e'_x$; for an input operand Y the exponent representing the operand in LNS format is $e'_y$; and the result S of an LNS arithmetic operation is represented in LNS format as $e'_s$.
A significant advantage of the logarithmic number system is that arithmetic operations can be implemented so as to be very fast and, moreover, require a constant time regardless of the data.
Various techniques for performing arithmetic operations in LNS format are described in detail in the above-referenced literature references. However, for a better understanding of the present disclosure, LNS multiplication and addition are briefly summarized below.
LNS multiplication is nearly trivial, and requires only the addition of the exponents representing the two numbers to be multiplied. Thus, where the number X is represented in LNS format by the exponent $e'_x$, and the number Y is represented in LNS format by the exponent $e'_y$, the exponent $e'_s$ in LNS format representing the product of X and Y is the following:
$$e'_s = e'_x + e'_y, \quad \text{for multiplication.} \tag{3}$$
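As a minimal sketch of Equation (3), the following Python fragment encodes a nonzero real as a sign plus a fixed-point base-2 exponent and multiplies by adding exponents. The number of fractional bits F and the function names are hypothetical choices for illustration; zero has no finite logarithm and is not handled here.

```python
import math

F = 8  # hypothetical number of fractional bits in the exponent


def lns_encode(x):
    """Return (sign, e) with e = log2|x| scaled by 2**F and rounded."""
    sign = -1 if x < 0 else 1
    return sign, round(math.log2(abs(x)) * (1 << F))


def lns_decode(sign, e):
    """Convert a (sign, scaled exponent) pair back to a float."""
    return sign * 2.0 ** (e / (1 << F))


def lns_mul(a, b):
    """Equation (3): the product's exponent is the sum of the exponents."""
    return a[0] * b[0], a[1] + b[1]
```

For instance, lns_decode(*lns_mul(lns_encode(4.0), lns_encode(-8.0))) evaluates to -32.0; the only arithmetic performed on the exponents is a single integer addition.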
As derived in the literature references identified above, addition (as well as subtraction) can be performed based on an extension of multiplication.
In particular, the LNS representation $e'_s$ of the sum of two numbers X and Y represented in LNS format by the exponents $e'_x$ and $e'_y$ is as follows:
$$e'_s = e'_x + \theta(e'_x - e'_y), \quad \text{for addition} \tag{4}$$
where the order of operands is arranged such that $e'_x \le e'_y$, and the function $\theta$ is given by:
$$\theta(e'_x - e'_y) = \log_r\left(1 + r^{(e'_y - e'_x)}\right). \tag{5}$$
By letting
$$v = e'_x - e'_y, \quad v \le 0 \tag{6}$$
the above equation (5) can be simplified to:
$$\theta(v) = \log_r\left(1 + r^{-v}\right) \tag{7}$$
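Equations (4) through (7) follow from factoring the sum of two same-sign numbers as r^(e'x) + r^(e'y) = r^(e'x) * (1 + r^(e'y - e'x)) and taking the base-r logarithm. A minimal Python sketch, assuming two positive operands and evaluating the function directly in floating point rather than by table, might look like this; the function names are hypothetical.

```python
import math


def theta(v, r=2.0):
    """Equation (7): theta(v) = log_r(1 + r**(-v)), with v <= 0."""
    return math.log(1.0 + r ** (-v), r)


def lns_add(ex, ey, r=2.0):
    """Equation (4) for two positive operands given by their exponents."""
    # Order the operands so that ex <= ey, as the text requires.
    if ex > ey:
        ex, ey = ey, ex
    v = ex - ey  # Equation (6); v <= 0 after ordering
    return ex + theta(v, r)
```

With r = 2, lns_add(1.0, 3.0) returns approximately log2(2 + 8) = log2(10), the exponent of the sum of 2 and 8.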
From the above Equations (4)-(7) it might at first appear that addition (and also subtraction) of numbers in LNS representation would be rather difficult, involving the calculation of base 2 (i.e. binary) logarithms and exponential functions. However, as is pointed out in the literature, in binary digital computer and specialized processor implementations, the function $\theta$ can be implemented quite simply by employing a look-up table in read-only memory (ROM). While calculation time is of course required to generate the look-up table in the first place, once it has been generated, the time during operation to determine the function value is simply the memory access time, typically expressed in nanoseconds. Thus, if the value v in Equation (7) is represented as an eight-bit number, then the look-up table need have only $2^{8} = 256$ entries. As another example, if the value v is expressed with twelve-bit precision, then a look-up table with $2^{12} = 4096$ entries is required.
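The look-up table approach can be sketched in Python as follows, with a list standing in for the ROM. The split of the eight-bit value into four integer and four fractional bits (F = 4), the names THETA_ROM and lns_add_rom, and the treatment of differences beyond the table range are assumptions made for illustration only. Operands are taken to be positive.

```python
import math

ADDR_BITS = 8  # width of v, matching the 256-entry example in the text
F = 4          # hypothetical: 4 fractional bits, leaving 4 integer bits

# Table contents computed once, as a ROM would be programmed in advance.
# Entry k holds theta(v) for v = -k / 2**F, scaled by 2**F and rounded.
THETA_ROM = [
    round(math.log2(1.0 + 2.0 ** (k / (1 << F))) * (1 << F))
    for k in range(1 << ADDR_BITS)
]


def lns_add_rom(ex, ey):
    """Add two positive LNS operands given as exponents scaled by 2**F."""
    if ex > ey:
        ex, ey = ey, ex          # ensure ex <= ey
    k = ey - ex                  # k = -v, a non-negative table address
    if k >= len(THETA_ROM):
        # Assumption: beyond the table range theta(v) rounds to -v,
        # so the result is simply the larger exponent.
        return ey
    return ex + THETA_ROM[k]     # one table read plus one addition
```

Here lns_add_rom(48, 48) returns 64, that is, exponents 3.0 + 3.0 give 4.0 (8 + 8 = 16), and the work per addition is the same regardless of the operand values.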
Although LNS arithmetic is advantageous in terms of speed, which, significantly, is constant regardless of the data operated on, a significant disadvantage of LNS arithmetic is that the resolution or precision varies substantially throughout the range of absolute magnitudes. In short, the precision is a logarithmic function of the magnitude. Thus, the above-referenced E. E. Swartzlander, Jr. and A. G. Alexopoulos article entitled "The Sign/Logarithm Number System" emphasizes that "this system can not replace conventional arithmetic units in general purpose computers; rather it is intended to enhance the implementation of special purpose processors for specialized applications (e.g., pattern recognition, digital image enhancement, radar processing, speech filtering, etc.)." Similarly, the above-referenced S. C. Lee and A. D. Edgar article entitled "The Focus Number System" emphasizes that available resolution is "focused" near zero. The LNS is described by Lee and Edgar as being particularly useful in digital control systems which should respond qualitatively strongly to gross errors between the output and the control signal, and quantitatively delicately as equilibrium is approached.
More recently, and directed also to the difficulty of maintaining a sufficiently high degree of precision over a wide dynamic range with LNS arithmetic, the present inventor has described a linear interpolation technique for extending the precision of an LNS arithmetic unit. See F. J. Taylor, "An Extended Precision Logarithmic Number System", IEEE Transactions On Acoustics, Speech, and Signal Processing, Vol. ASSP-31 Number 1, February 1983, pages 232-234.