The present invention relates generally to computing and digital signal processing, and more particularly to pipelined logarithmic arithmetic in an arithmetic logic unit (ALU).
ALUs have traditionally been used to implement various arithmetic functions, such as addition, subtraction, multiplication, division, etc., on real and/or complex numbers. Conventional systems use either fixed-point or floating-point number ALUs. ALUs using real logarithmetic of limited precision are also known. For example, see “Digital filtering using logarithmic arithmetic” (N. G. Kingsbury and P. J. W. Rayner, Electron. Lett. (Jan. 28, 1971), Vol. 7, No. 2, pp. 56-58). “Arithmetic on the European Logarithmic Microprocessor” (J. N. Coleman, E. I. Chester, C. I. Softley and J. Kadlec, (July 2000) IEEE Trans. Comput., Vol. 49, No. 7, pp. 702-715) provides another example of a high precision (32-bit) logarithmetic unit for real numbers.
Fixed-point programming presents the programmer with the onus of mentally keeping track of the location of the decimal point, particularly after multiplication or division operations. For example, suppose an FIR filter involves weighted addition of signal samples using weighting factors of −0.607, 1.035, −0.607 . . . , which must be specified to an accuracy of 1 part in 1000. In fixed-point arithmetic, it is necessary to represent 1.035 by 1035, for example. As a result, multiplication of a signal sample by this number expands the wordlength of the result by 10 bits. In order to store the result in the same memory wordlength, it is then necessary to discard 10 bits; however, whether it is the MSBs (most significant bits), the LSBs (least significant bits), or some of each that should be discarded depends on the signal data spectrum, and so must be determined by simulation using realistic data. This makes verification of correct programming laborious.
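By way of illustration only, the wordlength growth described above may be sketched as follows. The names SCALE, WEIGHTS, bit_growth, weighted_sum and discard_lsbs are hypothetical and are not taken from any cited reference; the sketch assumes integer samples and simply drops LSBs, which is only one of the possible choices discussed above.

```python
# Illustrative only: integer (fixed-point) weighting with a decimal scale of 1000.
SCALE = 1000                     # 1.035 is stored as 1035, as in the text
WEIGHTS = [-607, 1035, -607]     # -0.607, 1.035, -0.607 scaled by SCALE


def bit_growth(sample, weight):
    """Extra magnitude bits in sample * weight compared with sample alone."""
    return abs(sample * weight).bit_length() - abs(sample).bit_length()


def weighted_sum(samples):
    """Full-precision weighted sum; the result is still scaled by SCALE."""
    return sum(w * x for w, x in zip(WEIGHTS, samples))


def discard_lsbs(value, nbits=10):
    """Return toward the original wordlength by dropping low-order bits.

    Assumption: shifting by 10 bits divides by 1024, not by SCALE (1000),
    so a small gain error remains that the programmer must account for.
    """
    return value >> nbits


print(bit_growth(1000, 1035))    # 10
print(bit_growth(32767, 1035))   # 11
```

The growth is about 10 bits because 1000 is close to 2^10; the exact figure depends on the sample value, which is part of what makes the bookkeeping laborious.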
Floating-point processors were introduced to circumvent the inconvenience of mentally keeping track of the point by automatically keeping track of the point with the aid of an “exponent” part associated with the “mantissa” part of each stored number. The IEEE standard 32-bit floating-point format is:
        SEEEEEEEE.MMMMMMMMMMMMMMMMMMMMMMM,
where S is the sign of the value (0=+; 1=−), EEEEEEEE is the 8-bit exponent, and MMM . . . MM is the 23-bit mantissa. With the IEEE standard floating-point format, the most significant (24th) bit of the mantissa is always 1 (except for true zero), and is therefore omitted. In the IEEE format, the actual value of the mantissa is thus:
        1.MMMMMMMMMMMMMMMMMMMMMMM.
For example, the number −1.40625×10^−2 = −1.8×2^−7 may be represented in the IEEE standard format as:
        1 01111000.11001100110011001100110.
Further, the zero exponent is 01111111, and thus the number +1.0 may be written as:
        0 01111111.00000000000000000000000.
Representing true zero would require a negatively infinite exponent, which is not practical, so an artificial zero is created by interpreting the all-zeros bit pattern to be true zero instead of 2^−127.
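The field layout above may be checked with a short sketch such as the following, which uses Python's standard struct module to obtain the 32-bit pattern of a value. The function names ieee754_fields and format_fields are hypothetical and chosen for this illustration only.

```python
import struct


def ieee754_fields(x):
    """Split the 32-bit IEEE pattern of x into (sign, exponent, mantissa)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, offset by 127
    mantissa = bits & 0x7FFFFF        # 23 stored bits; the leading 1 is implied
    return sign, exponent, mantissa


def format_fields(x):
    """Render x in the 'S EEEEEEEE.MMM...M' layout used in the text."""
    s, e, m = ieee754_fields(x)
    return f"{s} {e:08b}.{m:023b}"


print(format_fields(-0.0140625))  # 1 01111000.11001100110011001100110
print(format_fields(1.0))         # 0 01111111.00000000000000000000000
print(format_fields(0.0))         # 0 00000000.00000000000000000000000
```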
To multiply two floating-point numbers, the mantissas, with their suppressed MSB 1's replaced, are multiplied using a fixed-point 24×24-bit multiplier, which is logic of moderately high complexity and delay, while the exponents are added and one of the two offsets of 127 is subtracted. The 48-bit result of the multiplication must then be truncated to 24 bits and the most significant 1 deleted after left-justification. Multiplication is thus even more complicated for floating-point than for fixed-point numbers.
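A minimal sketch of these multiplication steps, operating on the integer fields of the 32-bit format, is given below. It is illustrative only: the names unpack_fields, pack_fields and fp_multiply are hypothetical, the product is truncated rather than rounded, and zero, denormalized values, infinities and exponent overflow are assumed not to occur.

```python
import struct

BIAS = 127  # exponent offset of the 32-bit format


def unpack_fields(x):
    """Return (sign, biased exponent, 23-bit stored mantissa) of x."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def pack_fields(sign, exponent, mantissa):
    """Assemble the three fields back into a floating-point value."""
    bits = (sign << 31) | (exponent << 23) | mantissa
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x


def fp_multiply(a, b):
    sa, ea, fa = unpack_fields(a)
    sb, eb, fb = unpack_fields(b)
    ma = fa | (1 << 23)          # replace the suppressed leading 1
    mb = fb | (1 << 23)
    product = ma * mb            # 24 x 24 -> up to 48 bits
    exponent = ea + eb - BIAS    # add exponents, subtract one offset
    if product & (1 << 47):      # product in [2, 4): keep top 24 bits, bump exponent
        product >>= 24
        exponent += 1
    else:                        # product in [1, 2): keep the next 24 bits
        product >>= 23
    # Delete the most significant 1 before storing the 23-bit mantissa.
    return pack_fields(sa ^ sb, exponent, product & 0x7FFFFF)


print(fp_multiply(1.5, 1.5))    # 2.25
print(fp_multiply(-3.0, 0.5))   # -1.5
```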
To add two floating-point numbers, their exponents must first be subtracted to see if their points are aligned. If the points are not aligned, the smaller number is right-shifted by a number of binary places equal to the exponent difference to align the points before adding the mantissas, with their implied 1's replaced. To perform the shifting fast, a barrel shifter may be used, which is similar in structure and complexity to a fixed-point multiplier. After adding, and more particularly after subtracting, leading zeros must be left-shifted out of the mantissa while decrementing the exponent accordingly. Thus addition and subtraction are also complicated operations in floating-point arithmetic.
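The alignment and renormalization steps may be sketched as follows, again on the integer fields of the 32-bit format. This is an illustration under simplifying assumptions, not a complete adder: the names unpack_fields, pack_fields and fp_add are hypothetical, bits shifted out during alignment are simply discarded, and zero or denormalized inputs, infinities and exponent overflow are not handled.

```python
import struct


def unpack_fields(x):
    """Return (sign, biased exponent, 23-bit stored mantissa) of x."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def pack_fields(sign, exponent, mantissa):
    """Assemble the three fields back into a floating-point value."""
    bits = (sign << 31) | (exponent << 23) | mantissa
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x


def fp_add(a, b):
    sa, ea, fa = unpack_fields(a)
    sb, eb, fb = unpack_fields(b)
    ma = fa | (1 << 23)              # replace the implied leading 1
    mb = fb | (1 << 23)
    if (ea, ma) < (eb, mb):          # make (sa, ea, ma) the larger magnitude
        sa, ea, ma, sb, eb, mb = sb, eb, mb, sa, ea, ma
    mb >>= ea - eb                   # align the points (the barrel-shifter step)
    m = ma + mb if sa == sb else ma - mb
    if m == 0:
        return 0.0
    exponent = ea
    if m & (1 << 24):                # carry out of an addition
        m >>= 1
        exponent += 1
    while not m & (1 << 23):         # leading zeros after a subtraction:
        m <<= 1                      # each left shift lowers the exponent by one
        exponent -= 1
    return pack_fields(sa, exponent, m & 0x7FFFFF)


print(fp_add(1.5, 0.25))    # 1.75
print(fp_add(1.0, -0.75))   # 0.25
```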
In purely linear format, additions and subtractions with fixed-point numbers are simple, while multiplications, divisions, squares, and square roots are more complicated. Multipliers are constructed as a sequence of “shift and conditionally add” circuits that have inherently a large number of logic delays. Fast processors may use pipelining to overcome this delay, but this typically complicates programming. It is therefore of interest to minimize the pipelining delay in a fast processor.
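For illustration, a “shift and conditionally add” multiplier may be modeled in software as below; each loop iteration corresponds to one stage of logic, which suggests why the delay grows with the wordlength. The function name shift_add_multiply and the default width of 16 bits are assumptions made for this sketch, which handles unsigned operands only.

```python
def shift_add_multiply(a, b, width=16):
    """Multiply unsigned integers a and b, where b fits in `width` bits."""
    acc = 0
    for i in range(width):
        if (b >> i) & 1:      # conditionally add ...
            acc += a << i     # ... a copy of a shifted by the bit position
    return acc


print(shift_add_multiply(1035, 1000))  # 1035000
```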
It should be noted that the floating-point number representation is a hybrid between logarithmic and linear representation. The exponent is the whole part of the base-2 logarithm of the number, while the mantissa is a linear fractional part. Multiplication is complicated for linear representations and addition is complicated for logarithmic representations, which explains why both are complicated for the hybrid floating-point representation. To overcome this, some known systems, such as those cited above, have used a purely logarithmic representation. This solves the problem of keeping track of the point and simplifies multiplication, leaving only addition complicated. The logarithmic additions were performed in the prior art using look-up tables. However, limitations on the size of the tables restricted this solution to limited word length, for example to the 0-24 bit range. In the above reference to Coleman, 32-bit precision was achieved with reasonably sized look-up tables using an interpolation technique that requires a multiplier. As such, the Coleman process still includes the complexities associated with multiplication.
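The asymmetry between multiplication and addition in a purely logarithmic representation may be illustrated with the sketch below, which is restricted to positive real numbers. The names to_log, from_log, log_multiply and log_add are hypothetical. The correction term log2(1 + 2^−r) is computed directly with the math library here as a stand-in; in the prior-art systems described above, that function is what the look-up tables supply.

```python
import math


def to_log(x):
    """Base-2 logarithmic representation of a positive number."""
    return math.log2(x)


def from_log(lx):
    return 2.0 ** lx


def log_multiply(la, lb):
    """Multiplication is a single fixed-point style addition of logarithms."""
    return la + lb


def log_add(la, lb):
    """Addition needs log2(1 + 2**-r), where r is the difference of the logs.

    log2(a + b) = log2(a) + log2(1 + b/a) for a >= b > 0.
    """
    hi, lo = max(la, lb), min(la, lb)
    r = hi - lo
    return hi + math.log2(1.0 + 2.0 ** (-r))  # stand-in for a table look-up


la, lb = to_log(3.0), to_log(5.0)
print(from_log(log_multiply(la, lb)))  # approximately 15
print(from_log(log_add(la, lb)))       # approximately 8
```

Because a table indexed directly by r grows exponentially with the number of bits in r, a direct look-up of this function becomes impractical at long wordlengths, which is the limitation noted above.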
A different technique for extending precision while retaining reasonably sized look-up tables without requiring a multiplier was described for real arithmetic in U.S. Pat. No. 5,944,774 to the current Applicant, which is hereby incorporated by reference herein. However, a method and apparatus for complex arithmetic, which is typically useful in radio signal processing, is also required, as is a method with both real and complex processing capabilities, because both are usually required in common applications such as wireless communication devices. U.S. patent application Ser. No. 11/142,760, filed concurrently with this application, addresses this problem and is incorporated by reference herein in its entirety. Further, a method and apparatus that implements a multi-stage pipeline may be useful for increasing throughput speed while implementing complex and/or real arithmetic processes.