Many digital data processors, including most DSPs and multimedia processors, use binary fixed-point arithmetic, in which operations are performed on integers, fractions, or mixed numbers in unsigned or two's complement binary format. DSP and multimedia applications often require that the processor be configured to perform both saturating arithmetic and wrap-around arithmetic on binary numbers.
In saturating arithmetic, computation results that are too large to be represented in a specified number format are saturated to the most positive or most negative number. When a result is too large to represent, overflow occurs. For example, in a decimal number system with 3-digit unsigned numbers, the addition 733+444 produces a saturated result of 999, since the true result of 1177 cannot be represented with just three decimal digits. The saturated result, 999, corresponds to the most positive number that can be represented with three decimal digits. Saturation is useful because it reduces the errors that occur when results cannot be correctly represented, and it preserves sign information.
In wrap-around arithmetic, results that overflow are wrapped around, such that any digits that cannot fit into the specified number representation are simply discarded. For example, in a decimal number system with 3-digit unsigned numbers, the addition 733+444 produces a wrap-around result of 177. Since the true result of 1177 is too large to represent, the leading 1 is discarded and a result of 177 is produced. Wrap-around arithmetic is useful because, if the true final result of several wrap-around operations can be represented in the specified format, the final result will be correct, even if intermediate operations overflow.
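By way of illustration only, the 3-digit unsigned decimal example above can be sketched in Python as follows. The function name `add_unsigned_decimal` and its parameters are hypothetical and are introduced here solely for the purpose of the example.

```python
def add_unsigned_decimal(a, b, digits=3, saturate=True):
    """Add two unsigned decimal numbers in a format limited to `digits` digits.

    Hypothetical helper for illustration: `saturate` selects between
    saturating and wrap-around handling of overflow.
    """
    limit = 10 ** digits           # 1000 for a 3-digit format
    true_sum = a + b
    if true_sum < limit:
        return true_sum            # no overflow, result is exact
    if saturate:
        return limit - 1           # clamp to the most positive number (999)
    return true_sum % limit        # discard the digits that do not fit


print(add_unsigned_decimal(733, 444, saturate=True))   # 999
print(add_unsigned_decimal(733, 444, saturate=False))  # 177
```

The same true result of 1177 thus yields 999 under saturation and 177 under wrap-around.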
As indicated above, saturating arithmetic and wrap-around arithmetic are often utilized in binary number systems. For example, in a two's complement fractional number system with 4-bit numbers, the two's complement addition 0.101+0.100 (0.625+0.500) produces a saturated result of 0.111 (0.875), which corresponds to the most positive two's complement number that can be represented with four bits. If wrap-around arithmetic is used, the same addition 0.101+0.100 (0.625+0.500) produces the result 1.001 (−0.875).
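The 4-bit two's complement fractional example may be sketched as follows. This is a minimal illustration which assumes that each number is held as a signed integer scaled by 8 (one sign bit and three fractional bits). The helper names `add_fractional`, `to_value` and `to_bits` are hypothetical.

```python
def add_fractional(a_raw, b_raw, saturate, bits=4):
    """Add two two's complement numbers held as signed raw integers."""
    lo = -(1 << (bits - 1))        # most negative raw value (-8 for 4 bits)
    hi = (1 << (bits - 1)) - 1     # most positive raw value (7 for 4 bits)
    total = a_raw + b_raw
    if saturate:
        return max(lo, min(hi, total))       # clamp into [lo, hi]
    return (total - lo) % (1 << bits) + lo   # keep only the low `bits` bits


def to_value(raw, bits=4):
    """Interpret a raw integer as a fraction with bits-1 fractional bits."""
    return raw / (1 << (bits - 1))


def to_bits(raw, bits=4):
    """Render a raw integer as a sign bit, a binary point and fraction bits."""
    pattern = format(raw & ((1 << bits) - 1), "0{}b".format(bits))
    return pattern[0] + "." + pattern[1:]


a = 0b0101  # 0.101 = 0.625
b = 0b0100  # 0.100 = 0.500

sat = add_fractional(a, b, saturate=True)
wrap = add_fractional(a, b, saturate=False)
print(to_bits(sat), to_value(sat))    # 0.111 0.875
print(to_bits(wrap), to_value(wrap))  # 1.001 -0.875
```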
Additional details regarding these and other conventional aspects of digital data processor arithmetic can be found in, for example, B. Parhami, “Computer Arithmetic: Algorithms and Hardware Designs,” Oxford University Press, New York, 2000 (ISBN 0-19-512583-5), which is incorporated by reference herein.
Many digital signal processing and multimedia applications require the functionality of both saturating arithmetic and wrap-around arithmetic. However, many conventional techniques are unable to provide an efficient mechanism for controllable selection of saturating or wrap-around arithmetic.
It may also be desirable in many applications to configure a given DSP, multimedia processor or other type of digital data processor for the computation of dot products. The dot product of two k-element vectors X=[X[1], X[2], . . . , X[k−1], X[k]] and Y=[Y[1], Y[2], . . . , Y[k−1], Y[k]] is given by Z=X[1]*Y[1]+X[2]*Y[2]+ . . . +X[k−1]*Y[k−1]+X[k]*Y[k]. Thus, a k-element dot product requires k multiplications and (k−1) additions. Such dot products frequently occur in digital signal processing and multimedia applications.
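The operation count stated above can be checked with a short sketch. The function `dot_product_with_counts` is a hypothetical name, and the example uses ordinary Python integers with no overflow handling; it is intended only to show the k multiplications and (k−1) additions.

```python
def dot_product_with_counts(x, y):
    """Return (Z, multiplications, additions) for two equal-length vectors."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    acc = x[0] * y[0]              # first product needs no addition
    mults, adds = 1, 0
    for xi, yi in zip(x[1:], y[1:]):
        acc += xi * yi             # one multiplication and one addition
        mults += 1
        adds += 1
    return acc, mults, adds


z, mults, adds = dot_product_with_counts([1, 2, 3], [4, 5, 6])
print(z, mults, adds)  # 32 3 2
```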
By way of example, second and third generation cellular telephones that support GSM (Global System for Mobile communications) or EDGE (Enhanced Data rates for Global Evolution) standards make extensive use of dot products, usually with saturation after each addition and each multiplication. These standards generally require that the final results of a given dot product computation be identical (i.e., bit-exact) to the results that would be obtained when operations are performed serially, with saturation after each operation. Since saturating addition is not associative, the order in which the additions are performed can change the result, and the additions needed for the dot product are therefore typically performed in series, which adversely impacts processor performance.
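The non-associativity of saturating addition can be seen in a small sketch. The 16-bit signed word length and the operand values are assumptions chosen for illustration, and `sat_add16` is a hypothetical helper rather than an operation taken from any particular standard.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def sat_add16(a, b):
    """Add two integers and clamp the result to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


a, b, c = 30000, 10000, -20000

# Serial (left-to-right) order: the first addition overflows and saturates.
serial = sat_add16(sat_add16(a, b), c)

# Regrouped order: no intermediate result overflows.
regrouped = sat_add16(a, sat_add16(b, c))

print(serial)     # 12767
print(regrouped)  # 20000
```

Because the two groupings disagree, a bit-exact implementation cannot freely reorder saturating additions to perform them in parallel.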
Another problem with conventional techniques for dot product computation and other vector operations is that such techniques are not readily adaptable for use in a pipelined processor. For example, certain conventional techniques may be difficult to extend to pipelines with more than two pipeline stages, since doing so will generally result in a substantial increase in the required circuit area.
Furthermore, the conventional techniques generally fail to provide a suitably efficient mechanism for supporting both the addition of operands to an accumulator value and the subtraction of operands from an accumulator value.
Accordingly, techniques are needed which can provide improved computation of dot products and other types of vector operations with either saturating or wrap-around arithmetic in a digital data processor.