1. Field of the Invention
The invention relates to the processing of digital signals and, in particular, to methods and apparatus for improving the performance of multi-stage digital signal processing having a finite precision.
2. Description of Background Art
Sophisticated algorithms for processing blocks of digital signals to improve system performance are widely applied in communications systems, image/sound/video processing systems, and storage systems. For example, in some communications systems, digital data is modulated and processed through several stages of interpolation/decimation finite/infinite impulse-response filters, e.g., at both the transmitter and the receiver. Several digital signal processing algorithms (or a sequence of such algorithms) are executed in multiple stages, where each stage employs finite word lengths for its input and output. The use of multiple stages provides increased efficiency in speed, power consumption, memory usage, and cost.
Various xDSL systems, such as asymmetric DSL, use discrete multi-tone (DMT) modulation to modulate digital data over a transmission medium, such as carrier lines. By applying an Inverse Fast Fourier Transform (IFFT) to the data signal, DMT provides for efficient frequency division multiplexing of the digital data. The output of the IFFT is further treated by multi-rate digital filters before the interface to an analog front-end. The analog front-end transforms the final digital data into a continuous-time waveform suitable for transmission over the transmission medium. At the receiver, the received continuous-time waveform is sampled and digitized. The digitized signal is further processed through several multi-rate digital filters and demodulated by a Fast Fourier Transform (FFT) before being passed to an estimation device. Typically, the IFFT and FFT are performed in multiple stages by algorithms such as a decimation-in-time (DIT) algorithm or a decimation-in-frequency (DIF) algorithm.
FIG. 1 illustrates a general representation of multi-stage processing of digital signals. The multi-stage processing comprises a series of stage processes 100, in which a set of digital data is transformed from a set of input samples 110 into a set of output samples 120. The N stage processes 100 typically include multiple stages of several constituent processing blocks (e.g., FFT, IFFT, multi-stage interpolation/decimation filters). The input samples 110 and output samples 120 comprise digital data expressed as a set of binary numbers.
One or more memory segments 130 may be provided for each stage process 100, coupling adjacent stage processes 100 so that the digital signal can be communicated from one stage to the next. FIG. 1 illustrates the flow of the digital signal as the data are stored in and then retrieved from each memory segment 130. For example, in a given stage 100(n−1), several samples from the input block may be arithmetically combined, multiplied with coefficients, or otherwise manipulated, and further combined to produce a block of output samples 120(n−1). This block of output samples 120(n−1) serves as the input samples 110(n) to the next stage for similar processing, and so on. In this way, the output samples 120(n−1) of an (n−1)th stage are used as the input samples 110(n) of an nth stage, and the output samples 120(n) of the nth stage are used as the input samples 110(n+1) of an (n+1)th stage. For the last stage N, an (N+1)th memory segment 130(N+1) may also be provided for storing the output samples 120(N).
At each stage n, the output samples 120(n) are represented by a finite word width, b(n+1), which is determined by the word width allocated to each memory segment 130(n+1) of the (n+1)th stage. In many cases, the same memory can be used to store both the input and output of a stage, as is typically done in IFFT/FFT implementations. The output samples 120(n) of the nth stage are stored in the memory segments 130(n+1) for the (n+1)th stage. Accordingly, the word width b(n+1) in the memory segments 130(n+1) for the (n+1)th stage determines the word width allocated for storing the output samples of the nth stage. Moreover, this word width b(n+1) corresponds to the word width for the input samples of the (n+1)th stage.
The arithmetic operations in an arbitrary nth stage of processing may require a larger word width than that allocated for the output samples in order to represent those samples in the same dynamic range. In conventional processing, when this happens, the most significant bits are retained at the output of the stage while the least significant bits are lost. This “rounding” of the binary numbers leads to a loss of precision. If the number of stages is large, as it is in a typical digital signal processing application, the aggregate effect of the lost least significant bits can lead to a substantial error in the digital signal due to the finite precision of the processing. This error decreases the signal-to-noise ratio (SNR), which reduces the ability of the system to transmit data and, ultimately, the data transfer rate. Generally, a low SNR causes frequent errors that necessitate retransmission of data, thereby decreasing system efficiency and the overall transfer rate.
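By way of illustration only, the accumulation of error from discarded least significant bits can be sketched as follows. The sketch assumes a hypothetical stage operation (multiplication by a fixed coefficient) and a hypothetical fixed-point format with `frac_bits` fractional bits; the function name `truncation_errors` and all parameter names are illustrative and do not appear in the description above.

```python
from fractions import Fraction


def truncation_errors(code, coeff, frac_bits, n_stages):
    """Return the error (exact value minus stored value) after each stage.

    code      -- integer sample; it represents code * 2**-frac_bits (assumed format)
    coeff     -- Fraction applied at every stage (hypothetical stage operation)
    frac_bits -- number of fractional bits kept in memory between stages
    n_stages  -- number of stages to run
    """
    lsb = Fraction(1, 1 << frac_bits)
    exact = code * lsb  # infinite-precision reference value
    errors = []
    for _ in range(n_stages):
        exact *= coeff
        # The full-precision product needs more bits than memory provides;
        # floor division discards the bits below the stored LSB.
        code = (code * coeff.numerator) // coeff.denominator
        errors.append(exact - code * lsb)
    return errors


if __name__ == "__main__":
    lsb = Fraction(1, 16)
    errs = truncation_errors(13, Fraction(3, 4), frac_bits=4, n_stages=4)
    for stage, err in enumerate(errs, start=1):
        print(f"stage {stage}: error = {float(err / lsb):.4f} LSB")
```

In this sketch each stage contributes less than one LSB of new error, and the error carried over from earlier stages is scaled by the coefficient, so the total after several stages can exceed one LSB even though no single truncation does.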
A signal value can be represented digitally by a finite number of bits in various dynamic ranges. In two's complement notation, for example, four binary bits represent a signal value of “6” as “0110” in the dynamic range [−8,7], or “0011” in the dynamic range [−16,14]. However, the signal value of “6” cannot be represented in the dynamic range [−4,3.5] without losing a significant digit (i.e., the leftmost “1” other than the sign bit). In another example, using four bits and two's complement notation, the signal value “7” cannot be represented in the dynamic range [−16, 14] without a loss of precision. Representing the signal value “7” in this case requires that the value be rounded up to 8 (expressed as “0100”) or down to 6 (expressed as “0011”). It is thus apparent that precision decreases as the dynamic range in which a signal value is represented increases (assuming there are not enough bits to represent the specified dynamic range).
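The four-bit examples above can be reproduced with a short sketch. It assumes that each dynamic range corresponds to a uniform step size (1 for [−8,7], 2 for [−16,14], 0.5 for [−4,3.5]); the function `encode` and its parameters are hypothetical names introduced for illustration.

```python
import math


def encode(value, bits, step, round_up=False):
    """Encode `value` as a two's complement word of `bits` bits.

    `step` is the weight of the least significant bit, so the dynamic range
    is [-2**(bits-1) * step, (2**(bits-1) - 1) * step].
    Returns (bit pattern, value actually represented).
    Raises OverflowError if the value falls outside the dynamic range.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    scaled = value / step
    code = math.ceil(scaled) if round_up else math.floor(scaled)
    if not lo <= code <= hi:
        raise OverflowError(f"{value} does not fit in the range with step {step}")
    pattern = format(code & ((1 << bits) - 1), f"0{bits}b")
    return pattern, code * step


if __name__ == "__main__":
    print(encode(6, 4, 1))                  # 6 in the range [-8, 7]
    print(encode(6, 4, 2))                  # 6 in the range [-16, 14]
    print(encode(7, 4, 2))                  # 7 rounded down in [-16, 14]
    print(encode(7, 4, 2, round_up=True))   # 7 rounded up in [-16, 14]
    try:
        encode(6, 4, 0.5)                   # 6 lies outside [-4, 3.5]
    except OverflowError as exc:
        print("overflow:", exc)
```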
At a particular stage of the processing, the output samples of the arithmetic computations typically need to be stored in memory, which is often constrained to a certain number of bits. The signal processors may be able to express the output samples with a greater number of bits than are allocated for the samples in memory, so the output samples often must be truncated before being stored. A conventional approach is to represent the output signal in the largest dynamic range that the output samples could possibly occupy, without considering the actual value of the output samples. From the signal processing perspective, as demonstrated above, it is desirable to represent the output signal value in the smallest dynamic range possible without losing significant digits. This maximizes precision and thereby increases the overall SNR in the system by minimizing the quantization noise resulting from the loss of least significant bits. For example, if the digital output at a particular stage of processing has five bits, b4b3b2b1b0, but only four bits can be stored in the memory, the conventional approach is to take bits b4b3b2b1 regardless of the value of the output signal. However, it is advantageous to keep bits b3b2b1b0, provided that there is no loss of significant digits, i.e., bit b4 is not significant, such as a leading “0” bit or a sign extension.
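The five-bit example above can be sketched as follows. This is an illustration under stated assumptions rather than a description of any particular implementation: the words are taken to be two's complement, bit b4 is treated as insignificant when it equals b3 (a sign extension or a leading zero), and the number of discarded low bits (`shift`) is assumed to be tracked alongside the stored word. All function names are hypothetical.

```python
def to_signed(pattern, bits):
    """Interpret an unsigned bit pattern as a two's complement integer."""
    if pattern & (1 << (bits - 1)):
        return pattern - (1 << bits)
    return pattern


def keep_upper(word5):
    """Conventional choice: always store b4b3b2b1 and drop b0."""
    return (word5 >> 1) & 0xF, 1  # (stored 4-bit word, bits dropped)


def keep_lower_if_safe(word5):
    """Store b3b2b1b0 when b4 carries no information, else fall back."""
    b4 = (word5 >> 4) & 1
    b3 = (word5 >> 3) & 1
    if b4 == b3:  # b4 merely repeats the sign bit b3
        return word5 & 0xF, 0
    return (word5 >> 1) & 0xF, 1


def restore(stored, shift):
    """Recover the value represented by a stored 4-bit word."""
    return to_signed(stored, 4) << shift


if __name__ == "__main__":
    for word in (0b00101, 0b11101, 0b01100):
        value = to_signed(word, 5)
        conventional = restore(*keep_upper(word))
        selective = restore(*keep_lower_if_safe(word))
        print(f"{word:05b} (value {value}): "
              f"conventional -> {conventional}, selective -> {selective}")
```

For the word 00101 (value 5), dropping b0 leaves a stored value of 4, whereas keeping b3b2b1b0 preserves 5 exactly. For 01100 (value 12), b4 differs from b3, so the most significant bits must be retained in either case.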
What is needed, therefore, are techniques for increasing the precision of digital signal processing by preserving the least significant bits in the output samples of a multi-stage digital signal processing block having finite word widths, while avoiding the loss of most significant bits.