In many electronic applications, signals are represented and processed digitally. Digital words, or samples, represent the value of the signal at regular time intervals. The reciprocal of this sample interval is referred to as the sample rate, and is typically expressed in units of kilohertz (kHz). The signal thus represented can have no energy above half the sample rate; the frequency equal to half the sample rate is called the Nyquist frequency.
Practically, the signal thus represented must have its energy limited at some frequency below the Nyquist frequency. The band of frequencies which the signal is intended to contain is called the passband of the signal, and the upper frequency of the passband is called the passband edge. The band of frequencies between the passband edge and the Nyquist frequency is called the guardband. For example, for audio signals sampled at 48 kHz, the Nyquist frequency is 24 kHz, the passband edge is generally defined as 20 kHz, and the width of the guardband is 4 kHz.
There are situations when the available sample rate of the data is different from the desired sample rate. Depending on the characteristics of the sample data and how much the available and desired rates differ, several approaches may be used to convert the signal at one sample rate to a signal at another sample rate without substantially altering the meaning of the signal. A first common technique for sample rate reduction is called decimation. This technique is employed to reduce the sample rate of a signal by an integer divisor d. To reduce aliasing effects, any frequency components in the signal above the passband edge of the output sample rate are typically attenuated by applying a low pass filter to the incoming signal before reducing the sample rate. The rate reduction is performed by simply periodically discarding d−1 input samples of the signal, thus causing the output to consist of every dth filtered input sample.
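The decimation step described above can be sketched as follows. This is a minimal illustration, not a production design: the names `fir_filter` and `decimate` are hypothetical, and the default d-sample moving average is only an assumed stand-in for a properly designed low pass filter.

```python
def fir_filter(x, h):
    """Causal FIR convolution; the output has the same length as x."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def decimate(x, d, h=None):
    """Low-pass filter x, then keep one filtered sample out of every d."""
    if h is None:
        h = [1.0 / d] * d  # assumed stand-in: d-sample moving average
    filtered = fir_filter(x, h)
    # Skip the filter start-up transient, then discard d-1 of every d samples.
    return filtered[len(h) - 1::d]


print(decimate([1.0, 3.0, 5.0, 7.0, 2.0, 4.0], 2))  # [2.0, 6.0, 3.0]
```

A real converter would pass in coefficients `h` whose stopband begins at or below the passband image of the output rate; the moving average is used here only to keep the example short.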
A second common technique for sample rate conversion is known as up-sampling. This technique results in an increase of the sample rate of a signal by an integer factor u. The increased sample rate is achieved by augmenting the number of samples that represent the signal by inserting u−1 zero value samples between adjacent input samples. The resulting samples are typically filtered through an anti-imaging low pass filter at the higher output sample rate to remove frequency components above the original Nyquist frequency.
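As a rough sketch of the up-sampling operation just described, the following inserts u−1 zeros between input samples and then applies a caller-supplied anti-imaging filter. The function names are hypothetical, and the three-tap triangular filter in the example is an assumption chosen for brevity; it is a crude low pass filter that happens to reproduce linear interpolation for u = 2.

```python
def zero_stuff(x, u):
    """Insert u-1 zero-valued samples after each input sample."""
    out = []
    for s in x:
        out.append(s)
        out.extend([0.0] * (u - 1))
    return out


def upsample(x, u, h):
    """Zero-stuff by u, then apply the anti-imaging FIR h at the higher rate."""
    stuffed = zero_stuff(x, u)
    return [sum(h[k] * stuffed[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(stuffed))]


# Triangular kernel with gain u = 2 at DC (assumed example coefficients).
print(upsample([0.0, 2.0, 4.0], 2, [0.5, 1.0, 0.5]))
# [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]  (input delayed by one output sample)
```

The filter gain at DC is set to u so that the signal level is preserved after the inserted zeros dilute the sample energy.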
Typically, the decimation and up-sampling techniques are combined in a multi-stage converter to obtain a desired sample rate. The “classical” (prior art) multi-stage converter algorithm involves successive zero insertion, filtering, and decimating of a signal. For example, to change the sample rate of a signal sampled at rate R to a new rate R′=R*L/M, a number of zero value samples equal to L−1 are inserted between each sample of the signal (L and M are integers which can slowly change with time). This creates a signal at sample rate L*R. Lowpass filtering of the signal is applied that removes all frequencies above the lower of the Nyquist frequency associated with the original signal or the new Nyquist frequency associated with the signal at the converted sample rate. Then, the signal is decimated by the factor M by deleting all but every Mth sample, producing a new signal at rate R′.
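The three steps of the classical algorithm (zero insertion, lowpass filtering, decimation) can be sketched directly, if inefficiently, as below. The function names, the Hann-windowed sinc design, and the choice of 16·max(L, M)+1 taps are all assumptions made for illustration; the text does not prescribe a particular filter design.

```python
import math


def lowpass_fir(num_taps, cutoff, gain=1.0):
    """Hann-windowed sinc low pass; cutoff in cycles per sample (0 to 0.5)."""
    mid = (num_taps - 1) / 2.0
    h = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2.0 * cutoff if t == 0 else math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 1) / (num_taps + 1))
        h.append(ideal * window)
    scale = gain / sum(h)  # normalize the DC gain
    return [c * scale for c in h]


def resample_classical(x, L, M):
    """Convert rate R to R*L/M by zero insertion, filtering, and decimation."""
    # Step 1: insert L-1 zeros after each sample (rate becomes L*R).
    stuffed = []
    for s in x:
        stuffed.append(s)
        stuffed.extend([0.0] * (L - 1))
    # Step 2: low pass at the lower of the two Nyquist frequencies,
    # expressed in cycles per sample at rate L*R. Gain L restores the level.
    h = lowpass_fir(16 * max(L, M) + 1, 0.5 / max(L, M), gain=L)
    filtered = [sum(h[k] * stuffed[n - k] for k in range(len(h)) if n - k >= 0)
                for n in range(len(stuffed))]
    # Step 3: keep every M-th sample (rate becomes R*L/M).
    return filtered[::M]


y = resample_classical([1.0] * 40, 3, 2)
print(len(y))  # 60
```

This direct form multiplies by the inserted zeros and computes samples that are then discarded, so it is useful only as a reference for the structure of the algorithm.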
The computational complexity of the aforementioned sample rate conversion techniques is a function of the filter employed, which when properly designed depends on the width of the guardband. The guardband width should determine the width of the filter's transition band, which is the band of frequencies between the passband edge and the lower edge of the filter's stopband. The lower edge of the lowpass filter's stopband is the frequency above which all frequencies are attenuated at least as much as the stopband attenuation specification. Simply stated, the higher the quality of the filtering and the narrower the guardband, the more computationally intensive the converter. Important characteristics to consider when determining the quality of a filter include stopband attenuation, passband ripple, and numerical accuracy. Passband ripple in a sample rate converter results from variation in gain of the filter, and is expressed as a worst case deviation in decibels from nominal gain. A high quality sample rate converter for audio will generally have a passband ripple less than ±0.01 dB. While such values of ripple are far below the limit of audibility, they are important if the signal may be passed through a sample rate converter multiple times, as might occur in a recording studio.
The level of distortion in a sample rate converter is a function of the magnitude of aliases typically produced during decimation. These result from the non-ideal stop-band attenuation of the filter in any real sample rate converter, which will cause some aliases to be produced by the decimation step. Aliasing may be measured by testing the converter with a full level sine waveform of varying frequency, noting the peak magnitude of any undesired components using the standard “peak THD+N” measurement technique. The ratio of the full level signal to the THD+N expressed in decibels, is the alias rejection of the converter at that signal frequency, which is directly related to the stopband attenuation of the filter and to the distortion of the converter. A high quality sample rate converter will have an alias rejection of more than 96 dB throughout the audio spectrum. It is worthy of note that the distortion caused by inadequate alias rejection is proportional to signal level. This is in marked contrast to quantization distortion or dither noise, both of which remain at a constant level regardless of signal level. Numerical accuracy of a filter depends intimately on the design details of the filter. In sample rate converters, the filter is generally designed so that the distortion resulting from numerical inaccuracy is substantially below the distortion generated by aliasing.
Classical Algorithm Single Stage Interpolation when Converting to Equal or Higher Sampling Rate with Equal or Wider Passband
FIGS. 1A, 1B and 1C illustrate filter requirements for a prior art classical algorithm single stage interpolator when converting a signal to an approximately equal or higher sampling rate with an equal or wider passband, while meeting certain quality specifications. FIG. 1A shows the spectrum of the input signal at sample rate FSX, including passband frequency range 101, guardband 102, first image 103, second image 104, third image 105, and fourth image 106. The images are redundant copies of the original passband frequency range whose production is inherent in the process of digital sampling. Images above the fourth are not shown. FIG. 1B shows the requirements for the filter of a single stage classical interpolator, including passband ripple 111, narrow transition band 112 with width less than twice the width of the guardband, beginning no lower than the upper edge of input signal passband 101 and ending below the bottom edge of the first passband image 103, and stopband 113 with finite attenuation. FIG. 1C shows the result of the interpolator filtering operation with passband 121 unchanged (apart from the effects of passband ripple) and all images attenuated by the stopband, thus preventing any significant aliasing when decimation occurs in the final interpolator operation.
One fundamental decision regarding performance of sample rate converters involves the interpretation of the guardband, the frequencies between the top of the passband and the Nyquist frequency. In particular, the choice must be made as to whether the filter stopband should begin at the Nyquist frequency, or at the first image of the passband edge. The latter case is less conservative, but requires approximately a factor of two less computation. The audible consequences of this choice are fairly subtle. If the less conservative choice is made, any energy present in the incoming signal above the input passband will result in aliases greater than the stopband limit, while the more conservative approach will not produce aliases under any conditions. When the conversion is to a lower sample rate, the additional aliases will always lie above the passband of the output sample rate. When the conversion is to a higher rate, they will lie well above the passband of the incoming rate. In both cases, if the lower of the passbands is assumed to be the limit of hearing, the aliases will be inaudible and hence of no consequence, which is discussed below with respect to FIGS. 2A-2G and 3A-3H.
Output Sample Rate Greater than Input Sample Rate
FIG. 2A shows the spectrum of the input signal at sample rate FSX, including passband frequency range 201, input guardband 202, first image 203, second image 204, third image 205, and fourth image 206. Images above the fourth are not shown. Also shown is a signal component 207 within the guardband, and that signal's first image 208, and higher images 209. FIG. 2B shows a less conservative interpolator filter having a transition band 212 of twice the width of the input guardband. FIG. 2C shows the result of the interpolator filtering operation shown in FIG. 2B with signal 207 and image 208 both partially attenuated as filtered signals 227 and 228 respectively. FIG. 2D shows the result of the decimation operation on the signal of FIG. 2C when the output sample rate FSY is above the input rate, showing the results of signal 207 and image 208 above the input passband as signal components 237 and 238. FIG. 2E shows a more conservative interpolator filter having a transition band 242 of the same width as that of the guardband. FIG. 2F shows the result of the interpolator filtering operation shown in FIG. 2E with signal 207 partially attenuated as filtered signal 257, but image 208 completely attenuated to the stopband limit. FIG. 2G shows the result of the decimation operation on the signal of FIG. 2F when the output sample rate FSY is above the input rate, showing the results of signal 207 above the input passband as signal component 267, but notably with image 208 absent.
Output Sample Rate Less than Input Sample Rate
FIG. 3A shows the spectrum of the input signal at sample rate FSX, including passband frequency range 301, input guardband 302, first image 303, second image 304, third image 305, and fourth image 306. Images above the fourth are not shown. FIG. 3B shows the output sample rate FSY which is lower than the input rate FSX and consequently has smaller output passband 311 and output guardband 312. FIG. 3C shows a less conservative interpolator filter having a transition band 322 of twice the width of the output guardband 312. FIG. 3D shows the result of the interpolator filtering operation of FIG. 3C with a portion of the input passband 301 partially attenuated as filtered frequency range 337. FIG. 3E shows the result of the decimation operation on the signal shown in FIG. 3D when the output sample rate FSY is below the input rate, showing the results of signals in frequency range 337 as both signals 347 and aliases 348, both of which lie within the frequencies of the output guardband 312. FIG. 3F shows a more conservative interpolator filter having a transition band 352 of the same width as the output guardband 312. FIG. 3G shows the result of the interpolator filtering operation of FIG. 3F with a narrower portion of the input passband 301 partially attenuated as filtered frequency range 367. FIG. 3H shows the result of the decimation operation on the signal of FIG. 3G when the output sample rate FSY is below the input rate, showing the results of signals in frequency range 367 as signals 377 which lie within the frequencies of the output guardband 312, but with all aliases absent.
The performance measurement consequences of different guardband interpretations are more definite. If the converter is tested with input frequencies only below the lower of the passband limits, then either interpretation will meet the design specifications. However, if a converter going to a higher sample rate is tested with inputs containing energy in frequencies above the input passband limit, the aliases of these frequencies will exceed the stopband limit. If a converter going to a lower sample rate is tested with input frequencies above the output passband limit, and measured without a brick-wall filter at the output passband limit, then aliases above the stopband will be measured. In general, testing is limited to within the passbands, so the difference between the interpretations is generally not noted.
In the context of this disclosure, unless explicitly stated otherwise, it will be assumed that the guardband is interpreted in the less conservative manner, in which guardband aliases are acceptable. Thus for the case of the “classical” sample rate converter algorithm explained above, the filter is designed such that the transition band extends from the edge of the passband to the first image of the edge of the passband, thus having a width twice that of the guardband. In other words, for a comparable prior art “high quality” sample rate converter operating on audio at a sample rate of 48 kHz, the filter specifications would include passband ripple less than ±0.01 dB from 0 to 20 kHz, stopband rejection greater than 96 dB above 28 kHz, and a transition band of 8 kHz width from 20 to 28 kHz, as previously described for FIGS. 1A-1C.
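The arithmetic behind these band edges can be checked with a short calculation. The function name and dictionary keys below are hypothetical; the `conservative` flag selects between the two guardband interpretations discussed above.

```python
def interpolator_filter_spec(sample_rate_khz, passband_edge_khz, conservative=False):
    """Return the filter band edges (in kHz) implied by a guardband interpretation."""
    nyquist = sample_rate_khz / 2.0
    guardband = nyquist - passband_edge_khz
    if conservative:
        # Stopband begins at the Nyquist frequency.
        stopband_start = nyquist
    else:
        # Stopband begins at the first image of the passband edge.
        stopband_start = sample_rate_khz - passband_edge_khz
    return {
        "nyquist": nyquist,
        "guardband": guardband,
        "stopband_start": stopband_start,
        "transition_width": stopband_start - passband_edge_khz,
    }


print(interpolator_filter_spec(48, 20))
# {'nyquist': 24.0, 'guardband': 4.0, 'stopband_start': 28, 'transition_width': 8}
```

For 48 kHz audio the less conservative interpretation gives an 8 kHz transition band (20 to 28 kHz), twice the 4 kHz obtained with `conservative=True`.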
Different Types of Sample Rate Converters
FIGS. 4A-4E illustrate the frequency domain behavior of a number of common, low complexity, prior art sample rate converter techniques. FIG. 4A shows the spectrum of the input signal for any of these converters at a sample rate FSX including passband frequency range 401, input guardband 402, first image 403, second image 404, third image 405, and fourth image 406. Images above the fourth are not shown.
The simplest sample rate converter is called a “drop sample” interpolator, and uses a filter with a one sample wide rectangular impulse response, which can be implemented with no computation at all. The complexity is very low because no arithmetic operations are required, and the quality is very poor. The frequency response of the filter associated with this sample rate converter is found by taking the magnitude of the Fourier transform of its impulse response, which is $|\sin(\pi f/f_s)/(\pi f/f_s)|$, where $f$ is the frequency and $f_s$ is the sample rate. This frequency response is shown in FIG. 4B, and includes passband ripple 411 of magnitude approximately 2.6 dB, stopband rejection 412 of magnitude 13 dB, and transition band 413 which extends (at a sample rate of 48 kHz) from the 20 kHz input signal passband edge to 39 kHz, thus having a width of 19 kHz.
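A drop sample converter for a rational ratio L/M can be sketched in a few lines. The function name and the integer L/M parameterization are assumptions for illustration; the point is that each output simply repeats the most recent input sample, so only index arithmetic is involved.

```python
def drop_sample_convert(x, L, M):
    """Convert rate R to R*L/M by holding the most recent input sample."""
    n_out = len(x) * L // M
    # Output m lies at input position m*M/L; truncate to the sample at or before it.
    return [x[m * M // L] for m in range(n_out)]


print(drop_sample_convert([10, 20, 30, 40], 3, 2))
# [10, 10, 20, 30, 30, 40]
```

No multiplications or additions are applied to the signal values themselves, which matches the "no computation" description of this interpolator.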
The next simplest sample rate converter is a linear interpolator. The complexity is one multiply and two adds per output sample, and the quality is considerably better than drop sample. The impulse response of the associated filter, a triangular function two samples in width, has Fourier transform $\sin^2(\pi f/f_s)/(\pi f/f_s)^2$. The associated frequency response is shown in FIG. 4C, and includes passband ripple 421 of magnitude approximately 5.3 dB, stopband rejection 422 of magnitude 27 dB, and transition band 423 which extends (at a sample rate of 48 kHz) from the 20 kHz input signal passband edge to 39 kHz, thus having a width of 19 kHz.
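The one-multiply, two-add structure of the linear interpolator can be seen in the sketch below. The function name and the rational L/M parameterization are assumptions; the fractional position is derived from integer arithmetic to avoid accumulating rounding error.

```python
def linear_convert(x, L, M):
    """Convert rate R to R*L/M by linear interpolation between neighbors."""
    n_out = ((len(x) - 1) * L) // M + 1
    out = []
    for m in range(n_out):
        idx, rem = divmod(m * M, L)  # integer and fractional input position
        frac = rem / L
        a = x[idx]
        b = x[idx + 1] if idx + 1 < len(x) else x[idx]
        # One subtract, one multiply, one add per output sample.
        out.append(a + frac * (b - a))
    return out


print([round(v, 6) for v in linear_convert([0.0, 3.0, 6.0], 3, 2)])
# [0.0, 2.0, 4.0, 6.0]
```

A ramp input is reproduced exactly (up to rounding), which illustrates why this interpolator behaves well only for signals whose energy is concentrated at low frequencies.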
These two classical algorithm sample rate converters are generally classified as low quality and low complexity. While quantitative distortion measurements of these two converters can be made, the results for broadband signals are so poor as to be nearly meaningless. A linear interpolator works fairly well when the input signal energy is concentrated at very low frequencies (< 1/10 of the Nyquist frequency).
Sample rate converters using the classical algorithm can be constructed from higher order mathematical interpolation techniques related to the drop sample and linear interpolators. One such family of interpolators is the splines, also called B-splines or (for the one of third order) cubic splines. The Fourier transform of the Nth order spline interpolator has been shown to be $\sin^N(\pi f/f_s)/(\pi f/f_s)^N$, sometimes abbreviated $\mathrm{sinc}^N$. FIG. 4D shows the frequency response of several members (orders 3, 4, 7, 15 and 31) of the spline family, including passband ripples 431, stopbands 432, and transition bands 433 which in all cases extend (at a sample rate of 48 kHz) from the 20 kHz input signal passband edge to approximately 39 kHz, thus all having a width of 19 kHz. It is interesting to note that the transition band width of a spline is fixed and approximately independent of its order, and also to note that while the stopband rejection of higher order splines is quite good, the passband ripple of a spline becomes increasingly poorer with increased order. Thus, high order splines are typically not useful for high quality sample rate conversion.
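The growth of passband ripple with order follows directly from the power-of-sinc form of the response, and can be checked numerically. The function name below is hypothetical, and the exponent convention (1 for drop sample, 2 for linear) is taken from the formulas quoted in the text.

```python
import math


def sinc_power_db(f, fs, order):
    """Gain in dB of |sin(pi f/fs) / (pi f/fs)| raised to the given power."""
    x = math.pi * f / fs
    mag = 1.0 if x == 0 else abs(math.sin(x) / x)
    if mag == 0.0:
        return float("-inf")  # exact null of the response
    return 20.0 * order * math.log10(mag)


# Gain at the 20 kHz passband edge for a 48 kHz sample rate.
for order in (1, 2, 3, 4, 7):
    print(order, round(sinc_power_db(20e3, 48e3, order), 2))
```

The droop at the passband edge is about 2.64 dB for the first power and scales linearly with the exponent, consistent with the approximately 2.6 dB and 5.3 dB figures given for the drop sample and linear interpolators.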
Another family of interpolators are polynomial interpolators, which are generally implemented according to the method of Lagrange, hence called Lagrangian interpolators. A closed form of the frequency response of the Nth order Lagrangian interpolator is too complex to reproduce here, but FIG. 4E shows the frequency response of several members of this family (orders 3, 4, 7, 15 and 31). Note passband ripples 441, stopbands 442, and transition bands 443. It is noteworthy that while the transition bands of higher order Lagrangian interpolators become narrower with increasing order, the stopband rejection remains quite poor. Thus higher order Lagrangian interpolators are not useful for high quality sample rate conversion.
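For reference, evaluating a Lagrangian interpolator at a fractional position can be sketched as below. The function name is hypothetical, and the samples are assumed to lie at integer positions 0, 1, 2, and so on; a converter would slide this window along the input so that the evaluation point stays near its center.

```python
def lagrange_interpolate(samples, t):
    """Evaluate the polynomial through samples[i] at position i, at point t."""
    total = 0.0
    for i, y_i in enumerate(samples):
        weight = 1.0
        for j in range(len(samples)):
            if j != i:
                weight *= (t - j) / (i - j)  # Lagrange basis factor
        total += weight * y_i
    return total


# Four samples of y = k^2; a cubic fit recovers the quadratic exactly.
print(round(lagrange_interpolate([0.0, 1.0, 4.0, 9.0], 1.5), 6))  # 2.25
```

The weights depend only on the fractional position, so in a fixed-ratio converter they could be precomputed per phase rather than recalculated for each output.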
Drop sample, linear, spline and Lagrangian interpolators have also been used in multistage systems for sample rate conversion. Because of the above mentioned limitations in filter quality for such interpolators, a high degree of oversampling (typically 128 times) must be used to achieve high quality. The computational complexity of the required oversampling is a major drawback to this approach.
Sample rate converters using the classical algorithm can also be constructed using FIR (Finite Impulse Response) filters. For converters where the sample rate ratio is fixed, polyphase FIR filters are generally used. For variable sample rate ratios, an FIR filter impulse response is typically stored in a table and interpolated, most commonly using linear interpolation.
The FIR filters used in such sample rate converters are generally designed to meet the requirements illustrated in FIGS. 1A-1C. When a windowed sinc function is used, the width of the transition band and the filter quality can be independently controlled, but in general there is no independent control of stopband rejection and passband ripple. When optimization methods, such as linear programming or Remez exchange, are used, independent and precise control of passband ripple, stopband rejection, and transition band width can be accomplished, even to the extent of further dividing the bands to produce filters with more precisely controlled frequency responses.
Intermediate quality sample rate converters can be constructed with FIR filter orders from four to sixteen. These converters can be specified for distortion of wideband signals, although in general the results are substantially inferior to distortion measurements for other digital audio subsystems. An Nth order FIR sample rate converter, using linear interpolation of the FIR filter coefficients, will have a computational complexity of 2N multiplies (per tap, one for the convolution and one for the linear interpolation) and 3N additions (per tap, one for the convolution and two for the linear interpolation) per output sample.
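The per-tap operation count can be seen in the following sketch of a table-driven FIR converter. Everything specific here is an assumption for illustration: the function names, the Hann-windowed sinc prototype, the table resolution of 64 entries per input sample, and the restriction to conversions toward an equal or higher rate (`step` no greater than 1).

```python
import math


def build_table(half_taps=8, res=64):
    """Sample a Hann-windowed sinc at res points per input sample interval."""
    table = []
    for i in range(2 * half_taps * res + 1):
        t = i / res - half_taps
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.5 + 0.5 * math.cos(math.pi * t / half_taps)
        table.append(sinc * window)
    return table


def fir_convert(x, step, half_taps=8, res=64):
    """step = input rate / output rate (assumed <= 1). Uses N = 2*half_taps taps."""
    table = build_table(half_taps, res)
    out = []
    for m in range(int((len(x) - 1) / step) + 1):
        p = m * step
        n = int(p)
        frac = p - n
        acc = 0.0
        for k in range(-half_taps + 1, half_taps + 1):
            j = n + k
            if 0 <= j < len(x):
                pos = (half_taps - k + frac) * res  # table address for h(frac - k)
                i0 = int(pos)
                mu = pos - i0
                # Coefficient interpolation: one multiply, two adds.
                coeff = table[i0] + mu * (table[i0 + 1] - table[i0])
                # Convolution: one multiply, one add.
                acc += coeff * x[j]
        out.append(acc)
    return out


x = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
y = fir_convert(x, 0.5)  # double the sample rate
print(len(y))  # 127
```

The address computation is counted separately from the signal arithmetic; in a fixed-point implementation it typically reduces to a phase accumulator and shifts.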
High quality sample rate converters attempt to equal the performance of typical high fidelity digital audio subsystems. They typically involve FIR filters with orders from thirty-two to several hundred.
The filter associated with a sample rate converter may be either an infinite impulse response (IIR) filter or a finite impulse response (FIR) filter. Applying an IIR filter in the “classical” algorithm is usually useful when efficiency is not of great concern and R/R′ (the ratio of the sample rate R to the new sample rate R′) is a ratio of small integers. The primary disadvantage of the method is that the IIR filter is only efficient when the intermediate rate LR is low, because the IIR filter is recursive: each output sample from the IIR filter depends computationally on the previous outputs, and thus every sample must be computed at the LR rate. Also, it is not practical to vary R/R′ in real time because the restriction to small integer ratios (allowing LR to be small enough to be realizable) causes changes in rate to abruptly alter coefficients, producing audible defects.
Applying an FIR filter in the “classical” algorithm is much more practical than the IIR filter, because the FIR filter is not recursive. Output samples which are dropped during decimation need not be calculated. Similarly, multiplicative operations upon inserted zeroes in the incoming sample stream need never be performed. This implies that the computational complexity of this approach is independent of the value of L. In other words, a fixed number of multiply-add operations must be performed for each output sample. In general, despite the poorer efficiency of FIR filters in terms of computational steps to produce a desired filter specification, this approach is superior to the IIR approach except possibly when R/R′ is a small integer ratio.
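The savings described above can be illustrated by computing only the retained outputs and only the products with nonzero inputs. This is a sketch under assumed names (`resample_direct`, `resample_polyphase`) with arbitrary example coefficients; the direct form is included only so the two results can be compared.

```python
def resample_direct(x, h, L, M):
    """Reference: zero-stuff by L, filter every sample, then keep every M-th."""
    stuffed = []
    for s in x:
        stuffed.append(s)
        stuffed.extend([0.0] * (L - 1))
    filtered = [sum(h[k] * stuffed[n - k] for k in range(len(h)) if n - k >= 0)
                for n in range(len(stuffed))]
    return filtered[::M]


def resample_polyphase(x, h, L, M):
    """Same result, skipping inserted zeros and discarded outputs."""
    n_out = (len(x) * L + M - 1) // M
    out = []
    for m in range(n_out):
        t = m * M                    # index at the intermediate rate L*R
        phase, base = t % L, t // L  # only taps k = phase, phase+L, ... meet real samples
        acc = 0.0
        for q, k in enumerate(range(phase, len(h), L)):
            j = base - q
            if j < 0:
                break
            acc += h[k] * x[j]
        out.append(acc)
    return out


h = [1 / 3, 2 / 3, 1.0, 2 / 3, 1 / 3]  # arbitrary example coefficients
x = [1.0, 2.0, -1.0, 0.5, 3.0]
a = resample_direct(x, h, 3, 2)
b = resample_polyphase(x, h, 3, 2)
print(max(abs(p - q) for p, q in zip(a, b)) < 1e-12)  # True
```

Each output uses at most ceil(len(h)/L) multiply-adds, so when the filter length grows in proportion to L the work per output sample stays fixed.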
As can be seen, the main drawback of high quality sample rate converters involves the mathematical complexity that typically results from providing high quality filtering.
What is needed, therefore, is a sample rate conversion technique of high quality and reduced computational complexity.