Sampling-rate conversion is the process of converting a discrete-time signal x[n] sampled at a rate FSin to another signal y[m] sampled at a rate FSout. If the ratio FSout/FSin is a constant then “synchronous” sampling-rate conversion may be used. A common example of sampling-rate conversion is when a portable audio player is required to accept content at a variety of input sampling-rates, FSin={32 kHz, 44.1 kHz, 48 kHz} and then must convert the stream to a fixed output rate of FSout=48 kHz. Mechanisms for the conversion of the sample rate of a digital signal to a different sample rate can be found in many modern digital signal processing implementations such as audio processors and digital modems. The mechanism can be implemented using a dedicated digital hardware module or as a software implementation, for example, using a digital signal processor.
A Synchronous Sample Rate Converter (SSRC) is used when the ratio, R, between the input sample rate FSin and the output sample rate FSout is fixed and known in advance. The SSRC is given this ratio, R, as a configuration parameter. Using this parameter, the SSRC processes the input signal and changes its sample rate from FSin to FSout.
In the FIG. 1A below, a block diagram of the interpolator for an example of a synchronous sample rate converter (SSRC 1102) is depicted.
The input signal 1015 is at the sample rate of FSin, the input clock signal 1015 is at a frequency of FSin, and the output clock signal 1030 is at a frequency of FSout. The Rate Ratio, R 1050, is a fixed configuration parameter that is programmed into the interpolator. Ideally $R = 2^{b} \times FS_{in}/FS_{out}$, where b is the number of bits used to represent R as a fixed point number. The output signal 1025, y, is at the sampling rate of FSout. The output valid signal indicates a valid output signal, y. For example, a low to high transition of the output valid signal would indicate a valid output signal, y, and the rate of low to high transitions of this signal would therefore be equal to FSout. The output valid signal may be a delayed version of the output clock.
A brief explanation of the operation, referring to FIG. 1A follows. Input samples of the input signal, x, are written to the input buffer 1101 of SSRC 1102 at the rate of FSin. The input clock transitions, low to high, indicate that a new input sample should be written to the buffer. On a low to high transition of the output clock the output sampling phase is updated such that phase(n)=phase(n−1)+R. The result is that phase(n)=n R. A low to high transition of the output clock indicates that a new output sample should be calculated and the control mechanism processes the current phase value, nR, and generates two control signals: input buffer “read control” signal and interpolation “filter control”. The input buffer “read control” signal determines the buffer entry from where samples are fetched to be used in the filtering operation, and the interpolation “filter control” is used to control the phase of the interpolator for proper filtering operation. The interpolation filter then filters the input samples with a proper delay shift, as per the control signal. The entire sequence of updating the phase, generating the filter and buffer read controls, the read operation of the samples from the buffer and the filtering operation occurs once on each low to high transition of the output clock.
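The phase update described above can be sketched in a few lines of Python. This is only an illustrative sketch: the source does not state how the control mechanism derives the two control signals from the phase, so the split used here (integer part of the phase as the buffer "read control", fractional part as the interpolation "filter control") is an assumption, as are the function name `phase_controls` and the example values.

```python
def phase_controls(ratio_fixed, frac_bits, num_outputs):
    """Yield (read_index, filter_phase) once per output clock rising edge.

    ratio_fixed is R scaled by 2**frac_bits (assumed fixed point layout).
    """
    phase = 0
    mask = (1 << frac_bits) - 1
    for _ in range(num_outputs):
        phase += ratio_fixed              # phase(n) = phase(n-1) + R
        read_index = phase >> frac_bits   # assumed: integer part selects the buffer entry
        filter_phase = phase & mask       # assumed: fractional part selects the filter phase
        yield read_index, filter_phase


# Hypothetical example: R = 0.75 with 8 fractional bits (0.75 * 256 = 192)
controls = list(phase_controls(ratio_fixed=192, frac_bits=8, num_outputs=4))
```

In this sketch the read index advances by three buffer entries for every four output samples, which is the behavior expected of a ratio of 0.75 input samples per output sample.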
It is desirable that the sample rate ratio, R=FSin/FSout, is represented as a fixed point number. However, if the number of bits used to represent the ratio is limited, R cannot always be represented as a fixed point number without some loss of accuracy. For example, if FSin=44.1 kHz and FSout=48 kHz, the ratio is R=0.91875. Representing this ratio exactly as a fixed point number with 32 bits or less is not possible. If using b bits, the desired fixed point value for R is $2^{b} \times R$. So, for example, if b=26, the desired value is $2^{26} \times 0.91875 = 61656268.80$, which is not an integer. Therefore, if the number of bits is not sufficient, some form of quantization must be introduced. The quantization of the sample rate ratio results in a phase that drifts away from the desired sample phase. As a result, if no special mechanism is implemented, this drift will eventually cause the SSRC sample buffer to overrun or under-run.
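The quantization in the 44.1 kHz to 48 kHz example can be checked with exact rational arithmetic. The following is a minimal sketch; the function name `quantize_ratio` and the choice of round-to-nearest are assumptions made for illustration, since the text does not specify the rounding rule.

```python
from fractions import Fraction


def quantize_ratio(fs_in, fs_out, bits):
    """Return (ideal scaled ratio, quantized integer, relative error)."""
    ideal = Fraction(fs_in, fs_out) * (1 << bits)  # exact 2**b * FSin/FSout
    quantized = round(ideal)                       # assumed: round to nearest integer
    alpha = (quantized - ideal) / ideal            # relative offset introduced by rounding
    return ideal, quantized, alpha


ideal, quantized, alpha = quantize_ratio(44100, 48000, 26)
print(float(ideal))   # 61656268.8
print(quantized)      # 61656269
print(float(alpha))   # about 3.2e-09, i.e. roughly 0.003 ppm
```

Because 0.91875 = 147/160 and 160 contains a factor of 5, the scaled value never becomes an integer for any finite b, so some rounding error remains whatever word length is chosen.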
To explain this problem we will first analyze a scenario where the ratio R is accurate and no drift is present in the system. Assume the elastic buffer depth is C samples/entries. Although there is no drift the phase will wander around its nominal value due to jitter that is present in the system. This jitter is mostly comprised of low frequencies and hence C should be made large enough so that the likelihood of the buffer pointer wandering more than C/2 buffer entries is negligible. Hence, having a “guard band” of C/2 entries for the buffer pointer to wander to the left or to the right side ensures that a buffer overrun or under-run is highly improbable. The actual value of C is determined by the statistical properties of the jitter of the particular use case and is beyond the scope of this discussion. The above mechanism has no direct effect on SNR.
A second analysis is provided for a scenario where the value of R is not accurate, and the buffer pointer drifts.
Let the ideal ratio be
$R_i = 2^{b} \times FS_{in}/FS_{out}$  (1)
and let the actual ratio be
$R = R_i(1+\alpha)$  (2)
where α is the frequency offset. Hence,
$\alpha = (R - R_i)/R_i$  (3)
The buffer write operation is at the input sample rate, FSin, and the read operation from the buffer is at a rate of $R/2^{b} \times FS_{out}$.
Now, from (2) and (1),
$R/2^{b} \times FS_{out} = R_i(1+\alpha)/2^{b} \times FS_{out} = (1+\alpha)\,FS_{in}$  (4)
The period between two consecutive buffer write operations, Tin, is
$T_{in} = 1/FS_{in}$
Hence, the average period, Pave, between two consecutive buffer read operations is
$P_{ave} = 1/[FS_{in}(1+\alpha)] = T_{in}(1+\alpha)^{-1}$
As α is very small,
$P_{ave} \approx T_{in}(1-\alpha)$
The drift per input sample time, D, is the difference between the write period and the read period,
$D = T_{in} - T_{in}(1-\alpha) = \alpha T_{in}$
Hence, the time until the buffer pointer drifts one buffer entry from its nominal position (the middle of the buffer) is
$T_{in}/|\alpha|$
Using the example of FSin=44.1 kHz and FSout=48 kHz, the ideal ratio is 0.91875. Representing this ratio as a fixed point number with 26 bits,
$R_i = 2^{26} \times 0.91875 = 61656268.80$
Thus,
$R = 61656269$
and
$\alpha = (R - R_i)/R_i \approx 0.003\ \text{ppm}$
Hence, the time until the buffer pointer drifts one buffer entry from its nominal position is
$T_{in}/|\alpha| = 1/(FS_{in}|\alpha|) \approx 116\ \text{minutes}$
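The figure of approximately 116 minutes can be reproduced numerically. The sketch below assumes round-to-nearest quantization of the ratio and uses a hypothetical helper name, `time_to_drift_one_entry`; it simply evaluates Tin/|α| for the values given in the example.

```python
from fractions import Fraction


def time_to_drift_one_entry(fs_in, fs_out, bits):
    """Seconds until the buffer pointer drifts one entry: Tin/|alpha|."""
    ideal = Fraction(fs_in, fs_out) * (1 << bits)  # exact 2**b * FSin/FSout
    alpha = (round(ideal) - ideal) / ideal         # assumed: round-to-nearest quantization
    if alpha == 0:
        return None                                # ratio exactly representable: no drift
    return float(1 / (fs_in * abs(alpha)))         # 1/(FSin*|alpha|)


seconds = time_to_drift_one_entry(44100, 48000, 26)
minutes = seconds / 60   # between 116 and 117 minutes
```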
An Asynchronous Sample Rate Converter (ASRC) is used when the ratio, R, between the input sample rate FSin and the output sample rate FSout is not fixed and not known in advance. In this case it is likely that the sample rate ratio will vary over time. FSin and FSout may be derived from different clock sources and hence are asynchronous.
As shown in FIG. 1D, an ASRC is comprised of two major parts: a phase tracking unit and an interpolator.
The phase tracking unit processes the two clock signals, clock_in and clock_out, and calculates an accurate estimate of a frequency ratio Rc. The interpolator then uses this frequency ratio, Rc, to perform the actual sample rate conversion.
Referring to FIG. 1D:
clock_in is a clock signal at a frequency $f_{clock\_in}$ that relates to the input signal sample rate
clock_out is a clock signal at a frequency $f_{clock\_out}$ that relates to the output signal sample rate
FSin is the sample rate of the input signal
FSout is the sample rate of the output signal
The sample rate ratio is $R = FS_{in}/FS_{out}$
Rc is the clock frequency ratio, where $R_c = f_{clock\_in}/f_{clock\_out}$
In the general case,
$f_{clock\_in} = 2^{N_{in}} \cdot S_{1,in}/S_{2,in} \cdot FS_{in}$
where Nin, S1,in, and S2,in are integers, and
$f_{clock\_out} = 2^{N_{out}} \cdot S_{1,out}/S_{2,out} \cdot FS_{out}$
where Nout, S1,out, and S2,out are integers.
In practice, therefore, the output, Rc, of the phase tracking unit is:
$$
\begin{aligned}
R_c &= f_{clock\_in}/f_{clock\_out} \\
&= \left(2^{N_{in}} \cdot S_{1,in}/S_{2,in} \cdot FS_{in}\right) / \left(2^{N_{out}} \cdot S_{1,out}/S_{2,out} \cdot FS_{out}\right) \\
&= 2^{(N_{in}-N_{out})} \cdot (S_{1,in} \cdot S_{2,out})/(S_{2,in} \cdot S_{1,out}) \cdot (FS_{in}/FS_{out}) \\
&= 2^{N} \cdot S_1/S_2 \cdot (FS_{in}/FS_{out})
\end{aligned}
$$
where N, S1 and S2 are integers.
Hence,
$R_c = 2^{N} \cdot S_1/S_2 \cdot R$
and
$R = 2^{-N} \cdot S_2/S_1 \cdot R_c$
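The relationship between Rc and R can be illustrated with exact rational arithmetic, which avoids any quantization of S2/S1. This is a sketch only: the function name `ratio_from_clock_ratio` is hypothetical, and the numerical values are taken from the 24,576 kHz SlimBus example in this description (N=9, S1=160, S2=147, with a 48 kHz output clock).

```python
from fractions import Fraction


def ratio_from_clock_ratio(rc, n, s1, s2):
    """Apply R = 2**(-N) * (S2/S1) * Rc without rounding."""
    return Fraction(rc) * Fraction(s2, s1) / (Fraction(2) ** n)


# Values from the SlimBus example: 24.576 MHz input clock, 48 kHz output clock
rc = Fraction(24_576_000, 48_000)                     # Rc = 512
r = ratio_from_clock_ratio(rc, n=9, s1=160, s2=147)   # Fraction(147, 160), i.e. 0.91875
```

A hardware or fixed point software implementation cannot carry S2/S1 as an exact fraction in this way, which is the source of the inaccuracy discussed in the text.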
Therefore, in order to scale Rc, a direct solution would be to multiply the value of Rc by $2^{-N} \cdot S_2/S_1$. Whereas scaling by $2^{-N}$ is straightforward, being a simple shift operation, scaling by S2/S1 involves an integer ratio. It is desirable that the scaling factor, S2/S1, is represented as a fixed point number. However, if the number of bits used to represent the scaling factor is limited, S2/S1 cannot always be represented as a fixed point number without some loss of accuracy. As a result, the calculated R is also inaccurate.
As another example, referring to FIG. 1C, an application processor transfers an audio stream at 44.1 ksps to the interface IC, which contains an asynchronous sample rate converter (ASRC). This 44.1 ksps stream is transferred over the SlimBus, which uses a basic clock of 24,576 kHz. The SlimBus scheme allocates certain slots of its bus bandwidth to transfer the 44.1 ksps stream, but the clock remains at 24,576 kHz. The ASRC inside the interface IC converts the 44.1 ksps stream to a 48 ksps stream, which is sent over the I2C bus to the audio codec. FIG. 1B depicts the ASRC of this scenario.
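As shown in FIG. 1D, the following relationships apply:
$R = FS_{in}/FS_{out} = 44100/48000 = 0.91875$
$f_{clock\_in} = 24{,}576\ \text{kHz}$ and $f_{clock\_out} = 48\ \text{kHz}$
$f_{clock\_in} = 2^{N_{in}} \cdot S_{1,in}/S_{2,in} \cdot FS_{in} = 2^{N_{in}} \cdot S_{1,in}/S_{2,in} \cdot 44100 = 24576000$
Hence,
$R_c = f_{clock\_in}/f_{clock\_out} = 24576/48$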
The relationship between the available 24576 kHz clock and the desired 44.1 kHz clock is:
$$
\begin{aligned}
f_{clock\_in} &= 2^{N_{in}} \cdot S_{1,in}/S_{2,in} \cdot 44100 = 24576000 \\
&= 2^{9} \cdot 160/147 \cdot 44100 = 24576000
\end{aligned}
$$
Hence,
$R_c = 2^{N} \cdot S_1/S_2 \cdot R$
As a result, in this example, scaling by a factor of $2^{-9} \cdot 147/160$ is required in order for the interpolator to use the correct rate ratio, i.e.
$FS_{in} = f_{clock\_in} \cdot 2^{-N_{in}} \cdot S_{2,in}/S_{1,in}$
$44100 = 24576000 \cdot 2^{-9} \cdot 147/160$
As pointed out previously, scaling by $2^{-N}$ is simple, but a fixed point binary representation of a value $S_f = S_{2,in}/S_{1,in}$ such as 147/160 without loss of accuracy requires an infinite number of bits. Note that for this specific example $S_{2,in}/S_{1,in} = 147/160$ equals the fractional representation of the ideal ratio for R, i.e. 147/160=0.91875.
Representing this ratio accurately as a fixed point number would require an infinite number of bits; hence, to make the fixed point representation feasible, quantization must be introduced. As a result, the calculated sample rate ratio, R, is also inaccurate. The inaccuracy in the sample rate ratio results in a difference between the desired phase and the actual phase, and unless special mechanisms are introduced, the phase drift will eventually produce an overrun or an under-run in the interpolator input buffer.
When the value of R is not accurate, there will be drift. This can be explained as follows:
Let the actual scaling factor be
$S_f = S_{fi}(1+\alpha)$  (1)
where α is the frequency offset and $S_{fi} = S_2/S_1$ is the desired scaling factor without inaccuracy due to quantization.
Let the ideal sample rate ratio be
$R_i = 2^{-N} \cdot S_{fi} \cdot R_c = FS_{in}/FS_{out}$  (2)
The actual ratio is
$R = 2^{-N} \cdot S_f \cdot R_c = 2^{-N} \cdot S_{fi} \cdot (1+\alpha) \cdot R_c = R_i(1+\alpha)$  (3)
Hence,
$\alpha = (R - R_i)/R_i$  (4)
The buffer write operation is at the input sample rate, FSin, and the read operation from the buffer is at a rate of $R \times FS_{out}$.
Now, from (3) and (2),
$R \times FS_{out} = R_i(1+\alpha) \times FS_{out} = (1+\alpha)\,FS_{in}$  (5)
The period between two consecutive buffer write operations, Tin, is
$T_{in} = 1/FS_{in}$
Hence, the average period, Pave, between two consecutive buffer read operations is
$P_{ave} = 1/[FS_{in}(1+\alpha)] = T_{in}(1+\alpha)^{-1}$
As α is very small,
$P_{ave} \approx T_{in}(1-\alpha)$
The drift per input sample time, D, is the difference between the write period and the read period,
$D = T_{in} - T_{in}(1-\alpha) = \alpha T_{in}$
Hence, the time until the buffer pointer drifts one buffer entry from its nominal position (the middle of the buffer) is
$T_{in}/|\alpha|$
Using the example of a scaling factor of 147/160, the ideal scaling factor is 0.91875. Representing this ratio as a fixed point number with 26 bits,
$2^{b} \cdot S_{fi} = 2^{26} \times 0.91875 = 61656268.80$
Thus,
$2^{b} \cdot S_f = 61656269$
and
$\alpha = (R - R_i)/R_i = (S_f - S_{fi})/S_{fi} \approx 0.003\ \text{ppm}$
Hence, the time until the buffer pointer drifts one buffer entry from its nominal position is
$T_{in}/|\alpha| = 1/(FS_{in}|\alpha|) \approx 116\ \text{minutes}$
The drift rate, D, can be reduced and sometimes avoided if a sufficient number of bits are used for the fixed point representation of the sample ratio clock Rc. However, in a hardware implementation, increasing the number of bits will result in a higher area and increased power consumption and in a software implementation, increasing the number of bits above the processor word width, say 32 bit, would then require double precision arithmetic, resulting in higher MIPS requirement and increased power consumption. Another approach is to increase the buffer depth such that the time before the overrun occurs can be longer, as depicted in the FIG. 1A.
If the buffer size is C and only drift is present, with no jitter, then the overrun or under-run will occur after a period of $(T_{in} \cdot C)/(2|\alpha|)$.
The drift rate, D, can be reduced or avoided if a sufficient number of bits is used for the fixed point representation of the sample rate ratio R. However, in a hardware implementation, increasing the number of bits results in a larger area and increased power consumption. In a software implementation, increasing the number of bits above the processor word width, say 32 bits, requires double precision arithmetic, resulting in a higher MIPS requirement and increased power consumption. Another approach is to increase the buffer depth of the SSRC so that the time before the overrun occurs is longer, as depicted in FIG. 1A. If the buffer size is C and only drift is present, with no jitter, then the overrun or under-run will occur after a period of $(T_{in} \cdot C)/(2|\alpha|)$.
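The effect of buffer depth on the time to overrun can be tabulated directly from the expression above. The sketch below is illustrative only: the helper name `time_to_overrun` and the buffer depths are hypothetical, and the value of α is the offset obtained when 2^26 × 0.91875 is rounded to the nearest integer.

```python
def time_to_overrun(fs_in_hz, alpha, depth):
    """Seconds until overrun or under-run with drift only: (Tin*C)/(2*|alpha|)."""
    if alpha == 0:
        return float("inf")            # no drift, so no drift-induced overrun
    t_in = 1.0 / fs_in_hz              # period between buffer writes
    return (t_in * depth) / (2.0 * abs(alpha))


alpha = 1 / 308281344                  # 26-bit rounding of the 44.1 kHz to 48 kHz ratio
for depth in (8, 16, 32):              # hypothetical buffer depths C, in entries
    hours = time_to_overrun(44100, alpha, depth) / 3600
    print(depth, round(hours, 1))
```

The time grows only linearly with C, so doubling the buffer depth doubles the time before the overrun or under-run but does not remove the drift.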
For the drift that occurs in the SSRC scenario, drift avoidance can be achieved by implementing an asynchronous sample rate converter (ASRC). In an ASRC, the input and the output clocks are derived from different sources. Effectively, an ASRC 1008 adds a phase tracking unit 1102 to the SSRC 1102, as shown in FIG. 1B.
The phase tracking unit has two clock input signals, Fin and Fout. The first clock signal, Fin, is at the input sample rate, FSin, and the other, Fout, is at the output sample rate, FSout. The phase tracking unit output is a value representing the sample rate ratio R in a fixed point format. In this case, however, the value of R is not fixed but fluctuates between several values in a manner that ensures that the average exactly matches the desired sample rate ratio. In this way the drift is completely avoided. A sufficient number of bits is selected for the fixed point representation of R to ensure that the SNR degradation due to jitter resulting from the fluctuations is negligible. In a typical implementation, the number of bits required to ensure negligible jitter is much smaller than the number of bits required to extend the time before the buffer overruns or under-runs. However, adding the tracking unit to the SSRC increases the complexity, area and power consumption of the rate converter. In addition, before actual sample rate conversion can start, the tracking unit must first lock onto the frequency ratio. Note also that, in certain cases, the actual clocks representing the sample rates are not readily available to be processed.
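The idea that a ratio word can fluctuate between values while its average matches the desired ratio exactly can be illustrated with a simple error accumulator. This sketch is not the phase tracking unit described above: a tracking unit derives the ratio from the two clocks, whereas this example assumes the ideal ratio 147/160 is already known and only shows how alternating between two adjacent fixed point values removes the average error. The function name `dithered_ratio` and the 16-bit word length are hypothetical.

```python
def dithered_ratio(num, den, bits, count):
    """Yield integer ratio words whose average tends to 2**bits * num/den."""
    target_num = num << bits               # numerator of the ideal scaled ratio
    base, rem = divmod(target_num, den)    # floor value and leftover fraction rem/den
    acc = 0
    for _ in range(count):
        acc += rem                         # accumulate the fractional error
        if acc >= den:                     # error has reached one least significant bit
            acc -= den
            yield base + 1
        else:
            yield base


words = list(dithered_ratio(147, 160, bits=16, count=160))
average_is_exact = sum(words) * 160 == (147 << 16) * len(words)
```

Over each block of 160 outputs the words sum to exactly 160 times the ideal scaled ratio, so no long-term drift accumulates even though each individual word is quantized.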
A full duplex ASRC performs the conversion in both directions simultaneously. FIG. 1E below is a representation of a full duplex ASRC.
In the case of bi-directional traffic, there are two ASRCs 1008 and 1009: a first ASRC 1008 for converting signal FS A to signal FS B, and a second ASRC 1009 for converting signal FS B to signal FS A, with the corresponding clock frequency ratios being Rc and 1/Rc respectively. The conventional method would be to simply use two phase tracking blocks, as shown in FIG. 1E. A scheme is proposed that uses just one phase tracking block with two interpolators.