1. Field of the Application
Generally, this application relates to multimedia data processing. More specifically, it relates to methods and systems for sample rate conversion of multimedia data and for sample clock synchronization.
2. Description of the Related Art
Many systems today use or interact with digital multimedia data. One common example is a device capable of playing digital audio music. Typical audio standards for such a device use sampling frequencies of 8, 11.025, 22.05, 44.1, 48, 96 and 192 kilohertz (kHz). These audio samples can be manipulated in a variety of ways. They can be created, stored, mixed or altered using a personal computer, or played using any of a multitude of play-back devices. Often, audio samples are created at one sample rate and played back at a different rate. Further, multiple audio samples created using various sampling rates can be mixed to produce a single audio sample for play-back at a single sampling rate. In these situations, one or more of the audio samples must be converted to the rate of another audio sample before mixing and/or playback can occur.
Generally, sample rate conversion is performed by dedicated software or hardware. While many approaches to sample rate conversion are used today, satisfactory results are difficult to achieve. One reason for this difficulty is that the human ear is quite sensitive to slight distortions or discontinuities in audio samples. Coarse sample rate conversion produces noticeable distortion. Also, conventional sample rate conversion techniques utilize a frame-based sample rate converter that is best suited for fixed sample rate conversions, that is, from one known sample rate to another known sample rate. Use of this frame-based rate converter does not allow for phase corrections.
FIG. 1 illustrates using a first-in, first-out (FIFO) buffer and a phase detector to alter the sampling ratio of a sample rate converter as is known in the art today. FIFO 110 is written with an input audio sample for each pulse of the input clock 150, which operates at input frequency F0. Sample rate converter 120 reads digital samples from FIFO 110 and outputs digital samples at the output frequency F1 in response to an output clock 170. Sample rate converter 120 generates derived clock 160 from output clock 170 by multiplying the output clock by Q and dividing by P. Thus derived clock 160 has a derived frequency F2 of (Q/P)*F1. Q and P are chosen so that F2 is about the same as input frequency F0. Thus, FIFO 110 is read and written at about the same frequency. When Q/P is not exactly the same as the ratio of F0 to F1, FIFO 110 is read and written at slightly different rates, so that over time FIFO 110 can fill up or become empty. Samples can then over-write earlier samples, or random or null data can be output as a sample. Thus simply using a FIFO can produce undesirable audio noise.
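By way of illustration only, the relationship F2 = (Q/P)*F1 and the resulting FIFO drift can be sketched in Python as follows. The function names, the FIFO depth, and the use of a bounded-denominator rational approximation to choose Q and P are assumptions made for this sketch and are not taken from FIG. 1.

```python
from fractions import Fraction


def choose_ratio(f0_hz, f1_hz, max_denominator=1000):
    """Return Q/P as a Fraction approximating F0/F1 (integer frequencies in Hz).

    The bound on the denominator is an illustrative assumption.
    """
    return Fraction(f0_hz, f1_hz).limit_denominator(max_denominator)


def derived_frequency(ratio, f1_hz):
    """Derived clock frequency F2 = (Q/P) * F1."""
    return ratio * f1_hz


def seconds_until_fifo_fault(f0_hz, f2_hz, depth, start_fill):
    """Time until the FIFO overflows or underflows, or None if rates match.

    The FIFO is written at F0 and read at F2, so its fill level changes
    by (F0 - F2) samples per second.
    """
    drift = f0_hz - f2_hz
    if drift == 0:
        return None
    if drift > 0:
        return (depth - start_fill) / drift  # fills up (overflow)
    return start_fill / -drift  # drains (underflow)
```

Under these assumptions, a nominal F0 of 44100 Hz and F1 of 48000 Hz give Q/P = 147/160 and F2 = 44100 Hz. If the actual input clock runs at 44144 Hz, a hypothetical 64-sample FIFO that starts half full overflows after 32/44 of a second.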
As further shown in FIG. 1, sample rate converter 120 is capable of varying the ratio Q/P in response to adjust signal 130 from phase detector 140. Phase detector 140 compares the instantaneous phase and frequency of input clock 150 to those of derived clock 160 generated by sample rate converter 120. When the phase or frequency of input clock 150 deviates from that of derived clock 160, phase detector 140 alters adjust signal 130. Sample rate converter 120 responds to adjust signal 130 by increasing or decreasing the ratio Q/P, thus altering derived clock 160. When derived clock 160 has been adjusted sufficiently to match the phase and frequency of input clock 150, adjust signal 130 stabilizes, causing sample rate converter 120 to stop adjusting derived clock 160. Changes in input clock 150 are thus tracked by sample rate converter 120 in a manner similar to a phase-locked loop (PLL).
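The PLL-like tracking described above can be approximated, for illustration only, by a simple proportional feedback loop. The sketch below is a simplification that tracks frequency only and does not model phase; the function name, the loop gain, and the step count are assumptions and do not describe the actual circuit of FIG. 1.

```python
def track_input_clock(f0_hz, f1_hz, ratio, gain=0.5, steps=50):
    """Adjust the ratio Q/P until F2 = ratio * F1 matches F0.

    'error' stands in for adjust signal 130; each iteration moves the
    derived frequency a fraction 'gain' of the way toward F0.
    """
    history = []
    for _ in range(steps):
        f2 = ratio * f1_hz            # derived clock frequency F2
        error = f0_hz - f2            # frequency error seen by the detector
        ratio += gain * error / f1_hz  # nudge Q/P toward F0/F1
        history.append(f2)
    return ratio, history
```

Starting from the nominal ratio 147/160 with an input clock that actually runs at 44144 Hz, the frequency error in this sketch halves on every iteration when the gain is 0.5, so the derived clock settles on the input frequency and the adjustment stops.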
However, phase detector 140 is typically a high-precision detector running at a high frequency, for example, at least 1000 times that of input frequency F0, so that phase changes of less than the clock period can be detected. In addition, sample rate converter 120 needs a large memory for storing many sets of filter coefficients for the many possible ratios of Q/P.
A particular problem with achieving satisfactory sample rate conversion can occur when audio streams are synchronized to independent free-running clocks (i.e., sample rate conversion between a virtually infinite number of possible unknown frequencies). For example, the clocks for the two audio streams may be generated from two different crystal oscillators. Even if the frequencies of these two crystal oscillators are nominally the same, no two oscillators are exactly identical, and slight differences can occur between the two crystals. The frequency difference may be up to 0.1% from nominal. Thus, for an 11025 Hz sample rate, the frequency can be as high as 11025 + 11.025, or approximately 11036 Hz. When an audio signal that was synchronized to an 11036 Hz crystal oscillator is converted to an 11025 Hz rate, an audio sample may have to be deleted approximately once every one thousand samples. The deleted audio samples can cause audible clicks or pops during sample play-back.
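The arithmetic in the preceding example can be checked with the following minimal sketch. The function name is hypothetical, and the 0.1% tolerance is the figure given above.

```python
def drop_interval(nominal_hz, tolerance):
    """Return (actual_hz, samples_between_drops) for a fast source clock.

    A source running at nominal * (1 + tolerance) delivers
    (actual - nominal) surplus samples each second, so on average one
    sample must be discarded every actual / (actual - nominal) samples.
    """
    actual_hz = nominal_hz * (1 + tolerance)
    surplus_per_second = actual_hz - nominal_hz
    return actual_hz, actual_hz / surplus_per_second
```

For a nominal rate of 11025 Hz and a tolerance of 0.001, the actual rate is about 11036 Hz and roughly one sample in every 1001 is surplus, which is consistent with the figure of approximately one thousand samples.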
For audio broadcasts or streaming media applications, the clock from the broadcast or streaming source will not necessarily be synchronized to the local audio clock, which can produce instantaneous differences in frequencies, resulting in errors. Gradual accumulation of these errors can result in significant drift between the source and local clocks, which in turn can cause local buffer overflow and eventual breakdown of the broadcast or streaming environment. No elegant solutions exist in purely programmable, digital environments to compensate for this synchronization error.
Therefore, what is needed are methods and systems for universal sample rate conversion of multimedia data samples and for sample clock synchronization that are equally applicable in both programmable and weakly-programmable digital signal processing environments.