The present invention relates to end-to-end signal synchronization between an encoder and a decoder.
Moving images and associated sound may be transmitted as audio/visual (A/V) signals and presented to viewers by receivers having video displays and audio speakers. A Set-Top Box (STB) is an example of a device commonly used to receive such A/V signals. The STB, so called because this standalone signal converter is typically placed on top of the television receiver, may receive transmitted encoded A/V signals (sent by cable, broadcast or satellite) and decode these signals for display on the television. For A/V signals to be enjoyed by a consumer without degradation in image frames or sound fidelity, the audio and video bitstreams must be synchronized in accordance with the standards used in the industry.
In order for two waveforms to be synchronized, their wavelengths and their phase must be matched. The time difference between corresponding events may be described as "skew", and FIG. 1 illustrates an example. A first square-wave 10 has a first step rise event 12 corresponding to a particular time 14. The wavelength of the first square-wave provides a measure of its period 16. A second square-wave 18 has a second step rise event 20 a finite time interval later than the first step rise event 12, and this difference may be described as the skew 22.
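The notion of skew described above may be sketched in a few lines of Python. This is only an illustration: the function names, the representation of each square-wave as a list of 0/1 samples, and the `sample_period` parameter are assumptions and do not appear in the figures.

```python
def first_rising_edge(wave):
    """Return the index of the first 0 -> 1 transition, or None if there is none."""
    for i in range(1, len(wave)):
        if wave[i - 1] == 0 and wave[i] == 1:
            return i
    return None


def skew(wave_a, wave_b, sample_period):
    """Time by which the first rise of wave_b lags the first rise of wave_a.

    Both waves are assumed to be sampled at the same instants, spaced
    sample_period apart. A negative result means wave_b leads wave_a.
    """
    edge_a = first_rising_edge(wave_a)
    edge_b = first_rising_edge(wave_b)
    if edge_a is None or edge_b is None:
        raise ValueError("both waves must contain a rising edge")
    return (edge_b - edge_a) * sample_period
```

For two waves of equal period whose rising edges are one sample apart, the function returns one sample period as the skew.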
The transmission of A/V data into a single bitstream as shown in FIG. 2 requires several processes. Video data 24 are input to a video encoder 26 yielding a video elementary stream 28 that is input to a video packetizer 30, producing a video packetized elementary stream (PES) 32. The corresponding audio data 34 are input to an audio encoder 36 to produce an audio elementary stream 38 that is input to an audio packetizer 40 producing an audio PES 42. The video PES 32 and audio PES 42 are input to a stream multiplexer (MUX) 44 from which a resulting bitstream 46 is transmitted. Upon reception, the bitstream must be separated into audio and video streams and synchronized for decoding.
Synchronization of an encoder and a decoder involves sending program clock reference (PCR) time stamps (or counts) embedded by the encoder in the A/V transport bitstream and received by the decoder on the STB. The PCR time stamps provide a sample of the encoder clock count sent in the transport stream packet. The encoder clocks drive a constantly running binary counter. The value of these counters is sampled periodically and placed in the header adaptation fields as the PCR. The decoder compares the received PCR time stamps from the packet header to its corresponding time stamps from a local time counter (LTC), in order to synchronize the A/V presentation for decoding. The short-term history of the PCR increments relative to their LTC counterparts provides the relation between the local decoder's clock and the encoder's clock. The difference may represent skew or phase error. Typically, the decoder's system clock is adjusted to match that of the encoder to avoid loss of A/V bitstream data.
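The comparison of PCR increments against LTC increments may be illustrated by the following minimal sketch. It assumes that each sample is a pair of counts (PCR, LTC) expressed in the same nominal tick unit, and it ignores counter wraparound; the function name `clock_offset_ppm` and the use of only the first and last samples are illustrative choices, not part of the described system.

```python
def clock_offset_ppm(samples):
    """Estimate the encoder-vs-decoder clock offset in parts per million.

    samples is a sequence of (pcr, ltc) pairs: pcr is the encoder count
    carried in a packet, ltc is the local count latched when that packet
    arrived. A positive result means the encoder clock runs faster than
    the local clock. Counter wraparound is not handled in this sketch.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    pcr_first, ltc_first = samples[0]
    pcr_last, ltc_last = samples[-1]
    delta_pcr = pcr_last - pcr_first   # ticks elapsed at the encoder
    delta_ltc = ltc_last - ltc_first   # ticks elapsed at the decoder
    if delta_ltc <= 0:
        raise ValueError("local counter must advance between samples")
    return (delta_pcr - delta_ltc) / delta_ltc * 1_000_000
```

If the encoder count advances by 27,000,270 ticks while the local count advances by 27,000,000 ticks, the estimate is approximately 10 ppm, indicating that the encoder is the faster of the two clocks.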
The synchronization of system coding is defined by the International Organization for Standardization (ISO) in ISO-13818-1, titled Information Technology - Generic Coding of Moving Pictures and Associated Audio Information, Part 1: Systems (November 1994). ISO-13818-1 specifies a multiplexed structure for combining audio and video data along with the representation of the timing information needed to replay synchronized sequences in real-time. Compression of the A/V bitstream is standardized by the Moving Picture Experts Group (MPEG), e.g., standards such as MPEG-1, MPEG-2, MPEG-4 and MPEG-7. The target timebase frequency for the ISO-13818-1 system clock is 27 megahertz (MHz) with a variation of ±30 parts per million (ppm).
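The stated tolerance may be converted into absolute frequency limits by simple arithmetic, as the following short sketch shows. The constant and function names are illustrative only.

```python
NOMINAL_HZ = 27_000_000   # target timebase frequency stated above
TOLERANCE_PPM = 30        # permitted variation stated above


def tolerance_band(nominal_hz, ppm):
    """Return the (lowest, highest) frequency allowed by a +/- ppm tolerance."""
    delta_hz = nominal_hz * ppm / 1_000_000
    return nominal_hz - delta_hz, nominal_hz + delta_hz


low_hz, high_hz = tolerance_band(NOMINAL_HZ, TOLERANCE_PPM)
# low_hz  == 26_999_190.0
# high_hz == 27_000_810.0
```

A variation of ±30 ppm at 27 MHz thus corresponds to ±810 Hz.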
The PCR time stamp, used for the MPEG-2 transport standard, represents a small portion of a 188-byte packet as illustrated in FIG. 3 showing the packet 48 divided into a header 50 and a payload 52. The header is subdivided into several fields, in the first expanded row 54, including an adaptation field 56. Expansion of the adaptation field 56 yields a second expanded row 58, which includes an entry for optional fields 60. Expansion of the optional fields 60 yields a third expanded row 62, within which is contained the PCR 64.
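As an illustration of the nesting shown in FIG. 3, the following sketch locates the PCR within a single 188-byte packet. The byte offsets and bit widths (sync byte 0x47, adaptation field control bits, PCR flag, 33-bit base and 9-bit extension) follow the transport packet syntax of ISO-13818-1 rather than anything stated in the figure description, and the function name `extract_pcr` is hypothetical.

```python
PACKET_SIZE = 188
SYNC_BYTE = 0x47


def extract_pcr(packet):
    """Return the PCR of one transport packet in 27 MHz ticks, or None.

    None is returned when the packet carries no adaptation field or when
    the adaptation field does not contain a PCR.
    """
    if len(packet) != PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte transport packet")
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if not adaptation_field_control & 0x2:
        return None                      # header is followed by payload only
    adaptation_field_length = packet[4]
    if adaptation_field_length < 7:
        return None                      # too short for flags plus 6 PCR bytes
    flags = packet[5]
    if not flags & 0x10:
        return None                      # PCR flag not set
    b = packet[6:12]
    # 33-bit base counts 90 kHz units; 9-bit extension counts 27 MHz units.
    base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
    extension = ((b[4] & 0x01) << 8) | b[5]
    return base * 300 + extension
```

The function accepts any `bytes`-like object of length 188 and returns an integer count suitable for comparison with a local counter running at the same nominal rate.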
The common synchronization method for coder/decoder (codec) end-to-end communication uses an external voltage-controlled crystal oscillator (VCxO) component to control the STB master clock frequency, which establishes the decoder operating frequency, so that the decoder operating frequency matches the encoder operating frequency. The VCxO is an oscillator whose frequency can be tuned over a dynamic range through a voltage control input pin. The decoder can regulate its clock frequency by altering the VCxO control input voltage in response to the encoder clock PCR time stamps received by the decoder.
The codec clock comparison is illustrated in FIG. 4, in which a remotely located video encoder 66 receives an input video signal 68 and outputs an elementary stream 70 that is combined with the frequency of the encoder clock 72 for the transport stream formation device 74. The bitstream 76 includes a first packet 78 and a second packet 80 separated by n bits of data 82 transmitted over the time between packets tn. PCR time stamps labeled "X" 84 and "X+tn" 86 are embedded within the first and second packets 78 and 80 respectively. At the local STB, the bitstream 76 is received by a transport stream decoder 88, which forwards the PCR 90 and the LTC 92 from the local clock 94 to the time-difference comparison device 96. After sending the time-difference to a low pass filter 98, the time-difference is sent to a variable oscillator 100 that uses the filtered time-difference to adjust the local clock 94 so as to be synchronized with the encoder clock 72.
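The feedback arrangement of FIG. 4 may be modeled numerically as in the sketch below. It is a simplified model under stated assumptions: the low pass filter is taken to be a first-order exponential filter, the variable oscillator is taken to shift its frequency in proportion to the filtered time-difference, and the update interval `dt`, filter coefficient `alpha` and oscillator `gain` are arbitrary illustrative values. None of these parameters is given in the description.

```python
def simulate_loop(f_enc_hz, f_nominal_hz, steps, dt=0.1, alpha=0.5, gain=1.0):
    """Simulate the local clock frequency tracking the encoder clock.

    Returns the local clock frequency after each update interval.
    dt, alpha and gain are assumed values chosen so the loop is stable.
    """
    f_loc_hz = f_nominal_hz
    time_difference = 0.0   # accumulated PCR minus LTC, in ticks
    filtered = 0.0          # output of the low pass filter
    history = []
    for _ in range(steps):
        # Comparison device: the counts diverge at the frequency difference.
        time_difference += (f_enc_hz - f_loc_hz) * dt
        # Low pass filter: first-order exponential smoothing.
        filtered += alpha * (time_difference - filtered)
        # Variable oscillator: frequency offset proportional to filter output.
        f_loc_hz = f_nominal_hz + gain * filtered
        history.append(f_loc_hz)
    return history
```

With an encoder running 20 ppm above a 27 MHz nominal frequency (27,000,540 Hz), the simulated local clock settles at the encoder frequency within a few seconds of simulated time, leaving a constant residual time-difference that sustains the required frequency offset.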
Synchronizing a local clock to a remote clock in this way requires an external component with multiple-pin connections to be installed on the STB. At current prices, incorporating a voltage-controlled oscillator in an STB of only modest sophistication represents an additional expense that is not commensurate with the value of the synchronization function and its application to STB signal decoding.
A method and apparatus are provided for synchronizing an audio/visual bitstream, transmitted by an encoder and received by a decoder, by duplicating or eliminating audio samples and video pixels. The invention enables clock synchronization between the encoder and a decoder having an unregulated clock oscillator by controlling the data reader so that it skips ahead (eliminating a data element) or pauses (duplicating a data element), depending on whether the encoder clock is faster or slower than the decoder clock.
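The skipping and pausing of the data reader may be illustrated by the following minimal sketch. It assumes, purely for illustration, that one correction is applied at a fixed interval of `correction_every` data elements; the description above does not state how the correction points are chosen, and the function and parameter names are hypothetical.

```python
def read_with_correction(elements, correction_every, encoder_faster):
    """Return the elements presented by the decoder's data reader.

    At every correction_every-th input element the reader either skips
    ahead (the element is eliminated) when the encoder clock is faster,
    or pauses (the element is presented twice) when it is slower.
    The fixed correction interval is an assumption of this sketch.
    """
    if correction_every < 1:
        raise ValueError("correction_every must be at least 1")
    presented = []
    for position, element in enumerate(elements, start=1):
        if position % correction_every == 0:
            if encoder_faster:
                continue                   # skip ahead: eliminate the element
            presented.append(element)      # pause: duplicate the element
            presented.append(element)
        else:
            presented.append(element)
    return presented
```

For ten audio samples numbered 1 to 10 with a correction every five samples, a faster encoder yields eight presented samples (samples 5 and 10 are eliminated), while a slower encoder yields twelve (samples 5 and 10 are each presented twice).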