1. Field of the Invention
The present invention is directed to communications systems in general, and more particularly, to methods and apparatus for correcting time stamps embedded in data streams to be carried in an Asynchronous Transfer Mode (ATM) network. The present invention is particularly useful for adjusting Program Clock References (PCR's) in an MPEG-2 Transport Stream to account for jitter introduced when the Transport Stream is transmitted across an ATM network.
2. Description of the Prior Art
Recently, the International Organization for Standardization (ISO) adopted a standard protocol for combining one or more "elementary streams" of coded video, audio or other data into a single bit stream suitable for transmission. Referred to as the MPEG-2 (ISO 13818) standard, the standard is composed of four parts: Video, Audio, Systems and Compliance. The Systems part of the standard is described in detail in the MPEG-2 Systems Committee Draft (ISO/IEC JTC1/SC29/WG11/N0601, November, 1993) [hereinafter "MPEG-2 Systems Committee Draft"], which is hereby incorporated by reference. An overview of the MPEG-2 Systems standard is provided in Wasilewski, The MPEG-2 Systems Specification: Blueprint for Network Interoperability (Jan. 3, 1994), which is also hereby incorporated by reference. The MPEG-2 Systems standard provides a syntax and set of semantic rules for the construction of bitstreams containing a multiplexed combination of one or more "programs." A "program" is composed of one or more related elementary streams. An "elementary stream" is the coded representation of a single video, audio or other data stream that shares the common timebase of the program of which it is a member. For example, in the context of a subscription television system, a "program" may comprise a network television broadcast consisting of two elementary streams: a video stream and an audio stream.
As defined in the MPEG-2 Systems standard, an elementary stream, whether video, audio or some other type of data, contains a continuous stream of "access units." An access unit is the coded representation of a "presentation unit." For video elementary streams, the presentation unit is a picture, and a corresponding access unit for that picture includes all the coded (e.g., compressed) data for that picture. The presentation unit for audio elementary streams is defined as the set of digital audio samples in a single "audio frame." An access unit for a given audio frame will include all the coded (e.g., compressed) data for that audio frame.
According to the MPEG-2 Systems standard, each elementary stream, i.e., the sequence of access units for one video, audio or other data stream, is packetized to form a Packetized Elementary Stream (PES). Each PES packet in a given Packetized Elementary Stream consists of a PES packet header followed by a payload containing one or more access units of that elementary stream. The Packetized Elementary Stream structure provides a means for packaging subparts (i.e., one or more access units) of a longer elementary stream into consecutive packets along with associated indicators and overhead information used to synchronize the presentation of that elementary stream with other, related elementary streams (e.g., elementary streams of the same program). Each Packetized Elementary Stream is assigned a unique Packet ID (PID). For example, the Packetized Elementary Stream containing the coded video data for a network television program may be assigned a PID of "10"; the Packetized Elementary Stream containing the associated audio data for that program may be assigned a PID of "23", and so on.
Further in accordance with the MPEG-2 Systems standard, one or more Packetized Elementary Streams may be further segmented or "packetized" to facilitate combining those streams into a single bitstream for transmission over some medium. The MPEG-2 Systems Committee Draft specifies two different protocols for combining one or more Packetized Elementary Streams into a single bitstream: 1) the Program Stream (PS) protocol and 2) the Transport Stream protocol. Both stream protocols are packet-based and fall into the category of "transport layer" entities, as defined by the ISO Open System Interconnection (OSI) reference model. Program Streams utilize variable-length packets and are intended for "error-free" environments in which software parsing is desired. Program Stream packets are generally relatively large (1K to 2K bytes). Transport Streams utilize fixed-length packets and are intended for transmission in noisy or errored environments. Each Transport Stream packet comprises a header portion and a payload portion. Transport Stream packets have a relatively short length of 188 bytes and include features for enhanced error resiliency and packet loss detection. As will become evident hereinafter, the methods and apparatus of the present invention are particularly well suited for use in the transmission of an MPEG-2 Transport Stream through an ATM network, and therefore, the remaining discussion will focus on MPEG-2 Transport Streams. It is understood, however, that the methods and apparatus of the present invention are by no means limited thereto.
The MPEG-2 Transport Stream specification provides a standard format (i.e., syntax and set of semantic rules) for combining one or more Packetized Elementary Streams into a single "Transport Stream" that may then be transmitted over some medium. FIG. 1 graphically illustrates the generation of an MPEG-2 Transport Stream from a plurality of Packetized Elementary Streams. Generation of an MPEG-2 Transport Stream begins by segmenting each Packetized Elementary Stream and inserting successive segments into the payload sections of successive Transport Stream Packets. For example, as illustrated in FIG. 1, one of the PES packets 10 of the Packetized Elementary Stream containing the coded video of elementary stream "Video 1" is segmented and inserted into the payload sections of two consecutive Transport Packets 12 and 14. Every Transport Packet has a header, e.g., header 16 of Transport Packet 12, and the header of each Transport Packet contains the PID associated with the Packetized Elementary Stream carried in that Transport Packet. In the example illustrated in FIG. 1, the Packetized Elementary Stream carrying the coded video of elementary stream "Video 1" has been assigned a PID of `10`, and therefore, the header of each Transport Packet 12, 14 carrying the data of that Packetized Elementary Stream will contain a PID value of `10`. Similarly, the header of each Transport Packet 18, 20 carrying the Packetized Elementary Stream data for elementary stream "Audio 1" will contain the PID assigned to that elementary stream, which in the example shown is `23`. As each Packetized Elementary Stream is segmented and inserted into respective Transport Packets, those packets are fed to a Transport Stream multiplexer 22 that multiplexes the packets to form a single bitstream, referred to as a "Transport Stream." Thus, a Transport Stream comprises a continuous sequence of Transport Packets, each of which may carry data from one of a plurality of Packetized Elementary Streams.
At a decoder location, a given Packetized Elementary Stream can be recovered from the incoming Transport Stream by simply extracting every incoming packet whose header contains the PID assigned to that Packetized Elementary Stream.
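By way of illustration only, the PID-based extraction described above might be sketched as follows. The sketch assumes the header layout of the published ISO/IEC 13818-1 standard (a sync byte of 0x47 followed by a 13-bit PID spanning the second and third header bytes), which the Committee Draft cited herein may not match in every detail. The names packet_pid, extract_pids and make_packet are hypothetical and do not appear in the standard.

```python
TS_PACKET_SIZE = 188   # fixed Transport Packet length, in bytes
SYNC_BYTE = 0x47       # first header byte (assumed, per published ISO/IEC 13818-1)


def packet_pid(packet):
    """Return the 13-bit PID carried in a Transport Packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid Transport Packet")
    # PID = low 5 bits of byte 1 followed by all 8 bits of byte 2.
    return ((packet[1] & 0x1F) << 8) | packet[2]


def extract_pids(stream, wanted):
    """Return, in arrival order, every packet whose PID is in `wanted`."""
    selected = []
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        if packet_pid(packet) in wanted:
            selected.append(packet)
    return selected


def make_packet(pid):
    """Hypothetical helper: build a payload-only packet for demonstration."""
    # 0x10 = no scrambling, payload only, continuity counter 0.
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes([0xFF]) * (TS_PACKET_SIZE - len(header))


if __name__ == "__main__":
    # Program 1 of FIG. 1: "Video 1" on PID 10 and "Audio 1" on PID 23.
    stream = b"".join(make_packet(pid) for pid in (10, 23, 50, 10))
    program_1 = extract_pids(stream, {10, 23})
    print([packet_pid(p) for p in program_1])
```

In this sketch the stream is assumed to be already aligned on packet boundaries; a practical demultiplexer would first have to locate the sync byte.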
Further according to the MPEG-2 Systems standard, generation of Transport Packets for each Packetized Elementary Stream is carried out by an encoder employing a common system clock. Decoders for receiving and presenting a selected program (i.e., a set of related elementary streams) must have a system clock whose frequency of operation and absolute instantaneous value match those of the encoder. However, in practice, a decoder's free-running system clock frequency will not match the encoder's system clock frequency exactly, and therefore, some method for synchronizing the decoder system clock with the encoder system clock is required. In the MPEG-2 Systems standard, synchronization of a decoder's system clock with the encoder's system clock is achieved through the use of timestamps, referred to in the MPEG-2 Systems Committee Draft as Program Clock References (PCRs). A PCR is an actual sample (i.e., timestamp) of the encoder's system clock. For each program carried in a given Transport Stream, PCR's must be generated at least once every 100 ms and inserted into the Transport Packets carrying one of the elementary streams that make up that program. For programs comprised of a video elementary stream and an audio elementary stream, PCR's are typically generated and inserted into the Transport Packets that carry the Packetized Elementary Stream data for the video elementary stream. In the example of FIG. 1, one PCR 24 was generated during the creation of Transport Packet 12 and another PCR was generated during the creation of Transport Packet 14, each of which carries PES data for the video elementary stream "Video 1". Similarly, a PCR 28 was generated during the creation of Transport Packet 32, which carries PES data for the video elementary stream "Video 21". Each PCR is an actual sample of the encoder system clock at the time the PCR was generated and inserted into its respective Transport Packet.
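As a non-authoritative sketch, a PCR might be read out of a Transport Packet as shown below. The field layout is an assumption taken from the published ISO/IEC 13818-1 standard rather than from the Committee Draft cited herein: the PCR is carried in the adaptation field as a 33-bit base, 6 reserved bits and a 9-bit extension, and its value in 27 MHz ticks is base × 300 + extension. The names read_pcr and build_pcr_packet are hypothetical.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
PCR_FLAG = 0x10  # PCR_flag bit in the adaptation field flags byte (assumed layout)


def read_pcr(packet):
    """Return the PCR in 27 MHz ticks, or None if the packet carries no PCR."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid Transport Packet")
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if adaptation_field_control not in (2, 3):
        return None  # packet has no adaptation field
    # A PCR needs at least 7 adaptation bytes: 1 flags byte + 6 PCR bytes.
    if packet[4] < 7 or not (packet[5] & PCR_FLAG):
        return None
    raw = int.from_bytes(packet[6:12], "big")
    base = raw >> 15          # upper 33 bits
    extension = raw & 0x1FF   # lower 9 bits; the 6 bits in between are reserved
    return base * 300 + extension


def build_pcr_packet(pid, pcr_ticks):
    """Hypothetical helper: build a packet carrying `pcr_ticks` as its PCR."""
    base, extension = divmod(pcr_ticks, 300)
    raw = (base << 15) | (0x3F << 9) | extension
    # 0x30 = adaptation field followed by payload, continuity counter 0.
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x30])
    adaptation = bytes([7, PCR_FLAG]) + raw.to_bytes(6, "big")
    payload = bytes(TS_PACKET_SIZE - len(header) - len(adaptation))
    return header + adaptation + payload


if __name__ == "__main__":
    packet = build_pcr_packet(10, 300150)
    print(read_pcr(packet))
```

The sketch ignores the other optional adaptation field contents, which follow the PCR in the assumed layout and therefore do not affect its position.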
As can be appreciated, as the Transport Packets for each elementary stream reach the Transport Stream multiplexer 22, certain packets will experience some delay since the multiplexer can only "send" one packet at a time. When a PCR-bearing Transport Packet is delayed, the original PCR in that packet is no longer valid. Consequently, the Transport Stream multiplexer 22 must "adjust" the original PCR to account for any delay imposed on that packet by the multiplexer. Essentially, the multiplexer simply determines the amount of delay the packet experienced between input and output, and then adds that delay time to the original PCR value as the packet leaves the multiplexer in the outgoing Transport Stream. As a result of this adjustment, the PCR's of a given program, no matter where they may appear in an incoming Transport Stream, should reflect the absolute value of the encoder's system clock at the time the packets bearing those PCR's were inserted into the outgoing Transport Stream at the encoder.
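The adjustment described above amounts to adding the multiplexer's queuing delay to the original PCR. A minimal sketch follows, assuming that arrival times, departure times and PCR values are all expressed in ticks of the same system clock and that the multiplexer sends one packet per fixed time slot. The function names, the slot model and the wrap-around modulus (taken from the PCR format of the published standard) are assumptions made for illustration only.

```python
PCR_MODULUS = (1 << 33) * 300  # assumed wrap-around point of the PCR count


def restamp_pcr(original_pcr, arrival_tick, departure_tick):
    """Add the delay a packet spent in the multiplexer to its PCR."""
    delay = departure_tick - arrival_tick
    if delay < 0:
        raise ValueError("packet cannot leave before it arrives")
    return (original_pcr + delay) % PCR_MODULUS


def multiplex(packets, slot_ticks):
    """Serialize packets one per slot and adjust any PCR they carry.

    `packets` is a list of (arrival_tick, pcr_or_None) tuples.
    Returns a list of (departure_tick, adjusted_pcr_or_None) tuples.
    """
    output = []
    next_free = 0  # earliest tick at which the output is idle
    for arrival, pcr in sorted(packets, key=lambda p: p[0]):
        departure = max(arrival, next_free)
        next_free = departure + slot_ticks
        adjusted = None if pcr is None else restamp_pcr(pcr, arrival, departure)
        output.append((departure, adjusted))
    return output


if __name__ == "__main__":
    # Three packets arrive together at tick 0; two of them carry a PCR of 0.
    for departure, pcr in multiplex([(0, 0), (0, None), (0, 0)], slot_ticks=100):
        print(departure, pcr)
```

In this example the third packet waits two slots, so its PCR is advanced by 200 ticks and again equals the clock value at the moment the packet leaves the multiplexer.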
At a reception site, a decoder can use the transmitted PCR's to "slave" its system clock to the encoder's system clock. Decoders allow recipients of a Transport Stream to select one of the "programs" carried in the Transport Stream for output or presentation at a reception site. For example, in a subscription television system, wherein each program may represent a different television broadcast, a subscriber may employ a decoder to select one of those programs for viewing on a television set. A television program will typically comprise a video elementary stream and an audio elementary stream.
FIG. 2 is a block diagram of an exemplary decoder 40 that includes a clock generation circuit 58 capable of slaving the decoder system clock to the encoder's system clock. As shown, an MPEG-2 Transport Stream is received by the decoder 40 and provided to a Transport Stream de-multiplexer/parsing unit 42. A user's program selection is provided to the demultiplexer 42 via line 44. As described in greater detail in the MPEG-2 Systems Working Draft, information carried in certain system related Transport Packets enables the demultiplexer 42 to determine the PIDs of each elementary stream (i.e., video and audio) of the selected program. Once these PIDs are known, the demultiplexer 42 simply extracts every Transport Packet from the incoming Transport Stream whose header contains one of those PIDs. For example, referring back to FIG. 1, a subscriber may select Program 1 which consists of elementary streams "Video 1" and "Audio 1." Transport Packets carrying the Packetized Elementary Stream data for "Video 1" each have a PID of `10`, and the Transport Packets carrying the Packetized Elementary Stream data for "Audio 1" each have a PID of `23`. As successive packets of the Transport Stream are received, the demultiplexer 42 will extract every incoming Transport Packet having a PID of `10` or `23`. Extracted Transport Packets will then be parsed in order to reassemble the original Packetized Elementary Streams. Ultimately, the coded video and audio data of each Packetized Elementary Stream will be provided to respective buffers 48, 54, and then to respective decoders 50, 56 which decode the data to produce analog video and audio signals for output to a display device.
In addition, as each Transport Packet of the selected program is received, the demultiplexer 42 determines whether that Transport Packet contains a PCR. If so, the PCR is extracted from the incoming packet and provided to the clock generation circuit 58 via line 59. As explained above, it is highly unlikely that the frequency of a decoder's system clock will be exactly the same as that of the original encoder, or that the decoder's system clock will be perfectly stable (i.e., will not drift). Accordingly, the PCR values, which are sent periodically in the Transport Stream, are used to correct the decoder's system clock as needed. Correction of the system clock is performed by the clock generation circuit 58.
As illustrated in FIG. 2, the clock generation circuit 58 implements a straightforward phase-locked loop (PLL), except that the reference and feedback terms are numbers (e.g., the values of counter 66 and received PCRs). Upon initial acquisition of a user-selected program, the counter 66 is loaded via line 61 with the first PCR received for that program. Thereafter, the PLL essentially operates as a closed loop. A voltage controlled oscillator (VCO) 64 having a nominal frequency substantially equal to that of the encoder system clock provides the decoder system clock signal. As the decoder system clock runs, the clock signal increments counter 66, which therefore represents the absolute time of the decoder system clock. As shown, the value of counter 66 is continuously fed back to a subtractor unit 60. Subtractor 60 compares the counter value with subsequent PCRs as they arrive in the Transport Stream Packets. Since a PCR, when it arrives, represents the correct timebase for the selected program, the difference between it and the value of counter 66 may be used to drive the instantaneous frequency of the VCO 64 to either slow down or speed up the decoder clock signal, as appropriate. A low-pass filter and gain stage (LPF) 62 is applied to the difference values from the subtractor 60 to produce a smooth control signal for the VCO 64. As can be appreciated, the continuous feedback provided by counter 66 and the periodic arrival of PCR values in the Transport Stream ensure that the decoder system clock remains slaved to the encoder system clock. (Note: although the transmitted PCR's establish a timebase for a given program, synchronization of the audio and video elementary streams to the timebase of the program is accomplished using "presentation time stamps" which are carried in the PES packet headers of the respective Packetized Elementary Stream.)
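The closed-loop behavior of the clock generation circuit 58 can be illustrated with a simple discrete-time simulation. The sketch below is a numerical model under stated assumptions, not a description of any particular circuit: the encoder clock is taken to run at 27 MHz, PCRs are assumed to arrive exactly every 40 ms (i.e., with constant transmission delay), the LPF 62 is modeled as a first-order smoothing filter, and the filter coefficient and gain are arbitrary values chosen only so that the loop settles.

```python
ENCODER_HZ = 27_000_000         # assumed encoder system clock frequency
PCR_INTERVAL_S = 0.04           # assumed spacing between PCR arrivals
PCR_INTERVAL_TICKS = 1_080_000  # ENCODER_HZ * PCR_INTERVAL_S


def recover_clock(pcrs, free_run_hz, alpha=0.1, gain_hz_per_tick=5.0):
    """Model the PLL of FIG. 2; return the VCO frequency after each PCR."""
    counter = float(pcrs[0])  # counter 66 is loaded with the first PCR
    filtered = 0.0            # state of the low-pass filter (LPF 62)
    vco_hz = free_run_hz      # VCO 64 starts at its free-running frequency
    history = []
    for pcr in pcrs[1:]:
        # The decoder clock increments the counter until the next PCR arrives.
        counter += vco_hz * PCR_INTERVAL_S
        # Subtractor 60: received PCR minus local counter value.
        error = pcr - counter
        # LPF and gain stage 62 smooth the error into a control signal.
        filtered += alpha * (error - filtered)
        vco_hz = free_run_hz + gain_hz_per_tick * filtered
        history.append(vco_hz)
    return history


if __name__ == "__main__":
    pcrs = [k * PCR_INTERVAL_TICKS for k in range(501)]  # 20 s of PCRs
    free_run = ENCODER_HZ * (1 - 30e-6)                  # decoder 30 ppm slow
    history = recover_clock(pcrs, free_run)
    print("initial offset (Hz):", ENCODER_HZ - free_run)
    print("final offset (Hz):  ", ENCODER_HZ - history[-1])
```

With these assumed parameters the VCO frequency converges on the encoder frequency, while a constant residual difference remains between the counter and the PCR values; that residual is what holds the VCO away from its free-running frequency in this simple proportional loop.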
Use of PCR's in the manner described above will accurately synchronize a decoder's system clock to an encoder's system clock so long as any delay in transmission of the MPEG-2 Transport Stream from the encoder to the decoder is constant for every Transport Packet in that stream. Unfortunately, in some transmission media, variable packet delays may be imposed on individual packets of the Transport Stream. For example, it is generally recognized that in the future, there will be a need to transmit MPEG-2 Transport Streams through Asynchronous Transfer Mode (ATM) networks. One problem likely to be encountered during transmission of an MPEG-2 Transport Stream through an ATM network is that certain Transport Packets are likely to experience variable delays (i.e., "jitter") as they pass through the network. For example, variable delays are likely to result from queuing delays at ATM switching nodes in the network. Such delays will undoubtedly change the order and relative temporal location of Transport Packets travelling through an ATM network, and therefore, will also change the relative order and temporal location of PCR's carried in those packets. Any PCR's of a given program that are delayed more or less than average will no longer be valid, since their values will no longer accurately reflect the value of the encoder system clock when they ultimately reach a decoder. For example, if one PCR experiences a delay greater than the average delay experienced by other PCR's, that PCR will arrive later than its value would indicate. If the delay is large enough, the clock generation circuit and/or buffers in the decoder may not be able to recover from the discrepancy between the expected and received PCR values.
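The size of the discrepancy can be illustrated with simple arithmetic. The sketch below assumes a 27 MHz system clock and expresses, in clock ticks, how much later (positive) or earlier (negative) each PCR arrives than its value would indicate, relative to the average network delay. The delay figures and the function name are hypothetical and chosen only for illustration.

```python
CLOCK_HZ = 27_000_000  # assumed system clock frequency


def pcr_arrival_errors(delays_s, clock_hz=CLOCK_HZ):
    """Return each PCR's timing error, in ticks, relative to the mean delay."""
    mean_delay = sum(delays_s) / len(delays_s)
    return [round((delay - mean_delay) * clock_hz) for delay in delays_s]


if __name__ == "__main__":
    # Three PCRs see a 5 ms network delay; a fourth is queued for 9 ms.
    print(pcr_arrival_errors([0.005, 0.005, 0.005, 0.009]))
    # With constant delay, every PCR remains valid.
    print(pcr_arrival_errors([0.005, 0.005, 0.005, 0.005]))
```

In the first case the late PCR arrives 3 ms, or 81,000 ticks, behind the average, which the subtractor of FIG. 2 would see as a sudden step in the difference between the counter and the received PCR.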
Accordingly, there is a need for methods and apparatus for adjusting the timestamps in a datastream, such as the PCR's in an MPEG-2 Transport Stream, to account for delays experienced while the datastream propagates through the switching nodes of an ATM network. The present invention satisfies this need.