1. Field of the Invention
This invention generally relates to video signal processing and, more particularly, to a system and method for forming transport packets with adjustable time stamps, which in turn permits a greater tolerance in the decoding and presentation timing.
2. Description of the Related Art
FIG. 1 is a diagram depicting the MPEG-2 packetized elementary stream (PES) packet format (prior art). As noted by Anderson et al. in U.S. Pat. No. 6,356,567, all of the fields after the PES packet length are optional. The PES packet has a PES header, an optional header, and a payload. The PES header has a packet start code, a packet length field, a 2-bit “10” field, a scramble control field, a priority field, a data alignment field, a copy field, a PTS/DTS (Presentation Time Stamp/Decoding Time Stamp) field, a field for other flags, and a header length field.
The “Optional Header” field includes a Presentation Time Stamp field, a Decoding Time Stamp field, an elementary stream clock reference field, an elementary stream rate field, a trick mode field, a copy info field, a Prior Packetized Elementary Stream Clock Recovery field, an extension, and stuffing.
The packet start code provides packet synchronization. The stream ID field provides both packet identification and payload identification. The PTS/DTS flag fields and the PTS/DTS fields provide presentation synchronization. Data transfer is provided through the packet/header length, payload, and stuffing fields. The scramble control field facilitates payload descrambling, and the extension/private flag fields and the private data fields provide private information transfer.
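By way of illustration only, the fixed portion of the PES header described above can be unpacked as in the following sketch. The function name parse_pes_header and the dictionary keys are hypothetical, and the byte offsets assume the ISO/IEC 13818-1 layout for stream types that carry the optional header.

```python
def parse_pes_header(data):
    """Unpack the fixed nine bytes of a PES header (illustrative sketch)."""
    if len(data) < 9:
        raise ValueError("need at least 9 bytes")
    # Bytes 0-2: packet start code prefix, used for packet synchronization.
    if data[0:3] != b"\x00\x00\x01":
        raise ValueError("missing packet start code prefix")
    flags1 = data[6]
    flags2 = data[7]
    # The two most significant bits of byte 6 are the fixed "10" field.
    if flags1 >> 6 != 0b10:
        raise ValueError("missing '10' marker bits")
    return {
        "stream_id": data[3],                          # packet/payload identification
        "packet_length": int.from_bytes(data[4:6], "big"),
        "scrambling_control": (flags1 >> 4) & 0x3,
        "priority": (flags1 >> 3) & 0x1,
        "data_alignment": (flags1 >> 2) & 0x1,
        "copyright": (flags1 >> 1) & 0x1,
        "original_or_copy": flags1 & 0x1,
        "pts_dts_flags": flags2 >> 6,                  # 2 = PTS only, 3 = PTS and DTS
        "other_flags": flags2 & 0x3F,
        "header_data_length": data[8],                 # length of the optional header
    }
```

In this sketch, a pts_dts_flags value of 2 indicates that only a PTS follows in the optional header, and a value of 3 indicates that both a PTS and a DTS follow.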
A transport stream (TS) may contain one or more independent, individual programs, such as individual television channels or television programs, where each individual program can have its own time base, and each stream making up an individual program has its own PID. Each separate individual program has one or more elementary streams (ES) generally having a common time base. Different transport streams can be combined into a single system transport stream. Elementary stream (ES) data, that is, access units (AU), are first encapsulated into packetized elementary stream (PES) packets, which are, in turn, inserted into transport stream (TS) packets, as shown in FIG. 2.
The architecture of the transport stream (TS) packets under the MPEG-2 specifications is such that the following operations are enabled: (1) demultiplexing and retrieving elementary stream (ES) data from one program within the transport stream, (2) remultiplexing the transport stream with one or more programs into a transport stream (TS) with a single program, (3) extracting transport stream (TS) packets from different transport streams to produce another transport stream (TS) as output, (4) demultiplexing a transport stream (TS) packet into one program and converting it into a program stream (PS) containing the same program, and (5) converting a program stream (PS) into a transport stream (TS) to carry it over a lossy medium to thereafter recover a valid program stream (PS).
At the transport layer, the transport sync byte provides packet synchronization. The Packet Identification (PID) field data provides packet identification, demultiplexing, and sequence integrity data. The PID field is used to collect the packets of a stream and reconstruct the stream. The continuity counters and error indicators provide packet sequence integrity and error detection. The Payload Unit start indicator and Adaptation Control are used for payload synchronization, while the Discontinuity Indicator and Program Clock Reference (PCR) fields are used for playback synchronization. The transport scramble control field facilitates payload descrambling. Private data transfer is accomplished through the Private Data Flag and Private Data Bytes. The Data Bytes are used for private payload data transfer, and the Stuffing Bytes are used to round out a packet.
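As a non-limiting sketch of the transport layer fields just described, the four-byte transport packet header can be decoded as follows. The function name parse_ts_header is hypothetical, and the code assumes the 188-byte packet size and 0x47 sync byte of ISO/IEC 13818-1.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47

def parse_ts_header(packet):
    """Decode the 4-byte header of one transport stream packet (sketch)."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("transport packets are 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("lost packet synchronization")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit PID: low 5 bits of byte 1 followed by all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "scrambling_control": packet[3] >> 6,
        "adaptation_field_control": (packet[3] >> 4) & 0x3,
        # 4-bit counter used to check packet sequence integrity per PID.
        "continuity_counter": packet[3] & 0x0F,
    }
```

A receiver built along these lines would typically group packets by the pid value and compare successive continuity_counter values for each PID to detect lost packets.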
A transport stream is a collection of transport stream packets, linked by standard tables. These tables carry Program Specific Information (PSI) and are built when a transport stream is created at the multiplexer. These tables completely define the content of the stream. Two of the tables of the transport stream are the Program Association Table (PAT) and the Program Map Table (PMT).
The Program Association Table is a table of contents of the transport stream. It contains an ID that uniquely identifies the stream, a version number to allow dynamic changes of the table and the transport stream, and an association table of pairs of values. Each pair of values (PN, PMT-PID) consists of the Program Number (PN) and the PID of the table describing that program.
The Program Map Table is a complete description of all of the streams contained in a program. Each entry in the Program Map Table is related to one and only one program. The role of the Program Map Table is to provide a mapping between PID streams and programs. The program map table contains a program number that identifies the program within the transport stream, a descriptor that can be used to carry private information about the program, the PID of the packets that contain the synchronization information (PCRs), a number of pairs of values (ST, Data-PID) which, for each stream, specify the stream type (ST), and the PID of the packets containing the data of that stream or program (Data-PID). There is also a Network Information Table used to provide a mapping between the transport streams and the network, and a Conditional Access Table that is used to specify scrambling/descrambling control and access.
In use, the tables are used to select and reconstruct a particular program. At any point in time, each program component has a unique PID in the Program Map Table. The Program Map Table provides the PIDs for the selected program's audio, video, and control streams. The streams with the selected PIDs are extracted and delivered to the appropriate buffers and decoders for reconstruction and decoding.
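The selection process described above can be sketched, purely for illustration, with the tables already parsed into ordinary dictionaries. The names pids_for_program and select_packets, the dictionary layout, and the (pid, payload) tuples standing in for transport packets are all assumptions made for this example.

```python
def pids_for_program(pat, pmts, program_number):
    """Return the set of PIDs needed to reconstruct one program (sketch).

    pat  maps Program Number (PN) -> PMT-PID.
    pmts maps PMT-PID -> {"pcr_pid": int, "streams": [(ST, Data-PID), ...]}.
    """
    pmt = pmts[pat[program_number]]
    wanted = {data_pid for _stream_type, data_pid in pmt["streams"]}
    wanted.add(pmt["pcr_pid"])  # packets carrying the PCRs for this program
    return wanted

def select_packets(packets, wanted_pids):
    """Keep only the (pid, payload) packets that belong to the selection."""
    return [p for p in packets if p[0] in wanted_pids]
```

For example, with a Program Association Table of {1: 0x0100, 2: 0x0200}, selecting program 1 consults only the Program Map Table carried on PID 0x0100, and packets on every other PID are discarded before reaching the buffers and decoders.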
Achieving and maintaining clock recovery and synchronization is a problem, especially with audio and video bitstreams. The MPEG-2 model assumes an end-to-end constant delay timing model in which all digital image and audio data take exactly the same amount of time to pass through the system from encoder to decoder. The system layer contains timing information that requires constant delay. The clock reference is the Program Clock Reference (PCR), and the time stamps are the Presentation Time Stamp and the Decoding Time Stamp (PTS/DTS).
The decoder employs a local system clock having approximately the same 27 Megahertz frequency as the encoder. However, the decoder clock cannot be allowed to free run. This is because it is highly unlikely that the frequency of the decoder clock would be exactly the same as the frequency of the encoder clock. Synchronization of the two clocks is accomplished by the Program Clock Reference (PCR) data field in the packet adaptation field of the PCR PID for the program. The Program Clock Reference values can be used to correct the decoder clock. The Program Clock Reference, or PCR, is a 42-bit field. It is coded in two parts, a PCR base having a 33-bit value in units of 90 kHz, and a PCR extension having a 9-bit value in units of 27 MHz, where 27 MHz is the system clock frequency.
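The two-part coding of the PCR can be illustrated with the following sketch, in which the complete value in 27 MHz ticks is the base multiplied by 300 plus the extension. The function names are hypothetical, and the six-byte layout (33-bit base, six reserved bits set to one, 9-bit extension) is assumed from ISO/IEC 13818-1.

```python
SYSTEM_CLOCK_HZ = 27_000_000

def decode_pcr(field):
    """Split a 6-byte PCR field into its 33-bit base and 9-bit extension."""
    if len(field) != 6:
        raise ValueError("PCR field is 6 bytes long")
    value = int.from_bytes(field, "big")   # 48 bits in total
    base = value >> 15                     # upper 33 bits, 90 kHz units
    extension = value & 0x1FF              # lower 9 bits, 27 MHz units
    return base, extension

def encode_pcr(base, extension):
    """Pack a base and extension, setting the 6 reserved bits to one."""
    value = (base << 15) | (0x3F << 9) | extension
    return value.to_bytes(6, "big")

def pcr_ticks(base, extension):
    """Full PCR value in 27 MHz ticks (one base unit equals 300 ticks)."""
    return base * 300 + extension

def pcr_seconds(base, extension):
    """PCR value expressed in seconds of the 27 MHz system clock."""
    return pcr_ticks(base, extension) / SYSTEM_CLOCK_HZ
```

Under these assumptions, a base of 90000 with an extension of 0 corresponds to exactly one second of system clock time, and the extension takes values from 0 to 299 before the base increments.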
As a general rule, the first 42 bits of the first PCR received by the decoder initialize the counter in a clock generator, and subsequent PCR values are compared to clock values for fine adjustment. The difference between the PCR and the local clock can be used to drive a voltage controlled oscillator, or a similar device or function, for example, to speed up or slow down the local clock.
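The adjustment described above can be modeled, as a simplified sketch only, by a first-order control loop in which the oscillator frequency is offset from nominal in proportion to the difference between the received PCR and the local counter. The class name RecoveredClock, the gain of 5 Hz per tick, and the 40 millisecond PCR interval are assumptions chosen for the example and are not taken from any particular decoder. Tick values are kept as floating point numbers for simplicity, whereas real PCR values are integers.

```python
NOMINAL_HZ = 27_000_000.0

class RecoveredClock:
    """Toy model of a decoder clock steered by PCR values (sketch)."""

    def __init__(self, gain_hz_per_tick=5.0):
        self.gain = gain_hz_per_tick
        self.freq_hz = NOMINAL_HZ
        self.stc = None                      # local counter, in 27 MHz ticks

    def advance(self, seconds):
        """Let the local oscillator run for the given time."""
        if self.stc is not None:
            self.stc += self.freq_hz * seconds

    def on_pcr(self, pcr_ticks):
        """Load the counter from the first PCR, then steer the oscillator."""
        if self.stc is None:
            self.stc = float(pcr_ticks)      # first PCR initializes the counter
            return 0.0
        error = pcr_ticks - self.stc         # positive: local clock is slow
        self.freq_hz = NOMINAL_HZ + self.gain * error
        return error

def simulate(encoder_hz, interval_s=0.04, steps=200):
    """Feed PCRs from an encoder running at encoder_hz to a RecoveredClock."""
    clock = RecoveredClock()
    encoder_ticks = 0.0
    for _ in range(steps):
        clock.on_pcr(encoder_ticks)
        encoder_ticks += encoder_hz * interval_s
        clock.advance(interval_s)
    return clock
```

With an encoder running 30 parts per million fast (27,000,810 Hz), the modeled oscillator settles at the encoder frequency while retaining a small constant counter offset, which is characteristic of a purely proportional loop. Practical designs often add filtering or an integral term.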
Audio and video synchronization is typically accomplished through the Presentation Time Stamp (PTS) inserted in the Packetized Elementary Stream (PES) header. The Presentation Time Stamp is a 33-bit value in units of 90 kHz, where 90 kHz is the 27 MHz system clock divided by 300. The PTS value indicates the time that the presentation unit should be presented to the user.
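For illustration, the 33-bit time stamp is carried in five bytes of the PES optional header, interleaved with marker bits. The following sketch packs and unpacks such a field; the function names are hypothetical, and the bit layout (4-bit prefix, 3 bits, marker, 15 bits, marker, 15 bits, marker) is assumed from ISO/IEC 13818-1.

```python
TIMESTAMP_HZ = 90_000   # 27 MHz system clock divided by 300

def encode_timestamp(value, prefix=0b0010):
    """Pack a 33-bit PTS or DTS into 5 bytes with marker bits set to one."""
    return bytes([
        (prefix << 4) | (((value >> 30) & 0x07) << 1) | 1,  # bits 32..30
        (value >> 22) & 0xFF,                               # bits 29..22
        (((value >> 15) & 0x7F) << 1) | 1,                  # bits 21..15
        (value >> 7) & 0xFF,                                # bits 14..7
        ((value & 0x7F) << 1) | 1,                          # bits 6..0
    ])

def decode_timestamp(field):
    """Recover the 33-bit value from a 5-byte PTS or DTS field."""
    if len(field) != 5:
        raise ValueError("time stamp field is 5 bytes long")
    return ((((field[0] >> 1) & 0x07) << 30)
            | (field[1] << 22)
            | ((field[2] >> 1) << 15)
            | (field[3] << 7)
            | (field[4] >> 1))

def timestamp_seconds(value):
    """Convert a 90 kHz time stamp to seconds."""
    return value / TIMESTAMP_HZ
```

Because the field is 33 bits wide at 90 kHz, the time stamp wraps around after roughly 26.5 hours, so comparisons between time stamps in a long-running stream need to account for wraparound.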
The system layer timing information, PCR and PTS/DTS, keeps the encoder and decoder in synchronization, with the PCR values correcting the decoder clock. The timing information arrives at the decoder about every 10-100 milliseconds for the PCR, and at least as frequently as about every 700 milliseconds for the PTS/DTS. Processing and filtering the timing signals consumes significant processor resources. This is because the clock signals are in mixed number bases, the clock signals can arrive at widely varying times, there is no way to sort out necessary interrupts from unnecessary interrupts, and, most important of all, errors in clock management are directly visible and/or audible through buffer overflow or underflow and color disturbance. However, as noted above, the relationship between the PCR and the System Time Clock (STC) values is used to drive a voltage controlled oscillator or similar device. The voltage controlled oscillator or similar device speeds up or slows down the local clock driving the STC.
FIG. 2 depicts the program multiplexing process in the formation of a transport stream (prior art). MPEG-2 Systems (ISO/IEC 13818-1) has become the standard for digital TV broadcast. In a digital broadcast environment each allocated frequency slot (e.g., 6 MHz) carries one MPEG-2 transport stream, and there are multiple programs (virtual channels) carried within each transport stream. In order for the transport stream to be able to carry multiple programs, the bit rate of the transport stream must be higher than the combined bit rates of these programs. Null packets are inserted into the transport stream to maintain a constant bit rate, if the combined bit rate of the programs becomes lower than the transport stream constant bit rate.
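The null packet insertion described above can be sketched as a slot-by-slot multiplexer that emits a null packet whenever no program has data waiting. The round-robin service order, the function name multiplex, and the use of (pid, payload) tuples in place of real 188-byte packets are assumptions made for this example; the null packet PID of 0x1FFF is that of ISO/IEC 13818-1.

```python
from collections import deque

NULL_PID = 0x1FFF

def multiplex(queues, total_slots):
    """Fill a fixed number of packet slots from per-PID queues (sketch).

    queues maps PID -> deque of pending payloads. Queues are served in
    round-robin order, and an empty slot is filled with a null packet so
    that the output rate stays constant.
    """
    output = []
    pids = list(queues)
    next_index = 0
    for _ in range(total_slots):
        for offset in range(len(pids)):
            pid = pids[(next_index + offset) % len(pids)]
            if queues[pid]:
                output.append((pid, queues[pid].popleft()))
                next_index = (next_index + offset + 1) % len(pids)
                break
        else:
            # No queue had data for this slot: insert a null packet.
            output.append((NULL_PID, None))
    return output
```

In this model, the number of null packets in the output is a direct measure of the free space margin between the transport stream bit rate and the combined bit rate of the programs.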
When individual programs are encoded, each video and audio frame receives a Decoding Time Stamp (DTS) and Presentation Time Stamp (PTS). The PTS is always later than or equal to the DTS. Because the bit stream is shared (multiplexed), the packets may not always be inserted into the transport stream at the desired time instant. A certain amount of delay for each frame is inevitable.
When the bit rate of the transport stream is much higher than the combined bit rates of all the programs carried in it, it is easy for the multiplexer to properly arrange the appearance of packets from each program. In this case, there are many null packets inserted into the transport stream to take up the unoccupied time slots. However, when the transport stream gets crowded, that is, when the free space margin shrinks, the packet delay becomes more severe. Although the programs are encoded with a nominal bit rate constraint, in some instances an individual program may incur a sudden spike in the bit rate. When the bit rate spikes, the delay of packets becomes even more severe. Depending on the decoder's design, the delayed arrival of frame data in a delayed packet may cause a noticeable (to a viewer) visual defect. In the case where an I-frame has encountered a long delay, the decoder may drop that frame due to expiration. Then, the following P-frames cannot be successfully decoded until the next I-frame arrives, even if they are delivered on time.
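The effect of a bit rate spike on packet delay can be illustrated with a simple backlog model. This is a sketch under stated assumptions: time is divided into equal intervals, the transport stream carries a fixed number of packets per interval, packets are served first-in first-out, and the function name queueing_delays is hypothetical.

```python
def queueing_delays(arrivals, capacity):
    """Estimate multiplexing delay per interval under a FIFO model (sketch).

    arrivals[t] is the number of packets offered during interval t and
    capacity is the number of packets the transport stream can carry per
    interval. The result lists, for each interval, how many additional
    intervals are needed to send everything queued so far.
    """
    backlog = 0
    delays = []
    for offered in arrivals:
        backlog += offered
        # Ceiling of backlog / capacity, minus the current interval.
        delays.append((backlog - 1) // capacity if backlog else 0)
        backlog = max(0, backlog - capacity)
    return delays
```

For example, with a capacity of 4 packets per interval and offered loads of 3, 3, 9, 3 and 3 packets, the modeled delays are 0, 0, 2, 1 and 1 intervals. The single spike delays not only its own packets but also those of the following intervals, even though the load has returned to its nominal level.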
It would be advantageous if an MPEG-2 transport stream could be made more tolerant of timing delays.
It would be advantageous if the DTS and PTS time stamps could be adjusted in response to the time that associated frames are multiplexed into the transport stream.