As is well known in the art, MPEG (i.e., MPEG-1, MPEG-2, MPEG-4, H.264 (MPEG-4 AVC)) compressed video and audio streams are mapped by an encoder into MPEG-2 transport streams as elementary streams (ES) packed into packetized elementary stream (PES) packets, which, in turn, are packed into MPEG-2 transport stream (TS) packets. Each PES packet has a PES header that contains, among other things, a presentation time stamp (PTS) and, optionally, a decoding time stamp (DTS); if the DTS is not present, it is considered equal to the PTS. The DTS tells the decoder when to decode a video/audio frame, while the PTS tells the decoder when to display (i.e., present) the video/audio frame. Both the DTS and PTS values are time events expressed relative to a time reference that is also transmitted in the MPEG-2 transport stream. This time reference is called the system time clock (STC) and is coded in the TS program clock reference (PCR) field as samples of a 27 MHz counter (clock).
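By way of illustration only, the following Python sketch shows one way a PCR sample might be read from a single 188-byte TS packet and how PCR and PTS/DTS values might be converted to seconds. The function names are hypothetical. The field layout (a 33-bit base counted at 90 kHz, six reserved bits, and a 9-bit extension counted at 27 MHz) and the 90 kHz unit of the PTS and DTS are taken from ISO/IEC 13818-1 rather than from the description above.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def extract_pcr(packet: bytes):
    """Return the PCR in 27 MHz ticks, or None if the packet carries no PCR."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte TS packet")
    # adaptation_field_control: 2 = adaptation field only, 3 = adaptation + payload
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if adaptation_field_control not in (2, 3):
        return None
    adaptation_field_length = packet[4]
    if adaptation_field_length < 7:   # flags byte plus 6-byte PCR field
        return None
    if not packet[5] & 0x10:          # PCR_flag
        return None
    b = packet[6:12]
    # 33-bit base, 6 reserved bits, 9-bit extension
    base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
    ext = ((b[4] & 0x01) << 8) | b[5]
    return base * 300 + ext


def pcr_to_seconds(pcr_ticks: int) -> float:
    """Convert a 27 MHz PCR value to seconds."""
    return pcr_ticks / 27_000_000


def pts_to_seconds(pts_ticks: int) -> float:
    """Convert a 90 kHz PTS or DTS value to seconds."""
    return pts_ticks / 90_000
```

Under these assumptions, the DTS-PCR value referred to later in this section could be computed as `pts_to_seconds(dts) - pcr_to_seconds(pcr)` for a DTS and a PCR taken from the same program.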
All MPEG compression standards define a standard for the decoder. The decoder decodes at a constant bit rate, while the encoder encapsulates variable bit rate information. There is no standard for an encoder other than that it must generate a decoder-compliant bitstream. This allows various vendors to optimize their encoders in unique ways. The encoder uses a video buffering verifier (VBV) model to ensure that the variable rate information can be decoded properly (when buffer level feedback is not available from the decoder). The VBV model is designed to mimic the operation of a decoder and is well known in the art.
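The buffer model described above can be sketched, in greatly simplified form, as a leaky bucket. The Python sketch below is not the normative VBV; it assumes that bits arrive at a constant rate, that the decoder removes one whole frame instantaneously every frame period, and that the function name and all parameter values are hypothetical.

```python
def simulate_vbv(frame_bits, bit_rate, frame_rate, buffer_bits, initial_fullness):
    """Return a list of (frame_index, event) pairs, where event is
    'underflow' or 'overflow', for a simplified decoder buffer."""
    events = []
    fullness = float(initial_fullness)
    bits_per_period = bit_rate / frame_rate  # bits delivered per frame period
    for i, size in enumerate(frame_bits):
        if size > fullness:
            # The frame has not fully arrived by its decode time.
            events.append((i, "underflow"))
            fullness = 0.0
        else:
            fullness -= size                 # decoder removes the frame
        fullness += bits_per_period          # channel keeps delivering bits
        if fullness > buffer_bits:
            # More bits delivered than the buffer can hold.
            events.append((i, "overflow"))
            fullness = float(buffer_bits)
    return events
```

For example, with a hypothetical 4 Mbit/s channel at 25 frames per second, a 1,800,000-bit buffer, and an initial fullness of 900,000 bits, a run of 160,000-bit frames leaves the fullness unchanged, a run of 400,000-bit frames eventually underflows, and a run of 40,000-bit frames eventually overflows.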
MPEG compression standards also define a standard for the STCs used to generate the PCRs. The STC must clock at 27 MHz ± 810 Hz, and its frequency may change by no more than 75 × 10⁻³ Hz per second, i.e., 1 Hz every 13⅓ seconds (ISO/IEC 13818-1, 2.4.2.1). Different encoders will therefore operate at slightly different rates. These differences (different timestamps) are carried over when video/audio streams are generated by different encoders and, optionally, recorded for later playback (e.g., advertisements). The decoder must be able to synchronize its clock with the encoder clock used to encode the TS in order to present properly timed video and to “lip sync” the audio. Decoders usually have a phase-locked loop (PLL) that incorporates a drift adjustment used to match the encoder clock frequency. The decoder uses the incoming PCRs to synchronize the PLL to the encoder clock.
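To give a sense of scale for the clock tolerance stated above, the following Python sketch computes the tolerance in parts per million, the worst-case divergence between two compliant encoder clocks, and the raw frequency estimate obtainable from two PCR samples. The function names are hypothetical, and the last function is only the measurement that a decoder PLL might filter, not a PLL itself.

```python
NOMINAL_HZ = 27_000_000
TOLERANCE_HZ = 810
PCR_WRAP = (1 << 33) * 300  # PCR rolls over after 2**33 * 300 ticks


def tolerance_ppm() -> float:
    """Frequency tolerance of the STC in parts per million."""
    return TOLERANCE_HZ / NOMINAL_HZ * 1e6


def worst_case_skew_seconds(duration_s: float) -> float:
    """Largest divergence between two compliant clocks over duration_s,
    one running at +810 Hz and the other at -810 Hz from nominal."""
    return duration_s * (2 * TOLERANCE_HZ) / NOMINAL_HZ


def estimate_encoder_hz(pcr_a: int, pcr_b: int,
                        local_a_s: float, local_b_s: float) -> float:
    """Estimate the encoder clock frequency from two PCR samples and the
    local arrival times (in seconds) at which they were received."""
    elapsed_ticks = (pcr_b - pcr_a) % PCR_WRAP  # tolerate one PCR rollover
    return elapsed_ticks / (local_b_s - local_a_s)
```

The tolerance works out to 30 ppm, so two compliant encoders may diverge by as much as 0.216 seconds over one hour, which is why timestamps from independently encoded streams cannot be assumed to share a common time base.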
For video data, MPEG provides a high degree of compression by encoding blocks of pixels using various techniques and then using motion compensation to encode most video frames as predictions from or between other frames. In particular, the encoded MPEG video stream comprises a series of groups of pictures (GOPs), and each GOP begins with an independently encoded I-frame (intra-coded frame) and may include one or more following P-frames (predictive-coded frames) and B-frames (bi-directionally predictive-coded frames). Each I-frame can be decoded without additional information. Decoding of a P-frame requires information from a preceding frame in the GOP. Decoding of a B-frame requires information from a preceding and a following frame in the GOP. To minimize decoder buffer requirements, the frames are transmitted in decode order rather than presentation order: the following frame on which a B-frame depends is transmitted ahead of that B-frame, so that all the information of the other frames required for decoding the B-frame will arrive at the decoder before the B-frame.
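The reordering described above can be illustrated with a short Python sketch. The frame labels and the function name are hypothetical, and the sketch assumes a simple GOP structure in which each B-frame depends only on the nearest I- or P-frame on either side.

```python
def presentation_to_decode_order(frames):
    """Reorder frame labels (e.g. 'I0', 'B1', 'P3') from presentation order
    into transmission (decode) order, so that every B-frame follows both of
    the reference frames it depends on."""
    decode_order = []
    pending_b = []
    for frame in frames:
        if frame.startswith("B"):
            # Hold the B-frame until its following reference frame is sent.
            pending_b.append(frame)
        else:
            # An I- or P-frame goes out first, then the B-frames that
            # precede it in presentation order.
            decode_order.append(frame)
            decode_order.extend(pending_b)
            pending_b = []
    # Trailing B-frames would really wait for the next GOP's I-frame;
    # this sketch simply appends them.
    decode_order.extend(pending_b)
    return decode_order


gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(presentation_to_decode_order(gop))
```

For the presentation order I0 B1 B2 P3 B4 B5 P6, the sketch yields the transmission order I0 P3 B1 B2 P6 B4 B5.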
Splicing of audio/visual programs is a common operation performed, for example, whenever one encoded television program stream is switched to another or when an ad stream is inserted into the current program stream. Splicing of MPEG encoded audio/visual streams is considerably more difficult than splicing uncompressed audio and video, and a number of problems result, namely:
    The P and B-frames cannot be decoded without a preceding I-frame, so that cutting into a stream after an I-frame renders the following P and B-frames meaningless.
    The P and B-frames are considerably smaller than the I-frames, so that the frame boundaries are not evenly spaced and must be dynamically synchronized between the two streams at the time of the splice by adjusting DTS, PTS, and PCR timestamps (the DTS-PCR values, i.e., DTS minus PCR, or time until decode, may be different between the two streams).
    If the PCR timestamps are not adjusted properly, the PLL of the decoder may become unsynchronized or lose lock.
    The video decoder buffer is required to compensate for the uneven spacing of the frame boundaries in the encoded streams, so splicing may cause underflow or overflow of the video decoder buffer (when two segments of an MPEG compressed video stream, both of which are compliant with the MPEG decoder buffer model, are “glued” together, then in general the resulting MPEG stream may not comply with the VBV model).
        E.g., the information in a program stream has an encode rate that is compliant with the VBV model, and an advertisement (“ad”) is spliced in with a higher encode rate; the resulting stream can overflow the decoder buffer.
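The timestamp adjustment mentioned in the list above can be sketched as follows. This Python sketch is illustrative only and is not the solution proposed herein; it assumes 33-bit PTS/DTS values counted at 90 kHz and a 27 MHz PCR (per ISO/IEC 13818-1), and all function names are hypothetical. It applies one constant offset to the incoming stream, which keeps that stream's own DTS-PCR values intact but does nothing to address decoder buffer compliance.

```python
PTS_WRAP = 1 << 33          # PTS and DTS are 33-bit counters at 90 kHz
PCR_WRAP = PTS_WRAP * 300   # PCR counts at 27 MHz (300 ticks per 90 kHz tick)


def splice_offset(last_out_dts: int, first_in_dts: int, frame_duration: int) -> int:
    """Offset, in 90 kHz ticks, that places the first frame of the incoming
    stream one frame duration after the last frame of the outgoing stream."""
    return (last_out_dts + frame_duration - first_in_dts) % PTS_WRAP


def restamp(pts: int, dts: int, pcr: int, offset: int):
    """Shift one set of incoming timestamps onto the outgoing time base,
    handling counter rollover."""
    new_pts = (pts + offset) % PTS_WRAP
    new_dts = (dts + offset) % PTS_WRAP
    new_pcr = (pcr + offset * 300) % PCR_WRAP
    return new_pts, new_dts, new_pcr
```

As a usage example, at 25 frames per second the frame duration is 3600 ticks; if the outgoing stream ends at a DTS of 900000 and the incoming stream begins at a DTS of 180000, the offset is 723600 ticks, and every PTS, DTS, and PCR of the incoming stream would be shifted by that amount.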
These problems, if unmitigated, can result in unwelcome video and audio effects for the customer. In order to solve these problems, traditional MPEG splicing solutions use transcoding or requantizing to modify the size of the video frames around the splice points in order to generate a valid video/audio stream. To do this, a splicer needs to dissect the frame information and modify it. This requires expensive hardware and/or software (transraters). What is needed is a simple software solution to splice two TSs together to form a decoder-compliant TS.