1. Field of Invention
This invention relates to the processing and transfer of data according to the standard adopted by the Moving Picture Experts Group (MPEG), and specifically to a method of processing an MPEG-2 transport stream to ensure a smooth transition to a similarly processed transport stream.
2. Description of Prior Art
The MPEG-2 standard consists of the International Organization for Standardization (ISO) standards 13818-1 Systems, 13818-2 Video and 13818-3 Audio, which are hereby incorporated by reference.
MPEG-2 is most commonly used in television systems. Applications of MPEG-2 exist both in broadcast environments and in storage environments. In a broadcast environment, content is encoded on a continuous basis and all changes in source material must be made prior to the encoding process. In a storage environment, content is encoded and stored. A video file server may transmit the stored, precompressed streams at a later time.
The transport stream output from an MPEG-2 storage encoder typically consists of a single program of a defined length which was encoded according to a set of user-specified configuration parameters. Because the encoder was designed to create content in an isolated environment for storage purposes, it is likely that the implications of decoding in a broadcast system were not fully considered. For example, while it may be possible to create a stream with an exact number of frames, there may still be a timing discontinuity if two such streams are played consecutively on a decoder. This discontinuity would be visible as a roll or tear on the monitor.
Use of a broadcast encoder to generate transport stream segments is also problematic. In this case, the control system may not easily permit setting encoding parameters with fine temporal granularity. For example, it may not be possible to specify that a particular frame should be encoded as an I-Picture, or it may be difficult to coordinate the encoder and storage equipment to capture a stream that is exactly the program duration.
In Video on Demand (VOD) or Near Video on Demand (NVOD) applications, a video server transmits pre-compressed programs at scheduled times. It is highly desirable for the consumer to view the decoded stream as a continuous experience without objectionable artifacts or transitions. This goal is a challenge because each program may be encoded and stored individually, possibly by different encoders.
Although the video server may be able to stream back-to-back programs from two files without interruption, errors may still be perceived at the transition depending upon how the stored transport streams begin and end. Because transport streams are coded hierarchically, discontinuities may occur at multiple layers in the stream. The resulting artifacts vary in severity, depending upon the nature of the discontinuity. In the worst case, a full reacquisition of the transport stream will occur when both audio and video decoders lose synchronization at the transition. The reconstructed video may exhibit a roll or tear when the monitor resynchronizes due to a discontinuity in the frame timing. There may also be a chrominance loss or shift if there is a time base discontinuity. The displayed video may exhibit blocky artifacts that persist for several frames if predictive frames reference an incorrect anchor frame or if an incomplete picture is sent at the transition. Audio may drop out, pop, or hiss if incomplete sync frames are decoded.
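By way of illustration only, the frame-timing discontinuity described above can be expressed as a check on presentation time stamps (PTS) at the transition. The following is a minimal sketch, not part of any described system. It assumes that the PTS of the last displayed frame of the first stream and the PTS of the first displayed frame of the second stream are already known, and that the frame rate is constant. The function names `frame_ticks` and `splice_pts_error` are hypothetical. The 90 kHz tick rate and the 33-bit PTS width are taken from the MPEG-2 Systems standard.

```python
from fractions import Fraction

PTS_CLOCK_HZ = 90_000      # PTS/DTS tick rate defined by MPEG-2 Systems
PTS_MODULUS = 1 << 33      # PTS is a 33-bit counter, so it wraps at 2**33

def frame_ticks(frame_rate: Fraction) -> Fraction:
    """Duration of one frame in 90 kHz ticks (hypothetical helper)."""
    return Fraction(PTS_CLOCK_HZ) / frame_rate

def splice_pts_error(last_pts_a: int, first_pts_b: int,
                     frame_rate: Fraction) -> Fraction:
    """Signed error, in ticks, between the first PTS of stream B and the
    PTS a decoder would expect if stream A's frame cadence continued.

    Zero means the frame timing is continuous across the transition.
    Assumes both PTS values refer to frames in display order.
    """
    expected = (last_pts_a + frame_ticks(frame_rate)) % PTS_MODULUS
    diff = (first_pts_b - expected) % PTS_MODULUS
    # Map the wrapped difference into the range (-2**32, 2**32].
    if diff > PTS_MODULUS // 2:
        diff -= PTS_MODULUS
    return diff

NTSC = Fraction(30000, 1001)   # 29.97 frames per second

if __name__ == "__main__":
    # Stream B continues exactly one frame period after stream A.
    print(splice_pts_error(900_000, 903_003, NTSC))
    # Stream B restarts its time stamps at zero: a timing discontinuity.
    print(splice_pts_error(900_000, 0, NTSC))
```

A nonzero result indicates that the second stream does not begin where the decoder's frame cadence expects it, which is one way the roll or tear described above may arise. The check covers only the video frame timing and does not address the other layers of the stream.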
In an article by S. Merrill Weiss in the SMPTE Journal, December 1995, titled "Switching Facilities in MPEG-2: Necessary But Not Sufficient", the problems associated with concatenating separately compressed MPEG-2 streams are discussed. Several extensions to MPEG-2 are proposed. The author does not claim that these extensions will answer all needs or that they will completely solve any of the problems. Rather, they are put forth to provide a basis for further developments. The approach is based on imposing constraints on how the transport stream is generated. The extensions proceed by first restricting the transport rate so that each video access unit period is an integral number of transport packets. The transport stream duration is restricted to one of a number of standardized lengths. An additional constraint limits the GOP, or I-frame refresh, rate to a fixed number of frames. Specific transition points are also defined in the transport stream, and the encoder must fill the buffer for all elementary streams to a defined level at the transition points. There are several problems with this approach. For example, the requirement of meeting a specific buffer fullness may adversely affect the quality of the video since the optimal bit allocation between pictures will have to be changed. Additionally, constraining the transport rate will not be possible for many applications, such as satellite transmission, where the transport rate is a function of the channel.
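The transport rate restriction described above can be illustrated with simple arithmetic. The following is a minimal sketch, assuming that the video access unit period equals one frame period and that the transport packet is the standard 188 bytes. The function names `packets_per_frame` and `nearest_constrained_rate` and the example rates are hypothetical and are not taken from the cited article.

```python
from fractions import Fraction

TS_PACKET_BITS = 188 * 8   # an MPEG-2 transport packet is 188 bytes

def packets_per_frame(transport_rate_bps, frame_rate) -> Fraction:
    """Exact number of transport packets sent during one frame period."""
    return Fraction(transport_rate_bps) / (Fraction(frame_rate) * TS_PACKET_BITS)

def nearest_constrained_rate(transport_rate_bps, frame_rate) -> Fraction:
    """Closest transport rate that yields a whole number of packets
    per frame period (hypothetical helper)."""
    n = round(packets_per_frame(transport_rate_bps, frame_rate))
    return n * TS_PACKET_BITS * Fraction(frame_rate)

if __name__ == "__main__":
    pal = Fraction(25)
    # 4 Mbit/s at 25 frames per second is not a whole number of packets.
    print(packets_per_frame(4_000_000, pal))
    # The nearest rate that satisfies the constraint.
    print(nearest_constrained_rate(4_000_000, pal))
```

The example shows why the constraint is restrictive: an arbitrary channel rate will generally not satisfy it, and the rate must be moved to a specific nearby value, which may not be possible where the transport rate is fixed by the channel.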
U.S. Pat. No. 5,534,944 to Egawa et al. (1996) proposes a solution to the splicing of video streams wherein stuffing bits are used at the splicing point to position the decoder's buffer at the appropriate level for the new video stream. While this approach will avoid buffer problems, it does not address several problems associated with transitions between previously encoded transport streams. For example, it does not address the presence of non-video elementary streams, splicing in the middle of a video stream where there may be odd fields or missing anchor frames, splicing streams together in any order, or the timing problems associated with the transport layer.