The present invention relates to decoders for processing image data which has been compressed according to a format, MPEG-2, specified by the Moving Picture Experts Group (MPEG) and, in particular, to a preprocessing step which selectively deletes stuffing bits from the MPEG-2 data stream prior to decoding.
Video signal compression performed under the MPEG-2 standard is inherently variable rate. Video data is compressed based on the spatial frequency content of either a sequence of images or the difference among the images in the sequence. If an image sequence has low spatial frequency content or if successive images differ only slightly, the amount of compressed data that is transmitted to reproduce the image sequence may be greatly reduced.
The syntax for the MPEG-2 standard is set forth in International Standard 13818-2 Recommendation ITU-T H.262 entitled "Generic Coding of Moving Pictures and Associated Audio Information: Video," available from ISO/IEC, Geneva, Switzerland, and which is incorporated herein by reference for its teaching of the MPEG-2 video coding standard. This standard defines several layers of data records which are used to convey both audio and video data. For the sake of simplicity, the decoding of the audio data is not described herein. Encoded data which describes a particular video sequence is represented in several nested layers, the Sequence layer, the Group of Pictures layer, the Picture layer, the Slice layer and the Macroblock layer. Each layer, except the Macroblock layer, begins with a start code that identifies the layer. The layer includes header data and payload data. To aid in transmitting this information, a digital data stream representing multiple video sequences is divided into several smaller units and each of these units is encapsulated into a respective packetized elementary stream (PES) packet. For transmission, each PES packet is divided, in turn, among a plurality of fixed-length transport packets. Each transport packet contains data relating to only one PES packet. The transport packet also includes a header which holds control information, sometimes including an adaptation field, to be used in decoding the transport packet.
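By way of illustration only, the identification of a layer from its start code may be sketched as follows. The sketch assumes a byte-aligned video elementary stream held in memory; the start-code prefix (0x000001) and the code values are those of the H.262 syntax, while the function names `classify_start_code` and `find_start_codes` are hypothetical and do not appear in the standard or in the embodiments described herein.

```python
START_CODE_PREFIX = b"\x00\x00\x01"

# Start-code values from the H.262 video syntax.
_NAMED_CODES = {
    0xB2: "user_data",
    0xB3: "sequence_header",
    0xB4: "sequence_error",
    0xB5: "extension",
    0xB7: "sequence_end",
    0xB8: "group_of_pictures",
}


def classify_start_code(value: int) -> str:
    """Map the byte that follows the 0x000001 prefix to a layer name."""
    if value == 0x00:
        return "picture"
    if 0x01 <= value <= 0xAF:
        return "slice"
    # Values not listed above are reserved or belong to the system layer.
    return _NAMED_CODES.get(value, "reserved_or_system")


def find_start_codes(data: bytes) -> list:
    """Return (offset, layer name) for each start code found in data."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, classify_start_code(data[pos + 3])))
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found
```

Note that no start code exists for the Macroblock layer, so macroblock data is reached only by parsing the payload of a Slice record.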
When an MPEG-2 encoded image sequence is received, a transport decoder decodes the transport packets to reassemble the PES packets. The PES packets, in turn, are decoded to reassemble the MPEG-2 bit-stream which represents the image in the layered records, as described above. A given transport data stream may simultaneously convey multiple image sequences, for example as interleaved transport packets. This flexibility also allows the transmitter to concurrently transmit multiple bit-streams, each corresponding to a respective audio, video or data program.
A system implementation for delivering HDTV using MPEG-2 standards to the consumer is illustrated, in general, in the high-level block diagram of FIG. 1. On the transmission side, video and audio signals are input to respective encoders 110 and 112, buffered in buffers 114 and 116, delivered to the system coder/multiplexer 118, and stored in storage unit 120 or transmitted by transmitter unit 120. On the receiving side, the signals are received by a system decoder/demultiplexer 122, buffered in buffers 124 and 126, then decoded by decoders 128 and 130 and output as a reproduction of the original video and audio signals.
An important aspect of the illustration of FIG. 1 is that, although the intermediate stage buffering of the signals includes a variable delay, the overall delay from input to output of the signals is desirably substantially constant. This is accomplished by monitored flow control and buffers.
As indicated in FIG. 1, the delay from the input to the encoder to the output or presentation from the decoder is constant in this model, while the delay through each of the encoder and decoder buffers is variable. Not only is the delay through each of these buffers variable within the path of one elementary stream, the individual buffer delays in the video and audio paths differ as well. Therefore, the relative location of coded bits representing audio or video in the combined stream does not indicate synchronization information. The relative location of coded audio and video is constrained only by a System Target Decoder (STD) model such that the decoder buffers must behave properly; therefore, coded audio and video that represent sound and pictures which are to be presented simultaneously may be separated in time within the coded bit stream by as much as one second, which is the maximum decoder buffer delay that is allowed in the STD model. In order to accommodate the data latency inherent in the STD model, a Video Buffering Verifier (VBV) is defined.
The VBV is a hypothetical decoder, which is conceptually connected to the output of an encoder. An encoded bit-stream is stored into a VBV buffer memory of the hypothetical decoder until a sufficient amount of data has been stored to ensure that a decoder decoding the bit-stream will not run out of data (underflow) or process data too slowly (overflow) when the data is received at a fixed rate. Coded data is removed from the buffer as defined below. To conform to the MPEG-2 standard, a typical MPEG-2 video decoder includes a memory buffer, the VBV buffer, which holds an amount of bit-stream data specified by a value, vbv_buffer_size_value, which is transmitted as a part of the header of the Sequence layer.
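The relationship between the transmitted size value and the buffer, and the underflow and overflow conditions, may be sketched as follows. In H.262 the buffer size is expressed in units of 16 × 1024 bits. The occupancy check below is a deliberately simplified model, not the normative VBV procedure: it assumes that bits arrive at a constant rate, that each picture is removed instantaneously once per picture period, and it ignores the sequence extension bits of the size value. The names `vbv_buffer_bits` and `simulate_vbv` and all parameter names are hypothetical.

```python
def vbv_buffer_bits(vbv_buffer_size_value: int) -> int:
    """Buffer size in bits; the size value counts units of 16 * 1024 bits."""
    return 16 * 1024 * vbv_buffer_size_value


def simulate_vbv(buffer_bits, bits_per_picture_period, picture_sizes,
                 initial_fill):
    """Simplified occupancy trace (assumed model, see lead-in).

    Returns the buffer fullness, in bits, just after each picture is removed.
    Raises ValueError on underflow or overflow.
    """
    fullness = initial_fill
    trace = []
    for size in picture_sizes:
        if size > fullness:
            raise ValueError("underflow: picture not fully received")
        fullness -= size                      # picture removed for decoding
        trace.append(fullness)
        fullness += bits_per_picture_period   # constant-rate arrival
        if fullness > buffer_bits:
            raise ValueError("overflow: buffer capacity exceeded")
    return trace
```

For example, a size value of 112 corresponds to 1,835,008 bits. In this simplified model, a run of small pictures received at a constant rate eventually overflows the buffer, which is one situation in which an encoder may resort to stuffing.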
A high-level illustration of an exemplary STD model operating in conjunction with an encoder is shown in FIG. 2.
The requirement that the VBV buffer or STD model decoders not underflow is important to maintain the quality of the received image. In order to maintain constant bitrate video, "stuffing" is implemented within various aspects of the system. "Stuffing" is the act of filling the data stream with "don't care" information simply to maintain the required bit-rate.
Stuffing is implemented at two levels. In the MPEG-2 video standard, any number of zero-valued stuffing bits may be inserted into the bit stream immediately before a start code for one of the layers or before an extension start code. Stuffing is also implemented in the transport packets as one-valued stuffing bits inserted into an adaptation field in the transport packet. Stuffing is used in transport packets when there is insufficient PES packet data to fill the payload bytes of the transport packet to a level that would support the transmitted data rate.
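Stuffing at the transport level may be illustrated by the following sketch, which counts the one-valued (0xFF) stuffing bytes in the adaptation field of a single 188-byte transport packet. The header layout used (sync byte 0x47, the two-bit adaptation_field_control, the adaptation_field_length byte and the flags byte) follows the MPEG-2 systems syntax. As a simplifying assumption, the sketch declines to handle adaptation fields that carry optional fields such as a PCR; the function name `adaptation_stuffing_bytes` is hypothetical.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def adaptation_stuffing_bytes(packet: bytes) -> int:
    """Count 0xFF stuffing bytes in the adaptation field of one packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte transport packet")
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if adaptation_field_control not in (2, 3):
        return 0                      # no adaptation field present
    length = packet[4]                # adaptation_field_length
    if length == 0:
        return 0                      # only the length byte itself
    if 5 + length > TS_PACKET_SIZE:
        raise ValueError("adaptation field exceeds packet")
    flags = packet[5]
    if flags & 0x1F:
        # PCR, OPCR, splice, private data or extension present:
        # outside the scope of this simplified sketch.
        raise NotImplementedError("optional adaptation fields not handled")
    stuffing = packet[6:5 + length]   # bytes after the flags byte
    if any(b != 0xFF for b in stuffing):
        raise ValueError("unexpected non-stuffing byte")
    return len(stuffing)
```

A packet with adaptation_field_length equal to 10 and no optional fields therefore carries nine stuffing bytes, leaving 173 bytes for PES packet data.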
It has been recognized for some time that the stuffing bits represent wasted bandwidth in the MPEG-2 signal which could be used for other purposes. For example, in U.S. Pat. No. 5,650,825, entitled METHOD AND APPARATUS FOR SENDING PRIVATE DATA INSTEAD OF STUFFING BITS IN AN MPEG BIT STREAM, stuffing data in adaptation fields of transport packets is replaced by private stuff data which is received and separately processed by a user.
The present invention is embodied in an MPEG-2 decoder which identifies and removes stuffing data from an MPEG-2 bit-stream before storing the bit-stream into the VBV buffer.
According to one aspect of the invention, the decoder includes a bit-stream parser which passes a predetermined maximum number of stuffing bits before each start code.
According to another aspect of the invention, the parser passes a different number of stuffing bits before Slice start codes than before other start codes.
According to yet another aspect of the invention, a value indicating the number of stuffing bits passed is provided to the parser by a microprocessor and may be programmed.
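The aspects summarized above may be illustrated, purely as a software sketch and not as the disclosed hardware parser, by the following routine. It assumes a byte-aligned video elementary stream, so that stuffing appears as whole zero-valued bytes immediately preceding the 0x000001 start-code prefix. The function name `limit_stuffing` and the parameters `max_slice` and `max_other`, which stand in for the programmable values supplied by the microprocessor, are hypothetical.

```python
def limit_stuffing(data: bytes, max_slice: int, max_other: int) -> bytes:
    """Pass at most a given number of zero stuffing bytes per start code.

    max_slice applies before Slice start codes (0x01..0xAF);
    max_other applies before every other start code.
    """
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        if data[i] != 0:
            out.append(data[i])
            i += 1
            continue
        j = i
        while j < n and data[j] == 0:   # measure the run of zero bytes
            j += 1
        run = j - i
        if run >= 2 and j + 1 < n and data[j] == 0x01:
            # The last two zeros belong to the start-code prefix;
            # any zeros before them are stuffing.
            stuffing = run - 2
            code = data[j + 1]
            limit = max_slice if 0x01 <= code <= 0xAF else max_other
            out.extend(b"\x00" * min(stuffing, limit))
            out.extend(b"\x00\x00\x01")
            out.append(code)
            i = j + 2
        else:
            out.extend(data[i:j])       # ordinary zeros, copied unchanged
            i = j
    return bytes(out)
```

With `max_slice` set to 0 and `max_other` set to 2, for example, five stuffing bytes before a Sequence header start code are reduced to two and all stuffing before a Slice start code is dropped, so that less of the VBV buffer is occupied by data that carries no image information.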