The present invention relates to decoding of digitally encoded video signals and, more particularly, to a decoder for decoding video data and control information which have been encoded using fixed length values and variable length codes.
Digital transmission of video signals has become more widely used in recent years, particularly in the consumer electronics industry. This growth in the use of digital video signal transmission and reception in digital versatile disc (DVD) players and digital video broadcasting (DVB) set-top-box applications, for example, has led to improved picture quality in the transmitted sequence of images and the ability to more effectively control storage, manipulation and display of the video signal, as compared with existing NTSC and PAL analog transmission systems. In furtherance of these advances, the industry-sponsored Moving Picture Experts Group (MPEG), chartered by the International Organization for Standardization (ISO), has specified formats for digital video compression, i.e., the syntax for encoding video bit streams, which are defined in two standards, ISO-11172-2 (MPEG-1) and ISO-13818-2 (MPEG-2). For a more detailed description of the bit stream syntax used to digitally encode video signals according to these standards, the reader is referred to ISO-11172-2 (MPEG-1) and ISO-13818-2 (MPEG-2). Each of these standards is hereby expressly incorporated herein by reference in its entirety.
The bit stream syntax defined by the MPEG-1 and -2 standards relates to three general types of information or data in the bit stream, namely control information which is necessary to define the bit stream, control information which is necessary to properly decompress and reproduce the transmitted sequence of images, and the encoded video data. The bit stream control information may identify that the bit stream is packetized video or audio data, or that the bit stream is a video signal encoded using either the MPEG-1 or -2 standard, for example. Image control information may include, as an example, the frame horizontal and vertical size, i.e., the number of picture elements (pels) per line and number of lines per frame, the frame or field rate, and the aspect ratio. As will be described in more detail below, the encoded video data represents the DCT transformed and quantized chrominance and luminance pel values that are necessary for recreation of each frame or field.
The MPEG-1 and -2 standards each specify a bit stream syntax designed to improve information density and coding efficiency by methods that remove spatial and temporal redundancies. Each MPEG picture is divided into a series of macroblocks which are made up of 16×16 luminance pels (Y), or a 2×2 array of four 8×8 transformed blocks of pels. Each macroblock is further made up of 8×16 chrominance pels (U, V), or a 1×2 array of two 8×8 blocks of pels. During the encoding process, spatial redundancies are removed by using Discrete Cosine Transform (DCT) coding of the 8×8 pel blocks followed by quantization, zigzag scan, and variable length coding of runs of zero (run-length) and amplitude levels. Motion compensated prediction is used to remove temporal redundancies.
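By way of illustration only, the zigzag scan and run-length steps named above may be sketched as follows. The function names `zigzag_order` and `run_level_pairs` are hypothetical, and the sketch is simplified in that it treats every coefficient alike, whereas the standards code the DC coefficient of intra blocks separately.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def run_level_pairs(block):
    """Convert a quantized block into (run of zeros, nonzero level) pairs."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for coeff in scanned:
        if coeff == 0:
            run += 1
        else:
            pairs.append((run, coeff))
            run = 0
    # Trailing zeros are not emitted; an end-of-block code implies them.
    return pairs


# Hypothetical quantized block with three nonzero coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[1][0], block[0][2] = 12, -3, 5
pairs = run_level_pairs(block)
```

Because quantization tends to leave the nonzero coefficients near the top-left corner of the block, the zigzag scan groups the zeros into long runs, which is what makes the run-length and amplitude level pairs compact.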
For video, MPEG contemplates Intra (I-) frames, Predictive (P-) frames and Bidirectionally Predictive (B-) frames. The I-frames are independently coded and are the least efficiently coded of the three frame types. P-frames are coded more efficiently than the I-frames and are coded relative to the previously coded I- or P-frame. B-frames are coded the most efficiently of the three frame types and are coded relative to both the previous and the next I- or P-frame. The coding order of the frames in an MPEG system is not necessarily the same as the presentation order of the frames. Headers in the bit stream provide information to be used by the decoder to properly decode the time and sequence of the frames for presentation of a moving picture.
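The difference between coding order and presentation order may be illustrated by the following minimal sketch. It assumes a simplified model in which each B-frame is presented as soon as it is decoded and each I- or P-frame is held until the next I- or P-frame arrives; an actual decoder relies on the header information in the bit stream rather than on this rule alone. The function name and frame labels are hypothetical.

```python
def presentation_order(coded_frames):
    """Reorder frames from coding order to presentation order (simplified)."""
    shown, held_reference = [], None
    for frame in coded_frames:
        if frame.startswith("B"):
            # B-frames depend on references already decoded, so show at once.
            shown.append(frame)
        else:
            # A new I- or P-frame releases the previously held reference.
            if held_reference is not None:
                shown.append(held_reference)
            held_reference = frame
    if held_reference is not None:
        shown.append(held_reference)
    return shown


# Labels carry the frame type and its position in presentation order.
coded = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
display = presentation_order(coded)
```

In this example P3 is transmitted before B1 and B2 because both B-frames are predicted from it, yet it is presented after them.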
Typical video decoders that are used for decoding digitally transmitted video bit streams have a micro-controller or sequencer for controlling a variable length decoder (VLD) that is designed to parse the bit stream for decoding of the quantized DCT coefficients and motion vectors using the MPEG variable length code tables (VLCs). An inverse transform processor is used to transform each block of quantized coefficient values into a stream of values representing the inverse zigzag of the block and to dequantize the values. The dequantized DCT coefficients are passed to an inverse discrete cosine transform (IDCT) processor that performs an inverse DCT transform operation to recover the chrominance and luminance pel values. These values are then applied, in combination with the decoded motion vectors, to a motion compensation (MC) processor which then performs the MPEG decompression to convert I-, P- and B-frames into full video frames.
In typical VLD architectures for performing MPEG syntax compliant bit stream parsing and decoding, a predetermined number of undecoded bits of the video bit stream are stored in one or more registers. The VLD extracts a smaller number of these bits from the register(s) with the leftmost bit always aligned as the first bit extracted by the VLD. The VLD then performs a table look-up in one of the MPEG VLC tables to decode the variable length encoded video data and obtain the code length. After the variable length code in the extracted bits has been decoded, the VLD performs a MASK/SHIFT/OR operation on the bits in the register(s) to realign the first unused bit in the leftmost position of the register(s). The VLC tables are typically contained in one or more PALs or ROMs which have approximately 2^n × m memory locations, where "n" represents the maximum possible variable code length in each of the VLC tables and "m" represents the number of unique VLC tables.
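The conventional approach described above may be modeled, purely as a sketch, with a flat look-up table of 2^n entries and an explicit realignment step. The code table `TOY_CODES` is a hypothetical four-symbol code chosen for brevity and is not one of the MPEG VLC tables; the 32-bit register width and all function names are likewise assumptions.

```python
REG_BITS = 32
REG_MASK = (1 << REG_BITS) - 1

# Hypothetical prefix-free code, maximum code length n = 3.
TOY_CODES = {"0": "a", "10": "b", "110": "c", "111": "d"}


def build_flat_table(codes, n):
    """Fill all 2**n locations so any n-bit index yields (symbol, length)."""
    table = [None] * (1 << n)
    for bits, symbol in codes.items():
        pad = n - len(bits)
        base = int(bits, 2) << pad
        for fill in range(1 << pad):  # every setting of the don't-care bits
            table[base + fill] = (symbol, len(bits))
    return table


def lookup(register, table, n):
    """Index the table with the leftmost n bits of the register."""
    return table[register >> (REG_BITS - n)]


def realign(register, code_len, refill_bits):
    """SHIFT out the used bits, MASK to register width, OR in fresh bits."""
    shifted = (register << code_len) & REG_MASK
    return shifted | (refill_bits & ((1 << code_len) - 1))


table = build_flat_table(TOY_CODES, 3)
first = lookup(0xB0000000, table, 3)       # leftmost bits 101 -> code "10"
register = realign(0xB0000000, first[1], 0b11)
second = lookup(register, table, 3)        # leftmost bits 110 -> code "110"
```

The sketch makes both costs visible: short codes occupy many duplicate table locations, and every decoded symbol is followed by a separate realignment before the next look-up can begin.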
It will be appreciated by those skilled in the art that the MASK/SHIFT/OR operation required for alignment of the unused bits after the decode process in certain VLD architectures will significantly affect the overall decode speed of the VLD. Since each of these operations may require one or more cycles, the decode efficiency of the VLD is significantly decreased as multiple cycles are required by the VLD to decode each DCT coefficient symbol (i.e., each run-length and amplitude level pair) and then realign the unused bits. Additionally, the VLC table structure in certain VLD architectures adds cost and complexity to the VLD as each variable length code of each unique VLC table is stored in a separate memory location.
Thus, there is a need for a VLD that efficiently decodes variable length DCT coefficients and motion vectors which have been encoded according to the MPEG-1 or -2 standard. There is also a need for a VLD that minimizes the amount of memory required to decode the various MPEG variable length codes. There is yet a further need for a VLD that is able to receive instructions from a micro-sequencer in accordance with a predefined set of instructions, and further to receive instructions from a master controller.
The present invention is embodied in a variable length video decoder that is particularly suited for decoding MPEG-1 and -2 syntax compliant video bit streams. The video decoder is designed as a single event per cycle slice parsing engine for decoding the macroblock and block layers of individual slices to generate DCT coefficient values and motion vectors.
The video decoder incorporates a micro-sequencer that interfaces with a VLD command decode/execution unit to control the variable length decoding process according to the MPEG standard. During the decoding process, the micro-sequencer either issues commands to the VLD command decode/execution unit for performing variable length decoding or controls the program flow as provided through its instruction set which is stored in instruction ROM. The video decoder is further able to receive decode instructions from a RISC CPU that is responsible for upper layer parsing and controlling the overall decoding process for reconstruction of the decoded sequence of images.
Encoded video data is stored in DRAM memory and made available to the video decoder through a channel buffer FIFO. In accordance with one aspect of the present invention, a predetermined number of these encoded video data bits are made visible to the video decoder and a variable length table decoder through the use of a rotator/barrel shifter and pointer register. The barrel shifter and pointer register make the bits from pointer address to pointer address+31 visible as rotator/barrel shifter data to both the video decoder and a variable length table decoder. The video decoder is responsible for decoding the variable length code in the rotator/barrel shifter data to obtain the necessary DCT coefficients and motion vectors for each slice. After the variable length code is decoded, the pointer register of the rotator/barrel shifter is incremented to prepare for the next decode cycle.
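The behavior of the rotator/barrel shifter and pointer register described above may be approximated in software as follows. This is a sketch under stated assumptions: the class name `BitWindow` and its methods are hypothetical, the encoded data is held in a byte string rather than supplied by a channel buffer FIFO, and bits past the end of the data read as zero.

```python
class BitWindow:
    """Expose bits [pointer, pointer + 31] of a byte string as one integer."""

    def __init__(self, data: bytes):
        self.data = data
        self.pointer = 0  # bit address of the first undecoded bit

    def peek32(self) -> int:
        byte_index, bit_offset = divmod(self.pointer, 8)
        # Five bytes (40 bits) always cover a 32-bit window at any offset.
        chunk = self.data[byte_index:byte_index + 5].ljust(5, b"\x00")
        value = int.from_bytes(chunk, "big")
        return (value >> (8 - bit_offset)) & 0xFFFFFFFF

    def advance(self, code_len: int) -> None:
        # Only the pointer moves; the stored bits are never shifted.
        self.pointer += code_len


window = BitWindow(bytes([0xB0, 0x00, 0x00, 0x00, 0xFF]))
before = window.peek32()
window.advance(2)
after = window.peek32()
```

Unlike a register that must be masked, shifted and refilled after each symbol, the window here is repositioned simply by adding the code length to the pointer.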
In accordance with another aspect of the present invention, a novel scheme is provided to enable the video decoder to access each of the MPEG VLC tables to obtain the necessary decoded value. Each MPEG VLC table is divided into a series of subtables as defined by a unique prefix pattern identified in each of the tables. During the variable length decode process, the 32 bits of extracted rotator/barrel shifter data are applied to a pattern match logic and MUX control in the variable length table decoder to identify the unique prefix pattern in the rotator/barrel shifter data. In parallel, the bits after the prefix pattern are applied to all of the subtables in each of the MPEG VLC tables. After the variable length encoded data has been decoded, the variable length table decoder provides the decoded value and a valid code status bit. The variable length table decoder also provides a code length signal to the pointer register of the rotator/barrel shifter to increment the pointer register by the code length.
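The division of a VLC table into subtables selected by a unique prefix pattern may be illustrated with the following sketch. The prefixes, suffix lengths and symbols in `SUBTABLES` are hypothetical and are not taken from any MPEG VLC table. The loop tests the prefixes one after another, whereas the description above contemplates pattern matching and subtable look-up performed in parallel.

```python
# prefix pattern -> (number of suffix bits, symbols indexed by suffix value)
SUBTABLES = {
    "1": (1, ["A", "B"]),              # codes 10, 11
    "01": (2, ["C", "D", "E", "F"]),   # codes 0100 through 0111
    "001": (1, ["G", "H"]),            # codes 0010, 0011
}


def decode_window(window32):
    """Return (decoded value, code length, valid flag) for a 32-bit window."""
    bits = format(window32, "032b")
    for prefix, (suffix_len, symbols) in SUBTABLES.items():
        if bits.startswith(prefix):
            start = len(prefix)
            # The bits after the prefix index directly into the subtable.
            suffix = int(bits[start:start + suffix_len], 2)
            return symbols[suffix], start + suffix_len, True
    # No prefix pattern matched: report an invalid code.
    return None, 0, False


result_a = decode_window(0xB0000000)   # window begins 1011...
result_d = decode_window(0x50000000)   # window begins 0101...
result_bad = decode_window(0x10000000) # window begins 0001...
```

In this toy arrangement the three subtables hold eight entries in total, whereas a flat table indexed by the four-bit maximum code length would need sixteen locations; the returned code length is the amount by which a pointer register would be incremented.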
In accordance with yet another aspect of the present invention, the decoded DCT coefficients are stored as compressed run-length and amplitude level pairs in a run-level decoder/FIFO. The run-level decoder/FIFO decompresses the run-length and amplitude level pairs into DCT coefficients as needed by an inverse transform unit. This allows decoding of the Huffman encoded variable length pairs in parallel with the run-level decoding of previously decoded run-level pairs. Motion vectors are stored in a mv/dmv FIFO until needed by a motion compensation unit.
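The decompression of run-length and amplitude level pairs into DCT coefficients may be sketched as follows, assuming a 64-coefficient block in scan order. The function name `expand_run_levels` is hypothetical, and the sketch omits the FIFO buffering that allows the expansion to proceed in parallel with variable length decoding.

```python
def expand_run_levels(pairs, block_size=64):
    """Expand (run, level) pairs into a full block of coefficients."""
    coeffs = []
    for run, level in pairs:
        coeffs.extend([0] * run)  # the run of zeros preceding the level
        coeffs.append(level)
    if len(coeffs) > block_size:
        raise ValueError("run-level pairs overflow the block")
    # Positions after the last pair are zero, as implied by end of block.
    coeffs.extend([0] * (block_size - len(coeffs)))
    return coeffs


coefficients = expand_run_levels([(0, 12), (1, -3), (2, 5)])
```

Storing the pairs rather than the expanded coefficients keeps the buffer small, since most of the 64 positions in a typical block are zero.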
The above and other aspects, objects and advantages of the present invention shall be made apparent from the accompanying drawings and the description thereof.