Digital transmission of video signals has become more widely used in recent years, particularly in the consumer electronics industry. This growth in the use of digital video signal transmission and reception, for example in digital versatile disc (DVD) players and digital video broadcasting (DVB) set-top-box applications, has led to improved picture quality in the transmitted sequence of images and to the ability to control storage, manipulation and display of the video signal more effectively than is possible with existing NTSC and PAL analog transmission systems. In furtherance of these advances, the industry-sponsored Moving Picture Experts Group (MPEG), chartered by the International Organization for Standardization (ISO), has specified formats for digital video compression, i.e., the syntax for encoding video bit streams, which are defined in two standards, ISO-11172-2 (MPEG-1) and ISO-13818-2 (MPEG-2). During the discussion to follow, the reader is referred to ISO-11172-2 (MPEG-1) and ISO-13818-2 (MPEG-2) for a more detailed description of the bit stream syntax used to digitally encode video signals according to these standards. Each of these standards is hereby expressly incorporated herein by reference in its entirety.
The bit stream syntax defined by the MPEG-1 and -2 standards relates to three general types of information or data in the bit stream, namely control information that is necessary to define the bit stream, control information that is necessary to properly decompress and reproduce the transmitted sequence of images, and the encoded video data. The bit stream control information may identify that the bit stream is packetized video or audio data, or that the bit stream is a video signal encoded using either the MPEG-1 or -2 standard, for example. Image control information may include, as an example, the frame horizontal and vertical size, i.e., the number of picture elements (pels) per line and number of lines per frame, the frame or field rate, and the aspect ratio. As will be described in more detail below, the encoded video data represents the DCT transformed and quantized chrominance and luminance pel values that are necessary for recreation of each frame or field.
The MPEG-1 and -2 standards each specify a bit stream syntax designed to improve information density and coding efficiency by methods that remove spatial and temporal redundancies. Each MPEG picture is divided into a series of macroblocks, each of which is made up of 16×16 luminance pels (Y), i.e., a 2×2 array of four 8×8 transformed blocks of pels. Each macroblock is further made up of 8×16 chrominance pels (U,V), i.e., a 1×2 array of two 8×8 blocks of pels. During the encoding process, spatial redundancies are removed by using Discrete Cosine Transform (DCT) coding of the 8×8 pel blocks followed by quantization, zigzag scan, and variable length coding of runs of zeros (run-length) and amplitude levels. Motion compensated prediction is used to remove temporal redundancies.
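By way of illustration only, the zigzag scan and run-length steps described above may be sketched as follows. The function names `zigzag_order` and `run_level_pairs` are hypothetical and do not appear in the standards; the sketch treats all 64 coefficients of a block uniformly, leaves trailing zeros to be implied by an end-of-block marker, and omits the DCT, the quantization and the variable length code tables themselves.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column indexes each anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals are walked from bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def run_level_pairs(block):
    """Convert a quantized block into (run of zeros, nonzero level) pairs.

    Trailing zeros produce no pair; a real encoder would signal them
    with an end-of-block code.
    """
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block)):
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs


# Example: a sparse quantized block with three nonzero coefficients.
example_block = [[0] * 8 for _ in range(8)]
example_block[0][0] = 12
example_block[0][1] = 5
example_block[2][0] = -3
example_pairs = run_level_pairs(example_block)  # [(0, 12), (0, 5), (1, -3)]
```

Because quantization tends to leave nonzero values near the top-left corner of the block, the zigzag scan groups the zeros into long runs, so that each run-length and amplitude level pair can be represented by a single variable length code.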
For video, MPEG contemplates Intra (I-) frames, Predictive (P-) frames and Bidirectionally Predictive (B-) frames. The I-frames are independently coded and are the least efficiently coded of the three frame types. P-frames are coded more efficiently than the I-frames and are coded relative to the previously coded I- or P-frame. B-frames are coded the most efficiently of the three frame types and are coded relative to both the previous and the next I- or P-frame. Because a B-frame cannot be decoded until both of its reference frames are available, the coding order of the frames in an MPEG system is not necessarily the same as the presentation order of the frames. Headers in the bit stream provide information to be used by the decoder to properly decode the time and sequence of the frames for presentation of a moving picture.
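The difference between coding order and presentation order may be illustrated with the following minimal sketch. It assumes, for illustration only, that each frame is labeled by a string whose first character gives its type (for example "I0", "B1", "P3"); the function names `coding_order` and `presentation_order` are hypothetical, and the header fields that an actual decoder would consult are not modeled.

```python
def coding_order(display_frames):
    """Reorder frames from presentation order into coding order.

    Each B-frame is emitted after the next I- or P-frame, since that
    reference frame must be decoded before the B-frame can be.
    """
    coded = []
    pending_b = []
    for frame in display_frames:
        if frame[0] == "B":
            pending_b.append(frame)
        else:
            coded.append(frame)
            coded.extend(pending_b)
            pending_b = []
    coded.extend(pending_b)  # trailing B-frames with no later reference frame
    return coded


def presentation_order(coded_frames):
    """Reorder frames from coding order back into presentation order.

    A decoded I- or P-frame is held until the next I- or P-frame
    arrives; B-frames are presented as soon as they are decoded.
    """
    shown = []
    held = None
    for frame in coded_frames:
        if frame[0] == "B":
            shown.append(frame)
        else:
            if held is not None:
                shown.append(held)
            held = frame
    if held is not None:
        shown.append(held)
    return shown


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
coded = coding_order(display)        # ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
restored = presentation_order(coded)  # same sequence as display
```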
Typical video decoders that are used for decoding digitally transmitted video bit streams have a micro-controller or sequencer for controlling a variable length decoder (VLD) that is designed to parse the bit stream for decoding of the quantized DCT coefficients and motion vectors using the MPEG variable length code (VLC) tables. An inverse transform processor is used to transform each block of quantized coefficient values into a stream of values representing the inverse zigzag of the block and to dequantize the values. The dequantized DCT coefficients are passed to an inverse discrete cosine transform (IDCT) processor that performs an inverse DCT transform operation to recover the chrominance and luminance pel values. These values are then applied, in combination with the decoded motion vectors, to a motion compensation (MC) processor which then performs the MPEG decompression to convert I-, P- and B-frames into full video frames.
In typical VLD architectures for performing MPEG syntax compliant bit stream parsing and decoding, a predetermined number of undecoded bits of the video bit stream are stored in one or more registers. The VLD extracts a smaller number of these bits from the register(s) with the leftmost bit always aligned as the first bit extracted by the VLD. The VLD then performs a table look-up in one of the MPEG VLC tables to decode the variable length encoded video data and obtain the code length. After the variable length code in the extracted bits has been decoded, the VLD performs a MASK/SHIFT/OR operation on the bits in the register(s) to realign the first unused bit in the leftmost position of the register(s). The VLC tables are typically contained in one or more PALs or ROMs which have approximately 2^n × m memory locations, where “n” represents the maximum possible variable code length in each of the VLC tables and “m” represents the number of unique VLC tables.
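The table look-up and realignment steps described above may be sketched, purely as a software analogy and not as a description of any particular hardware, as follows. The code table `TOY_CODES` is a hypothetical four-symbol prefix code chosen for illustration and is not one of the MPEG VLC tables; the names `build_table` and `decode` are likewise assumptions. A Python integer stands in for the register(s), and the OR step that refills the register(s) with new bits from the bit stream is omitted.

```python
# Hypothetical prefix code for illustration only (not an MPEG VLC table).
TOY_CODES = {"1": "A", "01": "B", "001": "C", "000": "D"}
MAX_LEN = 3  # "n": the maximum code length in this table


def build_table(codes, max_len):
    """Build a direct look-up table with 2**max_len entries.

    Every max_len-bit window that begins with a given code maps to the
    same (symbol, code length) entry, so shorter codes occupy several
    memory locations.
    """
    table = [None] * (1 << max_len)
    for bits, symbol in codes.items():
        pad = max_len - len(bits)
        base = int(bits, 2) << pad
        for fill in range(1 << pad):
            table[base + fill] = (symbol, len(bits))
    return table


def decode(bits, nbits, table, max_len):
    """Decode a stream held MSB-first in the integer `bits` (nbits valid bits)."""
    symbols = []
    while nbits > 0:
        # Extract the leftmost max_len bits, zero-padding near the end of the stream.
        if nbits >= max_len:
            window = bits >> (nbits - max_len)
        else:
            window = bits << (max_len - nbits)
        symbol, length = table[window & ((1 << max_len) - 1)]
        if length > nbits:
            raise ValueError("bit stream ends in the middle of a code")
        symbols.append(symbol)
        # Realign: discard the consumed bits so the first unused bit is leftmost.
        nbits -= length
        bits &= (1 << nbits) - 1
    return symbols


table = build_table(TOY_CODES, MAX_LEN)
stream = "1" + "01" + "001" + "000" + "1"  # codes for A, B, C, D, A
decoded = decode(int(stream, 2), len(stream), table, MAX_LEN)
```

In this sketch a single table occupies 2^3 = 8 locations even though the code defines only four symbols, which illustrates how the storage required by the direct look-up approach grows with the maximum code length.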
It will be appreciated by those skilled in the art that the MASK/SHIFT/OR operation required for alignment of the unused bits after the decode process in certain VLD architectures will significantly affect the overall decode speed of the VLD. Since each of these operations may require one or more cycles, the decode efficiency of the VLD is significantly decreased because multiple cycles are required by the VLD to decode each DCT coefficient symbol (i.e., each run-length and amplitude level pair) and then realign the unused bits. Additionally, the VLC table structure in certain VLD architectures adds cost and complexity to the VLD because each variable length code of each unique VLC table is stored in a separate memory location.
Thus, there is a need for a VLD that efficiently decodes variable length DCT coefficients and motion vectors which have been encoded according to the MPEG-1 or -2 standard. There is also a need for a VLD that minimizes the amount of memory required to decode the various MPEG variable length codes. There is yet a further need for a VLD that is able to receive instructions from a micro-sequencer in accordance with a predefined set of instructions, and further to receive instructions from a master controller.