Video images are rarely stored or transmitted in a computer system in an uncompressed format because of the vast amounts of memory required to store an unaltered and uncompressed video image. One color picture at standard television resolution requires approximately one million bytes to store the pixel representation of the image digitally. An uncompressed digitized color photograph at 35 mm resolution occupies approximately ten times more memory, or approximately ten megabytes. Digital storage of a single roll of 35 mm film would require an entire 360 megabyte hard drive if the images were not compressed, and it would take almost an hour to receive a single uncompressed digitized photograph over a standard modem link.
In order to make digital storage and transmission of video images practicable, various techniques for compressing digital images have been devised. One of the most widely used techniques for image compression is the so-called JPEG standard, an internationally recognized compression standard that is applicable to continuous-tone (multilevel) images. The standard has been propounded by the Joint Photographic Experts Group, a joint working group of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) formed in collaboration with the International Telecommunication Union (ITU).
The JPEG standard uses adaptive transformation to achieve high compression. In adaptive transformation, the values of a group of pixels are transformed into another set of values that requires less data. The specific adaptive transformation used is the discrete cosine transform, a version of the Fourier transform that accounts for the fact that the input is essentially a limited set of samples of continuous waveforms. The discrete cosine transform is "lossy" because the forward and inverse (i.e., encoding and decoding) discrete cosine transform equations contain transcendental functions and cannot be physically implemented with perfect accuracy. Moreover, the JPEG standard is also lossy because of the round-off error introduced by quantization, which accounts for a significant portion of the data compression. Additional data compression is achieved by statistical coding of the quantized discrete cosine transform frequency components, either arithmetically (e.g., Q-coding) or by Huffman coding. JPEG encoding can typically achieve compression ratios of 10:1 without significant degradation of image quality or resolution, and, in some cases, compression as high as 30:1 can be achieved.
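The two sources of loss described above can be illustrated with a minimal sketch. The sketch assumes an 8x8 block, an orthonormal form of the discrete cosine transform, and a single uniform quantizer step size; these choices, and every function name below, are assumptions made for illustration and are not taken from the JPEG standard or from any particular product.

```python
import math

N = 8  # block size assumed for this sketch


def _scale(k):
    # Normalization factors of the orthonormal DCT-II.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_1d(v):
    """Forward DCT of a length-N sequence."""
    return [_scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N))
            for k in range(N)]


def idct_1d(c):
    """Inverse DCT of a length-N sequence."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]


def _transpose(m):
    return [list(col) for col in zip(*m)]


def _apply_2d(transform, block):
    # Separable 2-D transform: rows first, then columns.
    rows = [transform(r) for r in block]
    return _transpose([transform(c) for c in _transpose(rows)])


def dct_2d(block):
    return _apply_2d(dct_1d, block)


def idct_2d(coeffs):
    return _apply_2d(idct_1d, coeffs)


def quantize(coeffs, step):
    # Rounding to the nearest multiple of `step` is where most of the loss occurs.
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    return [[level * step for level in row] for row in levels]


def max_abs_error(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# A smooth test block (a gentle two-dimensional ramp), chosen arbitrarily.
block = [[100 + 4 * r + 2 * c for c in range(N)] for r in range(N)]
coeffs = dct_2d(block)
levels = quantize(coeffs, 16)
restored = idct_2d(dequantize(levels, 16))
zero_levels = sum(level == 0 for row in levels for level in row)

print("round trip error without quantization:", max_abs_error(idct_2d(coeffs), block))
print("round trip error with quantization:   ", max_abs_error(restored, block))
print("quantized coefficients equal to zero: ", zero_levels, "of", N * N)
```

For a smooth block such as this one, most quantized coefficients are zero, which is what makes the subsequent statistical coding stage effective. The round trip without quantization differs from the input only by floating-point error, whereas the quantized round trip differs visibly.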
The JPEG standard does not specify the details of particular algorithms that must be used for compliance with the JPEG standard. For example, the JPEG standard does not specify the implementation details of the discrete cosine transform, and there are numerous distinct products that incorporate substantially different implementations of discrete cosine transform encoders and decoders. The discrete cosine transform tends to be extremely processing intensive, which frequently results in a noticeable delay in decompression of a received (or retrieved) image. In an effort to ameliorate this delay, there are numerous prior art implementations of JPEG decoders that incorporate approximations of the discrete cosine transform that trade accuracy, and therefore image quality, for speed.
Another area of the JPEG standard in which there is extensive variation between different implementations is the statistical (also called entropy or binary) encoding of the quantized discrete cosine transform coefficients output by the penultimate quantizer stage. Binary encoding, such as run-length encoding, is applied to each quantized discrete cosine transform coefficient that does not equal zero, producing an output that includes the number of preceding zeros (i.e., the run length), the number of bits needed to describe the coefficient, and the value of the coefficient.
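The run-length step described above can be sketched as follows. The sketch assumes the quantized coefficients have already been serialized into a one-dimensional list, takes the number of bits needed to describe a coefficient to be the bit length of its magnitude, and simply drops trailing zeros; the function names are hypothetical, and limits that a real codec places on run lengths are ignored.

```python
def run_length_encode(coeffs):
    """Return one (run, size, value) triple per non-zero coefficient.

    run   : number of zero coefficients immediately preceding this one
    size  : number of bits needed to describe the coefficient magnitude
    value : the coefficient itself
    """
    triples = []
    run = 0
    for value in coeffs:
        if value == 0:
            run += 1
            continue
        triples.append((run, abs(value).bit_length(), value))
        run = 0
    return triples  # trailing zeros produce no triple in this sketch


def run_length_decode(triples, length):
    """Rebuild a coefficient list of the given length from the triples."""
    out = []
    for run, _size, value in triples:
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * (length - len(out)))  # restore the dropped trailing zeros
    return out


coeffs = [5, 0, 0, -3, 1, 0, 0, 0, 0, 2, 0, 0]
triples = run_length_encode(coeffs)
print(triples)  # [(0, 3, 5), (2, 2, -3), (0, 1, 1), (4, 2, 2)]
```

Twelve coefficients are reduced to four triples, and decoding with the original length restores the input exactly, so this stage by itself introduces no loss.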
In much the same way that it does not require a particular implementation of the discrete cosine transform, the JPEG standard does not require a particular implementation of the coding scheme. In JPEG standard compression, the number of preceding zeros and the number of bits needed to describe the coefficient value form a pair which is associated with a code word assigned through a variable length code such as Huffman or arithmetic coding. The value of the coefficient is also associated with a code word defined in a variable length code, and it is these code words that make up the variable length data symbols of the compressed video bit stream.
The JPEG standard is only one of many compression schemes that use variable length encoding. The so-called MPEG (Moving Picture Experts Group of the International Electrotechnical Commission (IEC)) standards (MPEG-1 and MPEG-2) also utilize variable length encoding as well as many other aspects of JPEG encoding. The Digital Video (DV) standard also uses variable length encoding.
JPEG, DV, MPEG-1, and MPEG-2 share a common problem: the difficulty of implementing an efficient and practicable way of decoding the compressed video when it is received or retrieved. In contemporary high performance graphics computer systems such as those manufactured by Silicon Graphics, there is a need for a system that is capable of decompressing variable length encoded video bit streams without requiring excessively large look up tables and/or an excessive number of table accesses.
The size of look up tables is constrained because excessively large look up tables preclude implementation in an ASIC (Application Specific Integrated Circuit), which is necessary to provide optimum performance and maximum flexibility. However, in the variable length codes used in JPEG, MPEG, or DV compression schemes, the code symbols are typically two to sixteen bits in length. Accordingly, the look up tables potentially have 64K entries (i.e., 2^16 or 65,536, corresponding to 16 bits). This problem is compounded when color images are compressed because, depending upon the color space, multiple tables are typically required to account for the different components. For example, in a luminance/chrominance color space such as YCbCr, two tables are required, one for luminance and one for chrominance. Thus, 128K table entries are required, each table entry being 16 bits long. This situation is further exacerbated by the use of separate tables for the two different types of discrete cosine transform spatial frequency coefficients (DC and AC, as discussed below), which can substantially increase the number of tables and table entries required.
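The table sizes quoted above follow from simple arithmetic, which can be checked with the short sketch below. The figure of four tables (separate DC and AC tables for each of luminance and chrominance) is an assumption used only to illustrate how the total grows; the constant and function names are hypothetical.

```python
MAX_CODE_BITS = 16   # longest code symbol, per the text above
ENTRY_BITS = 16      # width of each table entry, per the text above

ENTRIES_PER_TABLE = 2 ** MAX_CODE_BITS  # one entry per possible 16-bit pattern


def table_cost(num_tables):
    """Return (total entries, total bytes) for direct-indexed look up tables."""
    entries = num_tables * ENTRIES_PER_TABLE
    return entries, entries * ENTRY_BITS // 8


for label, count in [("single table", 1),
                     ("luminance + chrominance", 2),
                     ("DC and AC for each (assumed)", 4)]:
    entries, size = table_cost(count)
    print(f"{label:30s} {entries:7d} entries  {size // 1024:4d} KiB")
```

With two tables the total is 131,072 entries (128K) occupying 256 KiB, and under the assumed four-table arrangement it doubles again to 262,144 entries and 512 KiB.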
The number of times the look up tables can be accessed to decode each variable length encoded code word is limited by the need to perform video decoding quickly enough to support high data rate bit streams. In one prior art embodiment, small look up tables are used to ameliorate the problem of impracticably large tables, with the small look up tables arranged in a branched tree data structure and pointer logic used to keep track of the decoded value during video decoding. However, as many as four separate table look up operations are required to decode a single variable length encoded symbol, and the additional overhead required for pointer maintenance further impedes optimal decoding of the compressed video data.
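A branched tree of small tables of the general kind described above can be sketched as follows. This is not a reconstruction of the prior art embodiment itself: the choice of four bits per table (so that a 16-bit code needs at most four look ups), the toy prefix code, and all names are assumptions made for illustration, and the input is assumed to be a valid prefix code.

```python
CHUNK_BITS = 4  # bits consumed per table look up (assumed)


def build_tables(codes):
    """Build a tree of 16-entry tables from {symbol: bit string}.

    Each entry is None, ("next", table_index) or ("leaf", symbol, bits_used).
    """
    tables = [[None] * (1 << CHUNK_BITS)]  # table 0 is the root
    for symbol, code in codes.items():
        t = 0
        while len(code) > CHUNK_BITS:
            idx = int(code[:CHUNK_BITS], 2)
            if tables[t][idx] is None:
                tables.append([None] * (1 << CHUNK_BITS))
                tables[t][idx] = ("next", len(tables) - 1)
            t = tables[t][idx][1]          # follow the pointer to the child table
            code = code[CHUNK_BITS:]
        pad = CHUNK_BITS - len(code)
        base = int(code, 2) << pad
        for fill in range(1 << pad):       # every slot sharing this prefix
            tables[t][base + fill] = ("leaf", symbol, len(code))
    return tables


def decode(bits, tables):
    """Decode a bit string; return (symbols, look ups needed per symbol)."""
    symbols, lookups = [], []
    pos = 0
    while pos < len(bits):
        t, count = 0, 0
        while True:
            chunk = bits[pos:pos + CHUNK_BITS].ljust(CHUNK_BITS, "0")
            entry = tables[t][int(chunk, 2)]
            count += 1
            if entry is None:
                raise ValueError("invalid code word")
            if entry[0] == "leaf":
                symbols.append(entry[1])
                pos += entry[2]
                break
            t = entry[1]                   # pointer maintenance between look ups
            pos += CHUNK_BITS
        lookups.append(count)
    return symbols, lookups


# Toy prefix code with code word lengths of 1, 2, 3, 5, 9 and 16 bits.
codes = {"a": "0", "b": "10", "c": "110", "d": "11100",
         "e": "111010001", "f": "1111000011110000"}
tables = build_tables(codes)
stream = "".join(codes[s] for s in "abcdef")
print(decode(stream, tables))
print(len(tables) * (1 << CHUNK_BITS), "entries instead of", 2 ** 16)
```

In this toy case six 16-entry tables (96 entries) replace a single 65,536-entry table, but the 16-bit code word costs four look ups and three pointer updates, which is the trade-off the paragraph above describes.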