The present invention relates to the recovery of compressed digital data, and more particularly to apparatus for decoding variable length code words.
Television signals are conventionally transmitted in analog form according to various standards adopted by particular countries. For example, the United States has adopted the standards of the National Television System Committee ("NTSC"). Most European countries have adopted either PAL (Phase Alternating Line) or SECAM standards.
Digital transmission of television signals can deliver video and audio services of much higher quality than analog techniques. Digital transmission schemes are particularly advantageous for signals that are broadcast by satellite to cable television affiliates and/or directly to home satellite television receivers. It is expected that digital television transmitter and receiver systems will replace existing analog systems just as digital compact discs have largely replaced analog phonograph records in the audio industry.
A substantial amount of digital data must be transmitted in any digital television system. This is particularly true where high definition television ("HDTV") is provided. In a digital television system, a subscriber receives the digital data stream via a receiver/descrambler that provides video, audio, and data to the subscriber. In order to most efficiently use the available radio frequency spectrum, it is advantageous to compress the digital television signals to minimize the amount of data that must be transmitted.
The video portion of a television signal comprises a sequence of video "frames" that together provide a moving picture. In digital television systems, each line of a video frame is defined by a sequence of digital data elements referred to as "pixels." A large amount of data is required to define each video frame of a television signal. For example, approximately 7.4 megabits of data are required to provide one video frame at NTSC resolution. This assumes a 640 pixel by 480 line display is used with 8 bits of intensity value for each of the primary colors red, green and blue. High definition television requires substantially more data to provide each video frame. In order to manage this amount of data, particularly for HDTV applications, the data must be compressed.
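The frame-size figure quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch; the function name `frame_bits` and its parameters are hypothetical and are not part of the described system.

```python
def frame_bits(width, height, bits_per_component=8, components=3):
    """Raw (uncompressed) size of one video frame, in bits."""
    return width * height * bits_per_component * components


# 640 pixels by 480 lines, 8 bits each for red, green and blue.
bits = frame_bits(640, 480)
print(bits)                   # 7372800
print(round(bits / 1e6, 1))   # 7.4 (megabits)
```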
Video compression techniques enable the efficient transmission of digital video signals over conventional communication channels. Such techniques use compression algorithms that take advantage of the correlation among adjacent pixels in order to derive a more efficient representation of the important information in a video signal. The most powerful compression systems not only take advantage of spatial correlation, but can also utilize similarities among adjacent frames to further compact the data. In such systems, differential encoding is used to transmit only the difference between an actual frame and a prediction of the actual frame. The prediction is based on information derived from a previous frame of the same video sequence. Examples of such systems can be found in U.S. Pat. No. 5,068,724 entitled "Adaptive Motion Compensation for Digital Television" and U.S. Pat. No. 5,057,916 entitled "Method and Apparatus for Refreshing Motion Compensated Sequential Video Images."
Motion estimation of a video signal is provided by comparing the current luminance block with the luminance blocks in the previous frame within a specified tracking range. The previous frame luminance block with the minimum total absolute change compared to the current block is chosen. The position of the chosen block is called the motion vector, which is used to obtain the predicted values of the current block. For additional coding efficiency, the motion vectors can be differentially encoded and processed by a variable length encoder for transmission as side information to a decoder. A low pass filter may be provided in the DPCM loop for the purpose of smoothing out the predicted values as necessary. In order to protect the coded bitstream from various kinds of random noise, a forward error correction scheme can be used.
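The block-matching search described above can be sketched as follows. This is a minimal illustration, not the implementation of any cited system: it assumes frames are stored as lists of rows of luminance values, uses the sum of absolute differences as the "total absolute change," and skips candidate blocks that fall outside the previous frame. All function names are hypothetical.

```python
def sad(block_a, block_b):
    """Total absolute change between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    """Extract a size-by-size block whose upper left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def find_motion_vector(current, previous, top, left, size, search_range):
    """Return ((dy, dx), cost) for the best match within the tracking range."""
    target = get_block(current, top, left, size)
    height, width = len(previous), len(previous[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Assumption: candidates outside the previous frame are ignored.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, get_block(previous, y, x, size))
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


# Example: the current frame is the previous frame moved down 1 and right 2.
previous = [[r * 8 + c for c in range(8)] for r in range(8)]
current = [[previous[(r - 1) % 8][(c - 2) % 8] for c in range(8)]
           for r in range(8)]
print(find_motion_vector(current, previous, top=3, left=3, size=2,
                         search_range=2))  # ((-1, -2), 0)
```

An exhaustive search of this kind costs one block comparison per candidate position, which is why the tracking range is kept limited in practice.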
There are two major categories of coding schemes for compressing the data rate by removing redundant information. These are "source coding" and "entropy coding." Source coding deals with source material and yields results that are lossy. Thus, picture quality is degraded when source coding is used. In implementing source coding techniques, either intraframe or interframe coding can be used. Intraframe coding is used for the first picture and for later pictures after a change of scene. Interframe coding is used for sequences of pictures containing moving objects. Entropy coding achieves compression by using the statistical properties of the signals and is, in theory, lossless.
A coding algorithm that uses both source coding and entropy coding has been proposed by the CCITT Specialist Group. See, e.g., "Description of Reference Model 8 (RM8)," Doc. No. 525, CCITT SG XV Working Party XV-4, Specialist Group on Coding for Visual Telephony, June, 1989. In the CCITT scheme, hybrid transform/differential pulse code modulation (DPCM) coding with motion estimation is used for source coding. The DPCM is not operative for intraframe coding. For entropy coding, both one- and two-dimensional variable length codings are used.
The discrete cosine transform (DCT) described by N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete Cosine Transform," IEEE Trans. Computers, Vol. C-23, pp. 90-93, January 1974, is used in the CCITT system to convert the input data, which is divided into macroblocks and sub-blocks, into transform coefficients. The DCT is performed on the difference between blocks of current frame data and corresponding blocks of a predicted frame (which is obtained from the previous frame information). If a video block contains no motion or the predicted value is exact, the input to the DCT will be a null matrix. For slowly moving pictures, the input matrix to the DCT will contain many zeros. The output of the DCT is a matrix of coefficients which represent energy in the two-dimensional frequency domain. In general, most of the energy is concentrated at the upper left corner of the matrix, which is the low frequency region. If the coefficients are scanned in a zigzag manner, the resultant sequence will contain long strings of zeros, especially toward the end of the sequence. One of the major objectives of this compression algorithm is to create zeros and to bunch them together for efficient coding.
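The effect of zigzag scanning can be illustrated with a short sketch. The exact scan pattern is not given above; the code assumes the common pattern that starts at the upper left corner and walks each anti-diagonal in alternating directions. The function names and the 4 by 4 example block are hypothetical.

```python
def zigzag_order(n):
    """Positions (row, col) of an n-by-n block in zigzag scan order."""
    # Positions are grouped by anti-diagonal (row + col). Odd diagonals are
    # walked top to bottom, even diagonals bottom to top.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square coefficient block into a zigzag-ordered sequence."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Energy concentrated at the upper left corner (low frequency region).
block = [
    [12, 5, 0, 0],
    [4,  0, 0, 0],
    [1,  0, 0, 0],
    [0,  0, 0, 0],
]
print(zigzag_scan(block))
# [12, 5, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The non-zero coefficients end up at the front of the sequence and the zeros are bunched together at the end, which is the property the subsequent coding stages exploit.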
To maintain efficiency, a variable threshold is also applied to the coefficient sequence before quantization. This is accomplished by increasing the DCT threshold when a string of zeros is detected. A DCT coefficient is set to zero if it is less than or equal to the threshold.
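A variable threshold of this kind might be sketched as below. The precise update rule is not stated above, so the sketch makes several assumptions: the threshold grows by a fixed increment for each coefficient that is zeroed, is limited by a cap, resets to its base value when a coefficient survives, and is compared against the coefficient magnitude. The function name and parameters are hypothetical.

```python
def variable_threshold(coeffs, base, step=1, cap=None):
    """Zero small coefficients, raising the threshold during runs of zeros."""
    cap = 2 * base if cap is None else cap  # assumed upper limit
    threshold = base
    out = []
    for c in coeffs:
        if abs(c) <= threshold:
            out.append(0)
            # Inside a string of zeros: make the next coefficient
            # more likely to be zeroed as well.
            threshold = min(threshold + step, cap)
        else:
            out.append(c)
            threshold = base  # a surviving coefficient ends the run
    return out


print(variable_threshold([10, 3, 4, 5, 9, 4], base=3))
# [10, 0, 0, 0, 9, 4]
```

In this example the values 4 and 5 in the middle of the sequence exceed the base threshold of 3 but are still zeroed, because the threshold has risen during the run. The final 4 survives because the threshold was reset by the preceding 9.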
A uniform quantizer is used after the transform. The step size of the quantizer can be adjusted by the transmission rate as indicated by the occupancy of a buffer. When the transmission rate reaches its limit, the step size will be increased so that less information needs to be coded. When this occurs, a degraded picture will result. On the other hand, picture quality will be improved by decreasing the step size when the transmission rate is below its limit.
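The interaction between the quantizer and the buffer can be sketched as follows. This is an assumption-laden illustration: the occupancy thresholds, the step limits, the one-unit step adjustment and the truncating quantizer are all hypothetical choices, not values taken from the CCITT scheme.

```python
def quantize(coeffs, step):
    """Uniform quantizer: level index of each coefficient, truncated toward zero."""
    return [(abs(c) // step) * (1 if c >= 0 else -1) for c in coeffs]


def adjust_step(step, occupancy, high=0.75, low=0.25, min_step=1, max_step=31):
    """Adapt the step size to buffer occupancy (a fraction between 0 and 1)."""
    if occupancy > high:
        return min(step + 1, max_step)  # buffer filling: coarser, fewer bits
    if occupancy < low:
        return max(step - 1, min_step)  # buffer draining: finer, better picture
    return step


coeffs = [37, -9, 4, 0]
print(quantize(coeffs, 4))   # [9, -2, 1, 0]
print(quantize(coeffs, 8))   # [4, -1, 0, 0]
print(adjust_step(8, 0.9))   # 9
print(adjust_step(8, 0.1))   # 7
```

A larger step size maps more coefficients to zero and to smaller level indices, which reduces the amount of information to be coded at the cost of picture quality.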
To further increase coding efficiency, a two-dimensional variable length coding scheme is used for the sequences of quantized DCT coefficients. In a given sequence, the value of a non-zero coefficient (amplitude) is defined as one dimension and the number of zeros preceding the non-zero coefficient (runlength) is defined as another dimension. The combination of amplitude and runlength is defined as an "event."
A shorter length code is assigned to an event which occurs more frequently. Conversely, infrequent events receive longer length codes. An EOB (end of block) marker is provided to indicate that there are no more non-zero coefficients in the sequence.
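The conversion of a scanned coefficient sequence into events terminated by an EOB marker can be sketched as follows. The representation of an event as a (runlength, amplitude) tuple and the names `to_events` and `from_events` are hypothetical; the assignment of actual code words to events is not shown.

```python
EOB = "EOB"  # end of block marker


def to_events(sequence):
    """Convert a coefficient sequence into (runlength, amplitude) events."""
    events = []
    run = 0
    for value in sequence:
        if value == 0:
            run += 1              # count zeros preceding the next non-zero value
        else:
            events.append((run, value))
            run = 0
    events.append(EOB)            # trailing zeros are implied by the marker
    return events


def from_events(events, length):
    """Rebuild a coefficient sequence of the given length from its events."""
    out = []
    for event in events:
        if event == EOB:
            break
        run, amplitude = event
        out.extend([0] * run)
        out.append(amplitude)
    out.extend([0] * (length - len(out)))
    return out


sequence = [12, 5, 0, 0, 4, 0, 1] + [0] * 9
events = to_events(sequence)
print(events)  # [(0, 12), (0, 5), (2, 4), (1, 1), 'EOB']
print(from_events(events, len(sequence)) == sequence)  # True
```

Sixteen coefficients are represented here by four events and a marker, and the long string of trailing zeros costs nothing beyond the EOB itself.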
The coded coefficient values are multiplexed together with various side information such as block classification, quantization information, and differential motion vectors. Some of the side information may also be variable length coded. The resultant bitstream is sent to a buffer for transmission.
At a receiver, a variable length decoder is necessary to perform the inverse operation of the encoder and recover the transform coefficients. Although the architecture of the decoder is in general much simpler than that of the encoder, prior art decoders require substantial amounts of memory in order to store a code book that is required to convert the received code words back into the transform coefficients from which they were derived at the transmitter.
Variable length codes have been proposed in which no code word is the prefix of any other code word. This guarantees unique decodability of an incoming data stream. Compression is achieved when events that occur much more frequently than others are assigned the shortest code words. In the proposed CCITT video coding algorithm, the event amplitude can take any of 256 values and the runlength any of 64 values. A straightforward implementation of such a system would require a variable length code table having more than 16,000 entries (256 × 64 = 16,384). However, since more than 99% of the entries are statistically improbable, they can be represented by a 6-bit escape code followed by a 14-bit fixed length field, in which six bits are provided for runlength and eight bits are provided for amplitude. The resulting variable length code table contains only 128 entries, which is much easier to process. Indeed, the coding and decoding of such a variable length code can be accomplished using a lookup table stored in read only memory (ROM).
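The combination of a small prefix-free table with an escape mechanism can be sketched as follows. The code words in `CODE_TABLE` are hypothetical and chosen only so that no code word is a prefix of another; they are not the CCITT table. The sketch also assumes the bitstream is held as a string of "0" and "1" characters and returns the 8-bit amplitude field as a raw unsigned value, since its interpretation is not specified above.

```python
ESCAPE = "ESC"
EOB = "EOB"

# Hypothetical prefix-free table: code word -> (runlength, amplitude) event.
CODE_TABLE = {
    "10": EOB,
    "11": (0, 1),
    "011": (1, 1),
    "0100": (0, 2),
    "000001": ESCAPE,  # 6-bit escape code
}
MAX_CODE_LENGTH = max(len(code) for code in CODE_TABLE)


def decode(bits):
    """Decode a bit string into events, stopping after the EOB marker."""
    events = []
    pos = 0
    while pos < len(bits):
        # The code word length is unknown in advance: try successively
        # longer prefixes until one matches a table entry.
        for length in range(1, MAX_CODE_LENGTH + 1):
            symbol = CODE_TABLE.get(bits[pos:pos + length])
            if symbol is not None:
                pos += length
                break
        else:
            raise ValueError("no code word matches at bit %d" % pos)
        if symbol == ESCAPE:
            field = bits[pos:pos + 14]
            if len(field) < 14:
                raise ValueError("truncated escape field")
            # 6 bits of runlength followed by 8 bits of amplitude.
            symbol = (int(field[:6], 2), int(field[6:], 2))
            pos += 14
        events.append(symbol)
        if symbol == EOB:
            break
    return events


# (0,1), (1,1), escape-coded (5,42), then EOB.
stream = "11" + "011" + "000001" + "000101" + "00101010" + "10"
print(decode(stream))  # [(0, 1), (1, 1), (5, 42), 'EOB']
```

An escaped event always occupies 6 + 14 = 20 bits, so the rare events cost more bits each, but the table that must be stored shrinks from 16,384 entries to a small fraction of that size.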
Decoding in such a scheme is somewhat complicated by the fact that the length of the variable length code must be determined before it can be decoded. Several techniques for variable length code decoding have been proposed in the past. See, e.g., U.S. Pat. No. 3,701,111 to Cocke et al., entitled "Method and Apparatus for Decoding Variable-Length Codes Having Length-Indicating Prefixes," and M. T. Sun, K. M. Yang, and K. H. Tzou, "High-Speed Programmable ICs for Decoding of Variable-Length Codes," Applications of Digital Image Processing XII, Andrew Tescher, Ed., Proc. SPIE Vol. 1153, August 1989. The latter article proposes a parallel approach employing a barrel shifter and programmable logic arrays (PLA) or content addressable memory/random access memory modules (CAM/RAM) for very large scale integration (VLSI) implementation of a variable length decoder.
The decoders proposed in the prior art have only been realizable in software simulations or using large amounts of discrete hardware components due to the high speed nature of real time digital video decompression. It would be advantageous to provide a variable length decoder that has the ability to process code words at real time video rates. It would be further advantageous to provide such a decoder that can be readily implemented in integrated circuit form. Still further, it would be advantageous to provide such a decoder that consumes only a small amount of power. Such a decoder would be particularly useful for consumer use, such as in a low cost high definition television receiver.
The present invention provides a variable length decoder having the aforementioned and other advantages.