The present invention relates to the communication of digital video signals, and more particularly to a method and apparatus for implementing an inverse discrete cosine transform (IDCT) processor to recover video data from transform coefficients.
Television signals are conventionally transmitted in analog form according to various standards adopted by particular countries. For example, the United States has adopted the standards of the National Television System Committee ("NTSC"). Most European countries have adopted either PAL (Phase Alternating Line) or SECAM (Sequential Color And Memory) standards.
Digital transmission of television signals can deliver video and audio services of much higher quality than analog techniques. Digital transmission schemes are particularly advantageous for signals that are broadcast by satellite to cable television affiliates and/or directly to home satellite television receivers. It is expected that digital television transmitter and receiver systems will replace existing analog systems just as digital compact discs have largely replaced analog phonograph records in the audio industry.
A substantial amount of digital data must be transmitted in any digital television system. This is particularly true where high definition television ("HDTV") is provided. In a digital television system, a subscriber receives the digital data stream via a receiver/descrambler that provides video, audio, and data to the subscriber. In order to most efficiently use the available radio frequency spectrum, it is advantageous to compress the digital television signals to minimize the amount of data that must be transmitted.
The video portion of a television signal comprises a sequence of video "frames" that together provide a moving picture. In digital television systems, each line of a video frame is defined by a sequence of digital data referred to as "pixels." A large amount of data is required to define each video frame of a television signal. For example, 7.4 megabits of data is required to provide one video frame at NTSC resolution. This assumes a 640 pixel by 480 line display is used with 8 bits of intensity value for each of the primary colors red, green and blue. High definition television requires substantially more data to provide each video frame. In order to manage this amount of data, particularly for HDTV applications, the data must be compressed.
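The frame-size figure quoted above can be checked with a short calculation. The following is only an illustrative sketch; the function name `frame_bits` and its parameters are assumptions introduced here, not part of the described system.

```python
def frame_bits(width, height, bits_per_component, components=3):
    """Uncompressed size of one frame in bits (hypothetical helper)."""
    return width * height * bits_per_component * components

# 640 pixels by 480 lines, 8 bits each for red, green and blue
bits = frame_bits(640, 480, 8)
megabits = round(bits / 1e6, 1)
print(bits, megabits)  # 7372800 7.4
```

The result, 7,372,800 bits, rounds to the 7.4 megabits stated for a single frame at this resolution.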
Video compression techniques enable the efficient transmission of digital video signals over conventional communication channels. Such techniques use compression algorithms that take advantage of the correlation among adjacent pixels in order to derive a more efficient representation of the important information in a video signal.
One of the most effective and frequently used classes of algorithms for video compression is referred to as "transform coders." In such systems, blocks of video are linearly and successively transformed into a new domain with properties significantly different from the image intensity domain. The blocks may be nonoverlapping, as in the case of the discrete cosine transform (DCT), or overlapping as in the case of the lapped orthogonal transform (LOT). Systems using the DCT are described in Chen and Pratt, "Scene Adaptive Coder," IEEE Transactions on Communications, Vol. COM-32, No. 3, March 1984, and in U.S. Pat. No. 4,791,598 entitled "Two-Dimensional Discrete Cosine Transform Processor" to Liou, et al., issued Dec. 13, 1988. A system using the LOT is described in Malvar and Staelin, "The LOT: Transform Coding Without Blocking Effects," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, No. 3, April 1989.
Video transforms are used to reduce the correlation that exists among samples of image intensity (pixels). Thus, these transforms concentrate the energy into a relatively small number of transform coefficients. Most common transforms have properties that easily permit the quantization of coefficients based on a model of the human visual system. For example, the DCT produces coefficients with amplitudes that are representative of the energy in a particular band of the frequency spectrum. Therefore, it is possible to utilize the fact that the human viewer is more critical of errors in the low frequency regions of an image than in the high frequency or detailed areas. In general, the high frequency coefficients are quantized more coarsely than the low frequency coefficients.
The output of the DCT is a matrix of coefficients which represent energy in the two-dimensional frequency domain. Most of the energy is concentrated at the upper left corner of the matrix, which is the low frequency region. If the coefficients are scanned in a zigzag manner, starting in the upper left corner, the resultant sequence will contain long strings of zeros, especially toward the end of the sequence. One of the major objectives of the DCT compression algorithm is to create zeros and to bunch them together for efficient coding.
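The energy compaction and zigzag scan described above can be illustrated with a minimal sketch. It assumes an orthonormal 8×8 DCT, a smooth test block, and hypothetical quantizer step sizes that grow with frequency; none of these values are taken from the cited systems, and all function names are introduced here for illustration only.

```python
import math

N = 8  # block size; 8x8 is assumed for this sketch

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2(block):
    """Two-dimensional DCT computed as C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def zigzag_order(n=N):
    """(row, col) pairs visited along anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order

def quantize(coeffs):
    """Hypothetical quantizer: step 16 at DC, 4 larger per frequency index."""
    n = len(coeffs)
    return [[round(coeffs[u][v] / (16 + 4 * (u + v))) for v in range(n)]
            for u in range(n)]

# Smooth test block (a gentle intensity ramp), chosen for illustration
block = [[100 + 2 * r + 3 * c for c in range(N)] for r in range(N)]
quantized = quantize(dct2(block))
scanned = [quantized[r][c] for r, c in zigzag_order()]
print(scanned)
```

For a smooth block such as this one, the nonzero values cluster at the start of the scanned sequence and the remainder is a single long run of zeros, which is the property the subsequent coding stage exploits.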
Coarse quantization of the high frequency coefficients and the reduced number of nonzero coefficients greatly improves the compressibility of an image. Simple statistical coding techniques can then be used to efficiently represent the remaining information. This usually involves the use of variable length code words to convey the amplitude of the coefficients that are retained. The smaller amplitudes, which occur the most frequently, are assigned short code words. The less probable large amplitudes are assigned long code words. Huffman coding and arithmetic coding are two frequently used methods of statistical coding. Huffman coding is used in the system of Chen and Pratt referred to above. Arithmetic coding is described in Langdon, "An Introduction to Arithmetic Coding," IBM Journal of Research and Development, Vol. 28, No. 2, March 1984.
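The assignment of short code words to frequent amplitudes can be sketched with a basic Huffman code construction. This is a generic textbook construction, not the specific code tables of Chen and Pratt; the sample amplitude counts and the name `huffman_code` are assumptions made for illustration.

```python
import heapq

def huffman_code(freqs):
    """Build a prefix code {symbol: bitstring} from {symbol: count}."""
    # Heap entries are (weight, tiebreak, partial code table); the unique
    # integer tiebreak keeps the dictionaries from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:  # a lone symbol still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)  # two least probable subtrees
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical counts of quantized amplitudes in one 64-coefficient block
counts = {0: 40, 1: 12, -1: 8, 2: 3, 7: 1}
code = huffman_code(counts)
total_bits = sum(counts[s] * len(code[s]) for s in counts)
print(code, total_bits)
```

With these assumed counts, the most frequent amplitude (zero) receives the shortest code word and the rare large amplitude receives one of the longest, so the block costs fewer bits than a fixed-length code would require.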
In order to reconstruct a video signal from a stream of transmitted coefficients, it is necessary to perform the inverse of the transform (e.g., DCT) that was used to encode the signals. Typically, the transform coefficients are communicated in n×n blocks of coefficients, such as 8×8 or 16×16 blocks. In order to build a practical system, it is advantageous to implement the IDCT processor on an integrated circuit chip, such as a very large scale integration (VLSI) design. Ideally, the VLSI design will calculate the IDCT quickly, accurately and with minimal hardware. In reality, the size of the VLSI hardware increases as the speed and accuracy of the IDCT circuit increase. Thus, trade-offs must be made to provide a compact VLSI design that provides sufficient speed and accuracy.
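The relationship between the forward and inverse transforms can be shown in software with a minimal floating-point sketch. It assumes an orthonormal DCT, so that the inverse is obtained by transposing the basis matrix; it is not a model of any VLSI implementation, and the function names and test block are illustrative assumptions.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2(block):
    """Forward transform: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse transform: C^T * coeffs * C (C is orthogonal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

# Arbitrary 8x8 block of pixel-like values, chosen for illustration
block = [[(37 * r + 11 * c * c) % 256 for c in range(8)] for r in range(8)]
restored = idct2(dct2(block))
max_error = max(abs(restored[r][c] - block[r][c])
                for r in range(8) for c in range(8))
print(max_error)
```

In floating point the round trip reproduces the block to within rounding error. A hardware design works with finite word lengths, which is where the trade-off between accuracy and circuit size noted above arises.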
Previously noted U.S. Pat. No. 4,791,598 discloses a DCT processor that can be used as part of a video bandwidth or image compression system. A first one-dimensional DCT processor simultaneously computes an entire row or column of vector inner products by using distributed arithmetic and decimation-in-frequency to reduce the amount of memory capacity required. Partial sums are used to further reduce the memory size. The one-dimensional transformed matrix from the first processor is stored in a transposition memory and the transpose of the stored matrix is applied to a second one-dimensional DCT processor of similar circuitry which computes the desired two-dimensional DCT of the input data matrix. The DCT processor can be implemented on a single chip.
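The row-column structure described above (a one-dimensional transform, a transposition, and a second one-dimensional transform) can be sketched as follows. The sketch covers only that decomposition in floating point; it does not model the distributed arithmetic, decimation-in-frequency, or partial-sum techniques of the cited patent, and all names are assumptions for illustration.

```python
import math

def dct1d(vec):
    """Orthonormal one-dimensional DCT-II of a sequence."""
    n = len(vec)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                               for i, x in enumerate(vec)))
    return out

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2_row_column(block):
    """2-D DCT as two 1-D passes separated by a transposition."""
    first_pass = [dct1d(row) for row in block]                   # 1-D pass on rows
    second_pass = [dct1d(row) for row in transpose(first_pass)]  # transpose, 1-D pass
    return transpose(second_pass)                                # restore orientation

def dct2_direct(block):
    """2-D DCT evaluated directly from its double-sum definition."""
    n = len(block)
    def scale(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / n)
    def cos_term(i, k):
        return math.cos((2 * i + 1) * k * math.pi / (2 * n))
    return [[scale(u) * scale(v) * sum(block[r][c] * cos_term(r, u) * cos_term(c, v)
                                       for r in range(n) for c in range(n))
             for v in range(n)] for u in range(n)]

# Arbitrary 8x8 test block, chosen for illustration
block = [[(29 * r + 7 * c * c) % 256 for c in range(8)] for r in range(8)]
separable = dct2_row_column(block)
direct = dct2_direct(block)
max_diff = max(abs(separable[u][v] - direct[u][v])
               for u in range(8) for v in range(8))
print(max_diff)
```

The two results agree to within rounding error, which shows why a two-dimensional transform can be built from one-dimensional stages and a transposition memory.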
A disadvantage of the DCT processor disclosed in U.S. Pat. No. 4,791,598 is that it requires two separate one-dimensional DCT processors. The DCT processor disclosed in the patent also processes coefficient data only one bit at a time, rendering real-time processing difficult.
It would be advantageous to provide an inverse discrete cosine transform processor that provides real-time operation and can be implemented in a straightforward manner in VLSI. It would be further advantageous to provide an IDCT implementation in which a plurality of bits from each coefficient are processed during each clock cycle, to facilitate the throughput of data and enable real-time operation with a reasonable hardware size. It would be still further advantageous to implement the IDCT processor using bit serial arithmetic.
The present invention provides an IDCT processor having the aforementioned advantages.