This invention relates to video signal transmission techniques known as transform coding and sub-band coding. More particularly, the present invention relates to a particular coding scheme that uses transform coding, which can be implemented using a very simple sub-band structure with low hardware complexity.
The Broadband Integrated Services Digital Network (BISDN) is a new concept for exchange area communications. Based on lightwave technology, high-speed circuit and packet switching, and intelligent networking, BISDN will integrate various services such as voice, data, and video under the same umbrella as POTS (Plain Old Telephone Service). Initially, video is likely to be one of the more attractive services and can therefore be expected to present a significant portion of the traffic to the BISDN network. Conversely, high quality video signals represent the service most in need of the broad bandwidth of BISDN. For these reasons, the digital coding of video signals in a form suitable for transmission through the proposed BISDN is of great interest and importance.
The coding of high-definition television (HDTV) signals for transmission through the BISDN is particularly important since the emerging fiber optic network may provide the most viable means for delivering very high image quality and avoiding transmission impairments. The digital coding, transmission, and switching of video signals will offer considerable advantages over conventional analog distribution in terms of image quality, control, and the variety of services that can be offered to the customer. In the network, digitally encoded TV signals are robust, i.e., they are not susceptible to the gradual increase of distortion and noise that typically accumulates during the various stages of transmission, multiplexing, and switching of conventional analog-encoded signals. Also, digital video signals are much easier to integrate (switching, multiplexing, etc.) with other digital services and they can be readily encrypted to provide security or control. Digital codecs (coder/decoders) can be designed to take advantage of VLSI (very large scale integrated) circuits to realize the digital signal processing required to remove redundancy from the signal and thus reduce the bit rate to the available line rate.
Various image compression techniques have been investigated for removing the inherent redundancy in video signals. One commonly used technique uses the decomposition of the signal into different frequency components/bands for separate encoding. Because the human visual system (HVS) is less sensitive to high band errors, it is possible to code these more coarsely, thereby achieving a coding gain. Many coding techniques that decompose the signal into multiple bands have been proposed and they have proved to be more effective than straightforward one-band coding like DPCM. Examples are transform coding (see e.g. N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Comput., vol. C-23, pp. 90-93, January 1974; W. H. Chen, C. H. Smith, and S. C. Fralick, "A fast computational algorithm for the discrete cosine transform," IEEE Trans. Commun., vol. COM-25, pp. 1004-1009, September 1977; and K. H. Tzou, T. C. Chen, P. E. Fleischer and M. L. Liou, "Compatible HDTV coding for broadband ISDN," IEEE Globecom, Florida, December 1988), sub-band coding (see e.g. H. Gharavi and A. Tabatabai, "Sub-band coding of monochrome and color images," IEEE Trans. Circuits Systems, vol. CAS-35, pp. 207-214, February 1988; D. Le Gall, H. Gaggioni and C. T. Chen, "Transmission of HDTV signals under 140 Mbits/s using a sub-band decomposition and discrete cosine transform coding," Proc. 2nd Int. Workshop on Signal Processing of HDTV, 29 Feb.-2 Mar. 1988, L'Aquila Italy; and D. Le Gall and A. Tabatabai, "Sub-band coding of digital images using symmetric short kernel filters and arithmetic coding techniques," Proc. ICASSP-88, pp. 761-764, New York, N.Y., Apr. 11-14, 1988), and pyramid coding (see e.g. P. J. Burt and E. H. Adelson, "The Laplacian pyramid as a compact image code," IEEE Trans. Comm., vol. COM-31, pp. 532-540, April 1983; and T. C. Chen, K. H. Tzou and P. E. Fleischer, "A hierarchical HDTV coding system using a DPCM-PCM approach," SPIE Visual Comm. Image Proc., vol. 1001, Cambridge, Mass., November 1988). Some basic literature related to transform coding and sub-band coding can also be found in W. K. Pratt, "Digital image processing," New York: Wiley, 1978; W. H. Chen and W. K. Pratt, "Scene adaptive coder," IEEE Trans. Commun., vol. 32, no. 3, 1984; and M. J. T. Smith and T. P. Barnwell, "Exact reconstruction techniques for tree-structured subband coders," IEEE Trans. ASSP, vol. ASSP-34, no. 3, June 1986.
In general, transform coding and sub-band coding are alternative image compression techniques. In transform coding, blocks of pixel information are transformed using a prescribed signal transformation. In particular, each input block is transformed into an equal-sized block of transform coefficients by using algorithms that combine the pixel magnitude information in the block in a prescribed manner. The resultant coefficients in the transformed blocks are quantized and otherwise compressed and then transmitted to a receiver where the received coefficients in each block are inversely transformed back to the pixel domain. Since the perceived video quality is more dependent on certain of the coefficients than others in each transformed block, coding efficiency is obtained by using coarser quantization schemes for the less significant coefficients. The size of the input blocks is another factor affecting coding efficiency in transform coding. It is generally agreed that transform coding with a larger block size decorrelates the signal better and thus generates a lower coding rate. A problem attending the use of large block sizes is that the resulting image quality is not consistent, with blocking and ringing artifacts often being visible. In terms of picture quality, it is preferable to have a smaller block-size transform so that quantizers can be designed to fit the signal better and to minimize the spread of quantization noise.
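The transform coding steps just described (forward transform of a block, coarser quantization of the less significant coefficients, inverse transform at the receiver) can be illustrated with a minimal one-dimensional sketch. It assumes an orthonormal discrete cosine transform on a single block of eight samples; the function names, the sample values, and the step sizes are hypothetical choices made for illustration and are not taken from any of the cited coders.

```python
import math

def _scale(k, n):
    # Normalization that makes the DCT-II orthonormal.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct(block):
    """Forward orthonormal DCT-II of a 1-D block of samples."""
    n = len(block)
    return [
        _scale(k, n) * sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(block)
        )
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct(): maps coefficients back to the sample domain."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]

def quantize(coeffs, steps):
    """Uniform quantization: one integer level per coefficient."""
    return [round(c / q) for c, q in zip(coeffs, steps)]

def dequantize(levels, steps):
    """Receiver side: scale the integer levels back to coefficient values."""
    return [level * q for level, q in zip(levels, steps)]

# Hypothetical block and step sizes: higher-frequency coefficients
# (later indices) get coarser steps than lower-frequency ones.
block = [52, 55, 61, 66, 70, 61, 64, 73]
steps = [1, 1, 2, 2, 4, 4, 8, 8]
levels = quantize(dct(block), steps)
reconstructed = idct(dequantize(levels, steps))
```

Because the transform in this sketch is orthonormal, the squared error in the reconstructed samples equals the squared quantization error in the coefficients, so the coarser steps on the later coefficients bound how much distortion each one can contribute.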
In sub-band coding, the input signal is decomposed into several bands that are separately decimated and coded for the purpose of transmission. Unlike transform coding, in which blocks of pixels must be stored and processed, sub-band coding is processed in real time. For reconstruction after transmission, the individual bands are decoded, interpolated, filtered and added in order to reproduce the original signal. For video signals, separable filter banks may be applied both horizontally and vertically. Each of the resulting bands may then be further decomposed. After decomposition, each band is encoded according to its own statistics. The higher frequency bands that least affect the human perception of video quality are quantized with coarser quantizing schemes than the lower frequency bands.
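The sub-band pipeline described above (decompose, decimate, code each band, then interpolate and add) can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the two bands are formed by averaging and differencing adjacent samples, which amounts to a two-tap filter pair followed by decimation by two, and the function names and step sizes are hypothetical. A video coder would apply such a split horizontally and vertically and could decompose the resulting bands further.

```python
def analyze(x):
    """Split an even-length signal into a low band and a high band.

    Each band has half as many samples as the input (decimation by 2).
    """
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high

def code_band(band, step):
    """Stand-in for band coding: uniform quantization with a given step."""
    return [round(v / step) * step for v in band]

def synthesize(low, high):
    """Interpolate the two bands back to full rate and add them."""
    out = []
    for lo, hi in zip(low, high):
        out.append(lo + hi)  # even-indexed sample
        out.append(lo - hi)  # odd-indexed sample
    return out

# Hypothetical signal and step sizes: the high band is quantized
# more coarsely than the low band.
signal = [10, 12, 14, 13, 9, 7, 8, 8]
low, high = analyze(signal)
decoded = synthesize(code_band(low, 1), code_band(high, 4))
```

Without quantization, `synthesize(*analyze(signal))` returns the input samples exactly; with quantization, each output sample is off by at most half the low-band step plus half the high-band step.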
The choice of the analysis and synthesis filter banks that are used to decompose and reconstruct the video signal is a very important part of the design of a sub-band coder. One approach is to use Quadrature Mirror Filters (QMFs) which, in the absence of channel and quantization noise, permit an alias-free and near-perfect reconstruction of the input signal (see e.g. D. Esteban and C. Galand, "HQMF: Halfband Quadrature Mirror Filters," Proc. 1981 Int. Conf. Acoust. Speech Signal Processing, pp. 220-223, April, 1981). A disadvantage of the QMF approach is that the resulting filters do not permit exact reconstruction of the original video signal, although amplitude distortion can be made small by using long, finite impulse response filters. Such long, multiple-tap filters are difficult to implement, particularly for high speed applications such as the coding of HDTV signals. Short kernel filters, on the other hand, allow simpler implementation, but their frequency responses do not have the sharp and symmetrical transition characteristics of QMF pairs. A class of short kernel symmetric analysis/synthesis filter banks with a perfect reconstruction capability is presented in the aforenoted reference by Le Gall and Tabatabai and in U.S. Pat. No. 4,829,378 issued June 8, 1988 to Le Gall. In that reference, a simple two-tap finite impulse response (FIR) filter is considered but rejected for practical use in video coding because the type of artifacts introduced by quantization noise makes the resulting picture visually unpleasant.
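The difference between exact and merely approximate reconstruction can be checked numerically for a two-channel filter bank. The sketch below assumes the standard alias-cancelling choice of synthesis filters, G0(z) = H1(-z) and G1(z) = -H0(-z), which is not spelled out in the text above. Under that assumption the overall transfer function is proportional to H0(z)H1(-z) - H1(z)H0(-z), and reconstruction is exact (up to gain and delay) only when this polynomial has a single nonzero coefficient. The function names and the three-tap example filter are hypothetical illustrations.

```python
def polymul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def modulate(h):
    """Coefficients of H(-z): negate the odd-indexed taps."""
    return [c if i % 2 == 0 else -c for i, c in enumerate(h)]

def distortion(h0, h1):
    """Coefficients of H0(z)H1(-z) - H1(z)H0(-z) for analysis filters h0, h1."""
    a = polymul(h0, modulate(h1))
    b = polymul(h1, modulate(h0))
    return [x - y for x, y in zip(a, b)]

def is_pure_delay(t, tol=1e-12):
    """True when exactly one coefficient is nonzero (gain times a delay)."""
    return sum(1 for c in t if abs(c) > tol) == 1

# Two-tap pair (sum and difference of adjacent samples, unnormalized).
two_tap = distortion([1, 1], [1, -1])

# Hypothetical three-tap low-pass with its mirror high-pass H1(z) = H0(-z).
three_tap = distortion([1, 2, 1], [1, -2, 1])
```

In this sketch `is_pure_delay(two_tap)` holds, so the two-tap pair reconstructs exactly, while `is_pure_delay(three_tap)` does not, which illustrates how a mirror-filter pair longer than two taps leaves residual amplitude distortion.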
An object of the present invention is to achieve high compression coding of video signals using techniques that can be implemented for high speed applications with low hardware complexity.