1. Field of the Invention
This invention relates to information signal processing in general and in particular to the field of processing "natural" information signals, such as video signals or audio signals, for the purpose of forming a compact data file of compressed signals which can be expanded to reproduce the information in the original information signals.
2. Description of the Prior Art
Video data compression has achieved increasing importance in recent years because of advances in communications and the general increase in the transfer of information. Typically, video data is comprised of video images, and each video image is a frame comprised of individual picture elements, or pixels. The pixels form lines, sometimes called rows, in the horizontal direction and columns in the vertical direction. The number of pixels per line and the number of lines per frame depend upon both the video format used to represent the frame and the rate at which the frame was digitized.
To digitize a frame containing a black-and-white image, each pixel in the frame is normally assigned a number between 0 and 255, for example, where 0 would mean the pixel was completely black, and 255 would mean that the pixel was completely white. Numbers between 0 and 255 represent the various shades of gray. Thus, to digitally represent such an image, each pixel requires eight bits to represent the number corresponding to the gray-level of the pixel.
For a color image where each pixel has a red component (R), a green component (G), and a blue component (B) and each component has an intensity that varies from 0 to 255, three times as much information is required to represent the color image as is required for a black-and-white image having the same number of pixels. Accordingly, any system which utilizes numerous digitized color pictures or numerous digitized black-and-white pictures processes formidable amounts of data. For example, in the transmission of a black-and-white television picture, data rates for unprocessed digitized television signals typically require a communications channel with a bandwidth greater than 40 megabytes per second.
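The storage arithmetic described above can be sketched as follows. The frame width, height, and frame rate are hypothetical values chosen only for illustration; they are not taken from this description and are not intended to reproduce the channel bandwidth figure quoted above.

```python
def raw_frame_bytes(width, height, components=1, bits_per_component=8):
    """Bytes needed to store one uncompressed frame."""
    return width * height * components * bits_per_component // 8


def raw_data_rate(width, height, frames_per_second, components=1):
    """Bytes per second for an uncompressed sequence of frames."""
    return raw_frame_bytes(width, height, components) * frames_per_second


# Hypothetical frame geometry and frame rate (assumptions for illustration).
WIDTH, HEIGHT, FPS = 512, 480, 30

gray_bytes = raw_frame_bytes(WIDTH, HEIGHT)                 # one 8-bit gray level per pixel
color_bytes = raw_frame_bytes(WIDTH, HEIGHT, components=3)  # R, G, and B per pixel
gray_rate = raw_data_rate(WIDTH, HEIGHT, FPS)

print(gray_bytes, color_bytes, gray_rate)
```

Whatever geometry is assumed, the color frame requires three times the storage of the black-and-white frame, since each pixel carries three eight-bit components instead of one.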
Several processes have been developed for reducing the quantity of data required to represent a video image. These processes compress the data representing the image. The compressed data is usually transmitted over a data channel and when the compressed data is received it is expanded to recreate a likeness of the original image.
Two general methods have been developed for compression of video data: (1) spatial or time domain compression, and (2) transform domain compression. Transform domain compression, sometimes called transform domain coding, typically results in better compression performance but is considerably more difficult to implement in real-time hardware. In transform domain coding, the original set of binary data representing the pixels is processed by an invertible mathematical transform so that the original data, which are correlated in the space domain or time domain, are mapped into a new coordinate system, called the transform or generalized frequency domain, where the data are much less correlated.
The mathematical transforms are chosen so that they preserve the signal energy of the original image in the transform domain, but the energy is concentrated in relatively few samples, which are usually the lower frequency samples. Accordingly, compression can be achieved by considering these high energy samples to be sufficient for reconstruction after transmission, storage, or processing. Alternatively, as described below, the transform coefficients are encoded using methods developed in information theory so that data representing the original picture is first compressed by the transform, which concentrates the data in fewer points, and then the transformed data are further compressed by quantization and encoding of the transform coefficients. To recreate the original image, the encoded transform coefficients are decoded, inverse quantized, and inverse transformed. The quality of the reconstructed image is directly dependent on the errors introduced by the transform, quantization, and the encoding-decoding processes.
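The energy-preserving and energy-concentrating behavior described above can be illustrated with a minimal sketch. It assumes an orthonormal discrete cosine transform (one possible choice of transform) applied to a single line of sixteen hypothetical gray levels; the sample values and the decision to keep four coefficients are illustrative assumptions only.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a sequence of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, s in enumerate(samples))
        coeffs.append(scale * total)
    return coeffs


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples


def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


# Hypothetical, smoothly varying (highly correlated) line of gray levels.
line = [100, 102, 105, 109, 114, 120, 127, 135,
        144, 154, 165, 177, 190, 204, 219, 235]

coeffs = dct_1d(line)

# Keep only the four lowest-frequency coefficients and zero the rest.
kept = coeffs[:4] + [0.0] * (len(coeffs) - 4)
approx = idct_1d(kept)

fraction_kept = energy(kept) / energy(coeffs)
print(round(energy(line), 3), round(energy(coeffs), 3), fraction_kept)
```

Because the transform is orthonormal, the energy of the coefficients equals the energy of the samples, and for smooth data of this kind nearly all of that energy falls in the retained low-frequency coefficients.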
The Karhunen-Loeve (KL) transform is usually identified as the optimal transform for decorrelating the data in the transform domain and packing the maximum energy into a given number of samples. However, there are two generally recognized problems with the KL transform. First, a given KL transform is matched to only one class of signals, and second, a fast KL transform algorithm is not known. Accordingly, alternative mathematical transforms have been investigated. The discrete cosine transform is generally used for transform domain coding of video images because a fast discrete cosine transform algorithm exists and the cosine transform has been shown to be virtually identical to the KL transform for numerous practical conditions.
In the traditional discrete cosine transform compression methods, the video frame is divided into a series of nonoverlapping blocks. Typically, a block is sixteen pixels wide and sixteen pixels high. The discrete cosine transform of a two dimensional block is implemented by transforming the digital data for the pixels in a first direction and then transforming in the second direction. The resulting cosine transform coefficients include a single term which represents the average signal energy in the block, sometimes referred to as the DC term, and a series of terms, sometimes referred to as the AC terms, which represent the variation of the signal energy about the DC component for the block.
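The two-pass transformation of a block described above can be sketched as follows. This is a minimal illustration assuming an orthonormal cosine transform and a hypothetical four-by-four block (smaller than the sixteen-by-sixteen block mentioned above, purely to keep the example short); with this particular scaling the DC term equals the block size N times the mean pixel value.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a sequence of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, s in enumerate(samples))
        coeffs.append(scale * total)
    return coeffs


def dct_2d(block):
    """Separable 2-D DCT: transform along each line, then along each column."""
    row_pass = [dct_1d(row) for row in block]                 # first direction
    col_pass = [dct_1d(list(col)) for col in zip(*row_pass)]  # second direction
    return [list(row) for row in zip(*col_pass)]              # transpose back


# Hypothetical 4x4 block of gray levels (illustrative values only).
block = [
    [52, 55, 61, 66],
    [63, 59, 55, 90],
    [62, 59, 68, 113],
    [63, 58, 71, 122],
]

coeffs = dct_2d(block)
block_mean = sum(sum(row) for row in block) / 16.0

dc_term = coeffs[0][0]  # single term tied to the block average
ac_terms = [c for r, row in enumerate(coeffs)
            for k, c in enumerate(row) if (r, k) != (0, 0)]

# A perfectly flat block has no variation, so every AC term vanishes.
flat_coeffs = dct_2d([[128] * 4 for _ in range(4)])

print(dc_term, 4 * block_mean)
```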
A quantizer is used to reduce the range of the cosine transform coefficients. A quantizer is a mapping from the continuous variable domain of transform coefficients into the domain of integers. Commonly used is the uniform quantizer, which may be specified by a single number, the step size. Each transform coefficient is divided by this number and the resulting quotient is rounded to the nearest integer. The quantized cosine transform coefficients are then encoded for transmission over a data channel.
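A uniform quantizer of the kind described above can be sketched in a few lines. The step size of 8 and the coefficient values are hypothetical, and the treatment of exact ties (rounded upward here) is an assumption, since the description above specifies only rounding to the nearest integer.

```python
import math


def quantize(coeffs, step):
    """Divide each coefficient by the step and round to the nearest integer.

    Exact ties are rounded upward (an assumption made for this sketch).
    """
    return [math.floor(c / step + 0.5) for c in coeffs]


def dequantize(levels, step):
    """Inverse quantization: scale the integer levels back by the step."""
    return [q * step for q in levels]


# Hypothetical transform coefficients and step size.
coeffs = [279.25, -31.6, 7.9, 0.4]
STEP = 8

levels = quantize(coeffs, STEP)
restored = dequantize(levels, STEP)
errors = [abs(c - r) for c, r in zip(coeffs, restored)]

print(levels, restored)
```

The reconstruction error of each coefficient is bounded by half the step size, which is the source of the quantization error referred to in the discussion of reconstructed image quality.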
There are two basic coding techniques, adaptive coding techniques and nonadaptive coding techniques. With an adaptive coding technique, the transform, quantization, and coding of the video image produce compressed data at a variable rate, but the transmission of the compressed data representing the video image over a communication channel is at a fixed rate. Therefore, the adaptive compression system employs a buffer which interfaces the variable rate compression system with the fixed rate communication channel. The buffer is typically designed to provide variable feedback of compression parameters so that the coding of the transform coefficients is adapted to the conditions in the buffer. This form of adaptive coding limits the quality of the compression because the process is governed by the state of the buffer and not by considerations which provide optimal compression.
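The buffer feedback described above can be illustrated with a toy model. Everything in this sketch is a simplifying assumption made for illustration: the table of quantizer steps, the rule that a coarser step reduces the bits produced in proportion, and the mapping from buffer fullness to step are hypothetical and do not represent any particular prior art system.

```python
def simulate_buffer(block_bits, channel_bits_per_block, buffer_size,
                    steps=(1, 2, 4, 8)):
    """Toy rate-control loop: buffer fullness selects the quantizer step.

    block_bits holds the bits each block would need at the finest step.
    Returns a list of (step, bits_produced, fullness_after) per block.
    """
    fullness = 0
    history = []
    for raw_bits in block_bits:
        # Feedback: the fuller the buffer, the coarser the quantizer.
        level = min(len(steps) - 1, fullness * len(steps) // buffer_size)
        step = steps[level]
        produced = raw_bits // step  # hypothetical bits-versus-step model
        # Variable-rate input, fixed-rate drain to the channel.
        fullness = max(0, fullness + produced - channel_bits_per_block)
        history.append((step, produced, fullness))
    return history


# Four detailed blocks followed by four simple blocks (hypothetical figures).
history = simulate_buffer([400, 400, 400, 400, 50, 50, 50, 50],
                          channel_bits_per_block=100, buffer_size=800)
chosen_steps = [step for step, _, _ in history]

for entry in history:
    print(entry)
```

In this run the first simple block is still coded with a coarse step because the buffer remains partly full from the preceding detailed blocks, which illustrates how the state of the buffer, rather than the content of the block, governs the coding.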
Irrespective of either the coding scheme for the discrete cosine transform coefficients or the quantization method, the methods or processes which utilize a discrete cosine transform generally suffer from a common problem. For each block compressed and reconstructed, utilization of the cosine transform results in greater pixel error at the edges of the block, with the error decreasing towards the center of the block. Since each of the blocks is coded independently, the distortion introduced by the compression scheme using the cosine transform is discontinuous at each block boundary. Highly correlated errors, that is, errors aligned one on top of the other, such as those introduced by the cosine transform at the block boundaries, stand out and are highly visible to the human eye.
Farrelle and Jain suggested in "Recursive Block Coding--A New Approach to Transform Coding," IEEE Transactions on Communications, Vol. COM-34, No. 2, pp. 161-179, Feb. 1986, a new method for compression of image data which is designed to minimize the highly correlated errors which occur at block edges when conventional cosine transform coding is used. In this method, referred to as Recursive Block Coding, each block is further subdivided into corner, edge and interior elements which are coded separately. Previously coded elements within the block are combined with elements from adjacent blocks to form a prediction of all pixels in the next block element to be coded. This prediction is used to reduce the complexity of the data to be compressed and to reduce error at the edge pixels. Farrelle and Jain show that the theoretical and practical performance of the recursive block coding is superior to the conventional block by block discrete cosine transform methods and the block effect distortion is substantially reduced. In recursive block coding, the use of previously coded block elements for prediction yields a residual signal which has relatively small correlated errors at block boundaries. Farrelle and Jain, using computer simulation, developed their method using a discrete sine transform with uniform quantizers and nonadaptive zonal transform coefficient coding techniques. Thus, while the Recursive Block Coding technique combined with the discrete sine transform minimizes the correlated boundary error problems of the prior art discrete cosine transform methods, the method of Farrelle and Jain is limited by the use of uniform quantizers and nonadaptive coding techniques. Further, demonstration of the method does not establish that the method is suitable for a video compression system that must operate with constraints on the time, memory and storage space available for compression.
The present invention overcomes the problems of the prior art by providing a system for developing data bases of compressed video images, as well as data bases of other analog data that are amenable to digitization, which eliminates the boundary errors of the discrete cosine transform methods and utilizes both nonuniform quantization and adaptive coding techniques to achieve optimal compression while maintaining image quality. The compression system provides a fast means of data compression that can be implemented in a wide variety of applications.