This invention relates to data processing and, more specifically, to a method and apparatus for encoding and decoding information in an efficient manner, both from the standpoint of the number of bits employed to describe the data and from the standpoint of the encoding process. Still more particularly, the method and apparatus of this invention relate to encoding and decoding of image information.
Image data compression has been a topic of considerable interest in recent years because of the burgeoning fields of information and communication. Although a picture is worth a thousand words, with unadorned techniques it may take more transmission capacity to handle a single picture than many thousands of words. Compared to text, the information content in an image is normally quite large and, therefore, the efficient transmission and/or storage of pictorial information has been subjected to extensive research for a number of years.
A number of factors have come into play that assist in that research. First, in many instances, the images stored or transmitted are static. No movement needs to be communicated, and that relaxes the storage and transmission requirements. Second, users find it acceptable for motion to have a limited temporal resolution. That too relaxes the storage requirements. Third, sometimes the information need not be handled on a real-time basis. Consequently, there is time to indulge in rather sophisticated processing techniques.
Another reason why image data compression is possible is that images contain a considerable amount of superfluous information. Generally, one can identify two kinds of superfluous information: statistical redundancy, having to do with similarities in the data representing the image; and subjective redundancy, having to do with data similarities which can be removed without user complaints. Statistical redundancy is illustrated by the transmission of a white page, which may be depicted without repeatedly specifying (at every pixel) the color of the page. Subjective redundancy is illustrated by the fact that in depicting movement one can ignore events that occur faster than a certain speed, because the human eye cannot discern those events.
Researchers have tried to encode images in a number of ways that realize transmission and storage savings by eliminating the aforementioned redundancies. These encoding approaches can be broadly divided into predictive Pulse Code Modulation, Interpolative or Extrapolative Coding, Transform Coding, and Vector Quantization.
In predictive coding, such as in the Differential Pulse Code Modulation (DPCM) approach, an attempt is made to predict the pixel to be encoded. The prediction is made by using the encoded values of the previously encoded pixels. Generally, these pixels are combined to generate a predicted value; and the signal that is quantized, encoded and transmitted is the difference between the actual value and the generated predicted value. Adaptive DPCM, where the prediction algorithm is based on local picture statistics, is a variation on this approach.
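The predictive loop described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the arrangement of any particular coder: the predictor is simply the previously reconstructed pixel, the quantizer is uniform with a hypothetical `step` parameter, and the names `dpcm_encode` and `dpcm_decode` are invented for the example.

```python
def dpcm_encode(pixels, step=4):
    """Quantize the difference between each pixel and its predicted value."""
    codes = []
    prediction = 0  # assumed starting state, shared by encoder and decoder
    for actual in pixels:
        error = actual - prediction      # difference signal
        index = round(error / step)      # uniform quantizer (assumption)
        codes.append(index)
        # Predict from the *reconstructed* value so the encoder stays
        # in step with what the decoder will compute.
        prediction = prediction + index * step
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild pixel values by accumulating the dequantized differences."""
    pixels = []
    prediction = 0
    for index in codes:
        prediction = prediction + index * step
        pixels.append(prediction)
    return pixels
```

Because the encoder predicts from its own reconstruction rather than from the original pixels, the quantization error in this sketch does not accumulate: each decoded pixel stays within half a step of the original.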
In interpolative and extrapolative coding, only a subset of the pixels is sent to the receiver. The receiver must then interpolate (or extrapolate) the available information to develop the missing pixels.
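As a minimal sketch of this idea, assume that every `keep_every`-th pixel of a row is sent and that the receiver fills the gaps by linear interpolation, holding the last sample where no right-hand neighbor exists. The function names and the choice of linear interpolation are assumptions made for illustration only.

```python
def subsample(pixels, keep_every=2):
    """Transmitter side: keep only every keep_every-th pixel."""
    return pixels[::keep_every]


def interpolate(samples, keep_every, length):
    """Receiver side: rebuild `length` pixels from the retained samples."""
    rebuilt = []
    for i in range(length):
        left = i // keep_every                     # nearest sample at or before i
        right = min(left + 1, len(samples) - 1)    # next sample, clamped at the end
        frac = (i % keep_every) / keep_every       # position between the two samples
        rebuilt.append(samples[left] * (1 - frac) + samples[right] * frac)
    return rebuilt
```

For a row that varies linearly, this reconstruction is exact; for busier rows the missing pixels are only approximated, which is the price paid for sending fewer of them.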
In transform coding, instead of coding the image as discrete intensity values on sets of sampling points, an alternative representation is made first by transforming blocks of pixels into a set of coefficients. It is the coefficients that are quantized and transmitted. Several transformations have been used in the art, such as the Hadamard, the Karhunen-Loeve, and the Discrete Cosine Transforms. These transforms, being unitary, conserve the signal energy in the transform domain but, typically, most of the energy is concentrated in relatively few samples. Predominantly, it is the samples representing the low frequencies that are non-zero. Samples that are zero, or ones that are very close to zero, need not be transmitted. Still additional samples of low value can be quantized coarsely, resulting in substantial compression of the data that needs to be either stored or transmitted.
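The energy-compaction property can be illustrated with a one-dimensional orthonormal Discrete Cosine Transform applied to a single block. The sketch below is an assumption-laden illustration: it works on one row of pixels rather than a two-dimensional block, and the `threshold` rule in the hypothetical `compress` function stands in for whatever quantizer a real coder would use.

```python
import math


def _scale(k, n):
    """Normalization that makes the DCT-II orthonormal (unitary)."""
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(block):
    """Forward orthonormal DCT-II of a list of pixel values."""
    n = len(block)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct(): rebuild pixel values from coefficients."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def compress(block, threshold):
    """Zero every coefficient whose magnitude is below threshold (assumed rule)."""
    return [c if abs(c) >= threshold else 0.0 for c in dct(block)]
```

For a smooth block such as a gentle intensity ramp, only the lowest-frequency coefficients survive the threshold, yet `idct` of the surviving coefficients stays close to the original pixels.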
Yet another encoding technique is Vector Quantization, where an image block is decomposed into a set of vectors. From the possible (or experienced) signal patterns and corresponding vectors, a subset of representative vectors is selected and included in a code book. When encoding, the developed sets of vectors are replaced with the closest representative vectors in the code book, and compression is achieved by further replacing the representative vectors with labels. At the receiver, the inverse operation takes place with a table look-up which recovers the representative vectors. A facsimile of the image is reconstructed from the recovered representative vectors.
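The encode and decode steps just described can be sketched as a nearest-neighbor search followed by a table look-up. The sketch assumes that the code book already exists and that closeness is measured by squared Euclidean distance; both the distance measure and the function names are assumptions for illustration.

```python
def nearest_label(vector, codebook):
    """Return the index of the code book entry closest to vector."""
    def dist2(a, b):
        # Squared Euclidean distance (assumed closeness measure).
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda j: dist2(vector, codebook[j]))


def vq_encode(vectors, codebook):
    """Encoder: replace each vector with the label of its closest representative."""
    return [nearest_label(v, codebook) for v in vectors]


def vq_decode(labels, codebook):
    """Decoder: a plain table look-up recovers the representative vectors."""
    return [codebook[j] for j in labels]
```

The asymmetry is visible even in this small sketch: `vq_decode` is a single look-up per label, whereas `vq_encode` must search the whole code book for every vector.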
The decoding process in vector quantization is quite simple, but that is not the case on the encoding side. Primarily this is because one must develop the code book, and that becomes impractical when the image block is large or when good fidelity is desired. Also, to develop those representative vectors one must use a "training" set of data.
It is an object of this invention, therefore, to provide an image compression approach that is both efficient and easy to implement.