The ability to store and transmit full-color, full-motion images is increasingly in demand. These images are used not only for entertainment, as in motion picture or television productions, but also for analytical and diagnostic tasks such as engineering analysis and medical imaging.
There are several advantages to providing these images in digital form. For example, the images are easier to enhance and manipulate and, as with all digital signals, digital video images can be regenerated accurately over several generations with only minimal signal degradation.
On the other hand, digital video requires significant memory capacity for storage and, equivalently, a high-bandwidth channel for transmission. For example, a single 512 by 512 pixel gray-scale image with 256 gray levels requires more than 256,000 bytes of storage. A full color image requires nearly 800,000 bytes. Natural-looking motion requires that images be updated at least 30 times per second. A transmission channel for natural-looking full color moving images must therefore accommodate approximately 190 million bits per second, and one minute of full color video requires approximately 1.4 gigabytes of storage.
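The storage and bandwidth figures above can be checked with a few lines of arithmetic. The sketch below assumes one byte per gray-scale pixel and three one-byte components per full color pixel; the constant names are illustrative only.

```python
# Image geometry and frame rate taken from the example in the text.
WIDTH, HEIGHT = 512, 512
FRAMES_PER_SECOND = 30

# Assumptions: 256 gray levels fit in one byte, and a full color
# pixel is stored as three one-byte components.
BYTES_PER_GRAY_PIXEL = 1
COLOR_COMPONENTS = 3

gray_bytes = WIDTH * HEIGHT * BYTES_PER_GRAY_PIXEL        # 262,144 bytes
color_bytes = gray_bytes * COLOR_COMPONENTS               # 786,432 bytes
bits_per_second = color_bytes * 8 * FRAMES_PER_SECOND     # 188,743,680 bits/s
bytes_per_minute = color_bytes * FRAMES_PER_SECOND * 60   # 1,415,577,600 bytes

print(gray_bytes, color_bytes, bits_per_second, bytes_per_minute)
```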
As a result, a number of image compression techniques have been proposed to reduce the information capacity required for storage and transmission of digital video signals. These techniques generally take advantage of the considerable redundancy in any natural image and of the limits of the human psycho-visual system, which does not respond to abrupt time-based or spatial transitions. Both time-domain and spatial-domain techniques are used to reduce the amount of data used to transmit, record, and reproduce color digital video images.
For example, differential pulse-code modulation (DPCM) is a commonly-used compression technique which relies upon the fact that video images, generally, are quite redundant and that any transitions in the images are, for the most part, gradual. A DPCM encoder predicts each pixel value from previous pixel values. It then compares the actual value with the predicted value to obtain an error signal. The error is the encoded value. If the predictions are relatively accurate, the error will be small and its value will occupy a great deal less memory and/or bandwidth than the original video signal. The signal can be decoded by using the prediction algorithm in conjunction with the error signal.
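The DPCM principle can be illustrated with a minimal sketch. It assumes the simplest possible predictor, in which each pixel is predicted to equal the previous pixel in the row, and it omits the quantization and entropy coding that a practical encoder would apply to the error signal. The function names are hypothetical.

```python
def dpcm_encode(pixels):
    """Return the prediction errors for a row of pixel values.

    Assumption: the predictor is simply the previous pixel value,
    with an initial prediction of 0.
    """
    errors = []
    prediction = 0
    for value in pixels:
        errors.append(value - prediction)  # the error is the encoded value
        prediction = value                 # next prediction is the current pixel
    return errors


def dpcm_decode(errors):
    """Rebuild the pixel row by running the same predictor on the errors."""
    pixels = []
    prediction = 0
    for error in errors:
        value = prediction + error
        pixels.append(value)
        prediction = value
    return pixels


row = [100, 101, 103, 103, 102, 104]
errors = dpcm_encode(row)        # [100, 1, 2, 0, -1, 2]
restored = dpcm_decode(errors)   # identical to row
```

After the first sample, every error in this example fits in a few bits, whereas each original pixel needs a full byte. This is the saving that accurate prediction makes possible.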
A color image may be represented as a combination of luminance and chrominance (color-difference) signals. For example, a digitized color image may have one byte assigned for the color-difference signal for each pixel. This image information may be compressed by recognizing that the human psycho-visual system is limited in its ability to detect subtle variations in color and therefore assigning a single chrominance value to a group of neighboring pixels which are of approximately the same color.
One approach to chrominance information compression is described in the United States patent application entitled "Variable Spatial Frequency Chrominance Encoding in Software Motion Video Compression", Ser. No. 08/165,372, filed Dec. 12, 1993 by Steven M. Hancock et al. and assigned to the assignee of the present invention, the disclosure of which is hereby incorporated by reference and referred to in the remainder of this application as "variable chrominance sub-sampling".
Briefly, the above method entails dividing a video frame into contiguous rectangular blocks, each of which is further divided into quadrants. A "weighted average" of the chrominance values within each quadrant is then computed and the average of each quadrant is compared to the averages of other quadrants within a block. If the averages are sufficiently similar, i.e. within an acceptable threshold value of one another, a single chrominance value (a weighted average of all the chrominance values within a block) is used to represent the chrominance information of the entire block. If the averages are not sufficiently similar, chrominance values are assigned on a quadrant-by-quadrant basis.
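The block and quadrant decision just described can be sketched as follows. This is an illustration only, not the method of the referenced application: the weighting of the "weighted average" is not specified here, so a plain arithmetic mean stands in for it, and similarity is assumed to mean that the largest and smallest quadrant averages differ by no more than the threshold. All names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def quadrants(block):
    """Split a block (list of rows) into four quadrants, each as a flat list."""
    height, width = len(block), len(block[0])
    half_h, half_w = height // 2, width // 2
    quads = []
    for r0, r1 in ((0, half_h), (half_h, height)):
        for c0, c1 in ((0, half_w), (half_w, width)):
            quads.append([block[r][c] for r in range(r0, r1) for c in range(c0, c1)])
    return quads


def encode_chroma_block(block, threshold):
    """Return ("block", [value]) or ("quadrants", [four values]).

    Assumption: a plain mean replaces the weighted average, and the
    quadrants are "sufficiently similar" when max - min <= threshold.
    """
    averages = [mean(q) for q in quadrants(block)]
    if max(averages) - min(averages) <= threshold:
        return ("block", [mean([v for row in block for v in row])])
    return ("quadrants", averages)


flat = [[120, 121, 120, 121],
        [121, 120, 121, 120],
        [120, 121, 120, 121],
        [121, 120, 121, 120]]
edge = [[100, 100, 200, 200]] * 4

flat_code = encode_chroma_block(flat, threshold=4)   # one value for the block
edge_code = encode_chroma_block(edge, threshold=4)   # one value per quadrant
```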
A digitized image's luminance information can also be encoded for compression, but, because the human psycho-visual system is more sensitive to luminance changes than to color changes, greater care must be taken in compressing luminance information. Luminance compression techniques are well-known and a number of such compression techniques are discussed in detail in chapters 18 and 19 of "Television Engineering Handbook", K. Blair Benson, Editor in Chief, McGraw-Hill Book Company, 1986, which is hereby incorporated by reference.
As discussed in this handbook, a commonly-employed method of luminance information compression is called block truncation coding. This method entails dividing an image into contiguous, non-overlapping regions, then encoding the luminance of each of the regions using two luminance values and a bit mask. The bit mask indicates which of two luminance values is to be assigned to a particular pixel within the region. This method does produce luminance compression, but is very sensitive to region size. If the regions are too large, the image takes on a "contoured" look, thus degrading the image quality. On the other hand, if the regions are too small, very little image compression is achieved.
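A minimal sketch of the two-level idea behind block truncation coding follows. It is a simplified variant and not necessarily the handbook's formulation: it assumes the bit mask is formed by thresholding at the region mean and that the two luminance values are the rounded means of the pixels on either side of that threshold. The function names are hypothetical.

```python
def btc_encode(region):
    """Encode a region (list of rows) as (low_level, high_level, bit_mask)."""
    pixels = [v for row in region for v in row]
    threshold = sum(pixels) / len(pixels)
    # 1 selects the high luminance value, 0 selects the low one.
    mask = [[1 if v >= threshold else 0 for v in row] for row in region]
    high = [v for v in pixels if v >= threshold]   # never empty: max >= mean
    low = [v for v in pixels if v < threshold]     # empty for a flat region
    high_level = round(sum(high) / len(high))
    low_level = round(sum(low) / len(low)) if low else high_level
    return low_level, high_level, mask


def btc_decode(low_level, high_level, mask):
    """Rebuild the region by assigning one of the two levels to each pixel."""
    return [[high_level if bit else low_level for bit in row] for row in mask]


region = [[10, 14, 200, 204]] * 4
low_level, high_level, mask = btc_encode(region)
decoded = btc_decode(low_level, high_level, mask)
```

For this 4 by 4 region, 16 bytes of luminance are replaced by two one-byte levels and a 16-bit mask, or 4 bytes in total. Enlarging the region improves this ratio but forces more pixels onto the same two levels, which is the source of the "contoured" look noted above.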
Additional approaches to luminance compression are described in the United States patent application entitled "Luminance Transition Encoder For Motion Video Compression", Ser. No. 08/170,044, filed Dec. 17, 1993 by Steven M. Hancock et al., and in the United States patent application entitled "Hybrid Video Compression System and Method Capable of Software-only Decompression in Selected Multimedia Systems", Ser. No. 07/965,580, filed Oct. 23, 1992, by Arturo Aureliano Rodriguez et al., both of which are assigned to the assignee of the present invention and the disclosures of which are hereby incorporated by reference.
Briefly, the Luminance Transition Encoder described in Ser. No. 08/170,044 analyzes the luminance values within a region, thereby determining the direction of maximum luminance variation and the location, magnitude, and abruptness of any luminance transitions within the region. Using these factors, the region's luminance is mapped into one of a predetermined set of luminance functions which, when decoded, produce luminance distributions which approximate those commonly found in natural images.
The Hybrid Video Compression System described in Ser. No. 07/965,580 (referred to hereinafter as statistical luminance encoding), in one embodiment, decomposes an image into non-overlapping regions and computes the average and standard deviation of the frame's luminance values. The standard deviation is used to determine a "homogeneity threshold", which is also a function of the size of the regions into which the frame is decomposed. For each region, the encoder computes the mean luminance value, the average of all the luminance values within the region that are greater than the mean, and the average of all the luminance values within the region that are less than the mean. These values are used to determine whether a region is homogeneous. If a region is homogeneous, the entire region is assigned a single color. Additionally, if the region is unchanged from the previous frame, it is encoded as unchanged. If the region has changed but is not homogeneous, it is encoded using a bit map and two colors: one color assigned to the "1" locations, the other assigned to the "0" locations.
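The per-region decision in statistical luminance encoding can be sketched as below. Several details are assumptions made for illustration, since they are not specified here: the homogeneity threshold is passed in rather than derived from the frame's standard deviation and the region size, a region is treated as homogeneous when its upper and lower averages differ by no more than that threshold, the unchanged test is applied first, and luminance values stand in for the "colors". The function name is hypothetical.

```python
def classify_region(region, previous_region, homogeneity_threshold):
    """Return a tuple describing how a region (list of rows) would be coded."""
    # Assumption: "unchanged" means an exact match with the previous frame.
    if previous_region is not None and region == previous_region:
        return ("unchanged",)

    pixels = [v for row in region for v in row]
    mean = sum(pixels) / len(pixels)
    upper = [v for v in pixels if v > mean]
    lower = [v for v in pixels if v < mean]
    upper_mean = sum(upper) / len(upper) if upper else mean
    lower_mean = sum(lower) / len(lower) if lower else mean

    # Assumption: homogeneous when the two averages are close together.
    if upper_mean - lower_mean <= homogeneity_threshold:
        return ("homogeneous", mean)

    # Otherwise use a bit map and two values: "1" locations take the
    # upper average, "0" locations take the lower average.
    bitmap = [[1 if v > mean else 0 for v in row] for row in region]
    return ("two_level", lower_mean, upper_mean, bitmap)


flat = [[50, 51], [51, 50]]
edge = [[20, 220], [20, 220]]

flat_code = classify_region(flat, None, homogeneity_threshold=4)
edge_code = classify_region(edge, None, homogeneity_threshold=4)
same_code = classify_region(edge, edge, homogeneity_threshold=4)
```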
Although the above image compression systems reduce the amount of data required to represent an image, each system may exhibit advantages over the others with any given image. Further, various combinations of the above systems may prove even more advantageous than using any one of them in isolation. But, in order to use combinations of the systems, an encoder must include, within the image data stream, information that permits a decoder to determine which type of encoding was employed. That is, in addition to providing the means for interpreting data compressed by any of the above methods, an encoder that employs all the above methods must provide a facility for recognizing which type of compression is used for any given image segment. However, when the additional recognizing information is added to the data stream, it partially offsets the efficiencies achieved by the compression schemes.
Accordingly, it is an object of the present invention to provide a coding scheme which can utilize several compression schemes.
It is another object of the present invention to provide an efficient manner of encoding the type of compression scheme used.