1. Field of the Invention
The invention relates generally to information signal processing, and more particularly relates to digital compression of color images for storage of the images in smart cards and databases.
2. Description of Related Art
One technique that has been used in digital encoding of image information is known as run length encoding. In this technique, the scan lines of a video image are encoded as a value or set of values of the color content of a series of pixels along with the length of the sequence of pixels having that value, set of values, or range of values. The values may be a measure of the amplitude of the video image signal, or other properties, such as luminance or chrominance. Statistical encoding of frequent color values can also be used to reduce the number of bits required to digitally encode the color image data.
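As an illustration only, run length encoding of a single scan line can be sketched as follows. The function names and the representation of a scan line as a list of integer color values are assumptions made for this example, and the sketch covers only exact-value runs, not runs over a range of values.

```python
from typing import List, Tuple


def run_length_encode(scan_line: List[int]) -> List[Tuple[int, int]]:
    """Encode a scan line as (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for value in scan_line:
        if runs and runs[-1][0] == value:
            # Same value as the current run: extend the run by one pixel.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # A new value starts a new run of length one.
            runs.append((value, 1))
    return runs


def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into a scan line."""
    scan_line: List[int] = []
    for value, length in runs:
        scan_line.extend([value] * length)
    return scan_line
```

A scan line with long runs of identical pixels, such as a uniform background, collapses to a few pairs, whereas a noisy line gains little or nothing.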
One basic encoding process is based upon the block truncation coding (BTC) algorithm published by Mitchell and Delp of Purdue University in 1979. The basic BTC algorithm breaks an image into 4×4 blocks of pixels and calculates the first and second sample moments. Based upon an initial quantizer, set to the first sample moment (the arithmetic mean), a selection map of those pixels lighter or darker than the quantizer is determined, along with a count of the lighter pixels. From the first and second sample moments, the sample variance, and therefore the standard deviation, can be calculated. The mean, standard deviation, and selection map are preserved for each block. However, the original BTC method is limited to grayscale, so it would be desirable to extend the BTC method to include YCrCb full color. It would also be desirable to adapt the BTC method to handle delta values, allowing multi-level, or hierarchical, encoding and allowing encoding of differences between frames or from a specified background color.
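The statistics gathered by the basic BTC algorithm for one block can be sketched as follows. This is a minimal sketch, assuming a block is supplied as a flat list of 16 grayscale values; the function name is hypothetical, and treating pixels equal to the quantizer as "darker" is an assumption, since the text does not say how ties are handled.

```python
from typing import List, Tuple


def btc_block_statistics(block: List[int]) -> Tuple[float, float, List[int], int]:
    """Return (mean, standard deviation, selection map, lighter count)."""
    n = len(block)
    first_moment = sum(block) / n                    # arithmetic mean
    second_moment = sum(p * p for p in block) / n    # mean of the squares
    # Sample variance from the two moments; clamp tiny negative rounding error.
    variance = max(second_moment - first_moment ** 2, 0.0)
    std_dev = variance ** 0.5
    # Quantizer set to the first moment: 1 marks a lighter pixel, 0 a darker one.
    # Assumption: pixels exactly equal to the quantizer count as darker.
    selection_map = [1 if p > first_moment else 0 for p in block]
    lighter_count = sum(selection_map)
    return first_moment, std_dev, selection_map, lighter_count
```

For a block holding eight pixels of value 10 and eight of value 20, this yields a mean of 15, a standard deviation of 5, and a selection map with eight lighter pixels.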
The range of color values for any given pixel in a color image can be described, for example, in RGB color space, illustrated in FIG. 1, that can be represented by a conventional three-dimensional Cartesian coordinate system, with each axis representing 0 to 100% of the red, green and blue values for the color value of the pixel. A grayscale line can be described as running diagonally from black at 0% of each component to white at 100% of each. Since human vision can only discriminate a limited number of shades of color values, by selecting representative color values in such a color space, a limited number of color values can be used to approximate the actual color values of an image such that the human eye cannot differentiate between the actual color values and the selected color values.
As is illustrated in FIGS. 2 and 3, human vision can be characterized by the Hue-Value-Saturation (HVS) color system. Hue is defined as the particular color in the visible spectrum ranging from red through green and blue to violet. Value is defined as the brightness level, ignoring color. Saturation is defined as the intensity of the particular color or the absence of other shades in the mixture. The HVS system can be represented by a generally cylindrical coordinate system with a polar base consisting of hue as the angle, and saturation as the radius. The value or brightness component is represented as the altitude above the base. The actual visible colors do not occupy the entire cylinder, but are approximately two cones, base to base, with their vertices at 0% and 100% on the value scale. The base is tilted in this example because the maximum saturation for blue occurs at a much lower brightness than the maximum saturation of green.
Referring to FIG. 2, in order to represent digitized NTSC/PAL video in a Cartesian coordinate system, the YCrCb color space is used. Because the method of the invention operates in the YCrCb color space, the method provides for a novel color space conversion method from 15- or 24-bit RGB. Eight-bit grayscale images are also supported. Referring also to FIG. 3, the chrominance components, Cr and Cb, are two axes that correspond to the polar hue and saturation components in the HVS system. The Y, or luminance, component corresponds to the brightness axis in the HVS graph. This description does not account for the slight differences between YIQ and YUV for NTSC- and PAL-based encoding, which do not form a part of the invention. The following equations can be used to convert from RGB to the YCrCb color space:
Y=0.299R+0.587G+0.114B
Cr=0.713(R−Y)=0.713(0.701R−0.587G−0.114B)
Cb=0.564(B−Y)=0.564(−0.299R−0.587G+0.886B)
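A direct floating point evaluation of this conversion can be sketched as follows. The sketch assumes the conventional form in which Cr and Cb are the scaled color differences 0.713(R−Y) and 0.564(B−Y), with Y as given above; the function name and the use of unscaled, signed chrominance values are assumptions made for illustration.

```python
from typing import Tuple


def rgb_to_ycrcb(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one RGB pixel to (Y, Cr, Cb) using floating point arithmetic."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cr = 0.713 * (r - y)                    # red color difference (assumed form)
    cb = 0.564 * (b - y)                    # blue color difference (assumed form)
    return y, cr, cb
```

Any gray pixel (R = G = B) maps to zero chrominance under these equations, which is a convenient sanity check on the coefficients.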
Typical implementations of digitally representing color values in this fashion use floating point arithmetic (11 multiplications and 9 additions/subtractions per pixel) or 16-bit integer arithmetic (9 multiplications, 9 additions/subtractions and 3 divisions per pixel). Both of these methods are quite wasteful of computing power, particularly on smaller microcontrollers. There is thus a need for a method for representing color values of digitized images that takes advantage of the limitations of human vision in discriminating color in color images in order to reduce the software and hardware requirements, particularly for storage of such color images in smart cards and databases. Smart cards are commonly approximately the same shape and size as a common credit card, and typically contain a programmable microchip having a memory, such as a read only memory or a read/write memory. Information stored in the memory of the card can be detected by a card interface device such as a card reader or connector.
Unfortunately, noise can seriously interfere with the efficiency of any image compression process, lossy or lossless, because a compression engine must spend additional data encoding noise as if it were actual subject material. Since lossy compression tends to amplify noise, creating more noticeable artifacts upon decompression, lossy compression processes typically attempt to remove some of the noise prior to compressing the data. Such preprocessing filters must be used very carefully, because too little filtering will not have the desired result of improved compression performance, and too much filtering will make the decompressed image cartoon-like.
Another method used for removing unwanted noise from color image data is chromakeying, which is a process of replacing a uniform color background (usually blue) from behind a subject. A common application of this process is a television weather reporter who appears to be standing in front of a map. In actuality, the reporter is standing in front of a blue wall while a computer generated map is replacing all of the blue pixels in the image being broadcast.
While preprocessing filters can remove noise from the area surrounding a subject of interest in an image, subtle changes in lighting or shading can remain in the original image, and these can be eliminated by chromakeying. There is thus a need to provide a chromakey method in compression of color images for storage on smart cards and databases, in order to replace the background with a solid color to increase the visual quality of the compressed image. It would also be desirable to automate the chromakey process in order to simplify it for the operator. The present invention meets these and other needs.
The present invention provides for an improved method for digitally compressing color identification photographs into 512 to 2,048 bytes for storage in inexpensive contact-less smart cards or in databases. The compression method of the invention accepts rectangular images in 16 pixel increments ranging from 48 to 256 pixels on a side. The typical size is 96×96 pixels. The method of the invention is designed for very low computational power implementations such as 8-bit microcontrollers, possibly with an ASIC accelerator.
Briefly, and in general terms, the present invention accordingly provides for a method for digital compression of a color image containing color image data consisting of a plurality of scan lines of pixels with color values, such as for a color identification photograph, for storage of the color image. The color image data is filtered by evaluating the color values of individual pixels in the color image with respect to neighboring pixels, and the color image data is statistically encoded by dividing the color image into an array of blocks of pixels, and encoding each block of pixels into a fixed number of bits that represent the pixels in the block.
In a presently preferred aspect of the method, the step of statistically encoding the color image data comprises determining a first sample moment of each block as the arithmetic mean of the pixels in the block, determining a second sample moment of the pixels in the block, and determining a selection map of those pixels in the block having color values darker or lighter than a quantizer set to the first sample moment, along with a count of the lighter pixels. Statistically encoding the color image data entails determining the sample variance and the standard deviation of the pixels in the block based upon the first and second sample moments.
In another aspect of the invention, each block is classified, quantized, and compressed by codebook compression using minimum redundancy, variable-length bit codes. The step of classifying each block comprises classifying each block according to a plurality of categories, and the step of classifying each block typically comprises classifying each of the blocks in one of four categories: null blocks exhibiting little or no change from the higher level or previous frame, uniform blocks having a standard deviation less than a predetermined threshold, uniform chroma blocks having a significant luminance component to the standard deviation, but little chrominance deviation, and pattern blocks having significant data in both luminance and chrominance standard deviations.
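The four-way classification can be sketched as a simple decision chain. This is only a sketch under stated assumptions: the function name, the three threshold parameters and their default values, and the use of a single "delta magnitude" to measure change from the higher level or previous frame are all hypothetical, since the text gives no numeric thresholds. A block with little luminance deviation but significant chrominance deviation is not covered by the four definitions, and is treated here as a pattern block.

```python
def classify_block(delta_magnitude: float,
                   std_y: float,
                   std_cr: float,
                   std_cb: float,
                   null_threshold: float = 2.0,     # hypothetical value
                   luma_threshold: float = 4.0,     # hypothetical value
                   chroma_threshold: float = 4.0    # hypothetical value
                   ) -> str:
    """Classify a block as 'null', 'uniform', 'uniform_chroma' or 'pattern'."""
    # Null: little or no change from the higher level or previous frame.
    if delta_magnitude < null_threshold:
        return "null"
    chroma_dev = max(std_cr, std_cb)
    # Uniform: standard deviation below threshold in luminance and chrominance.
    if std_y < luma_threshold and chroma_dev < chroma_threshold:
        return "uniform"
    # Uniform chroma: significant luminance deviation, little chrominance deviation.
    if chroma_dev < chroma_threshold:
        return "uniform_chroma"
    # Pattern: significant data in the chrominance deviation (and, typically, luminance).
    return "pattern"
```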
In a presently preferred aspect of the method, the number of bits to be preserved for each component of the block are determined after each block is classified, and the number of bits for the Y and Cr/Cb components is reduced independently for each classification. The texture map of the block is also preferably matched with one of a plurality of common pattern maps for uniform chroma and pattern classified blocks. All components of each block are preserved for pattern blocks; the mean luminance and chrominance, standard deviation luminance, and a selection map are preserved for uniform chroma blocks; and the mean values are preserved for all three color components for uniform blocks.
The step of filtering currently preferably comprises evaluating each individual pixel as a target pixel and a plurality of pixels in close proximity to the target pixel to determine an output value for the target pixel, and in a currently preferred aspect of the method of the invention, the step of filtering comprises evaluating a sequence of five pixels, including two pixels on either side of the target pixel and the target pixel itself, for each target pixel.
In a presently preferred embodiment, the step of filtering comprises determining an average of the data for a window of the pixels immediately surrounding the target pixel for those pixels surrounding the target pixel that are within a specified range of values, according to the following protocol: if all five pixels are within the specified range, the output target pixel is determined to be the average of the four pixels in a raster line on each side of the target pixel; if the two pixels on either side are within a specified range and both sides themselves are within the range, the target pixel is determined to be impulse noise, and the filtered output target pixel data is determined to be the average of the two pixels on each side of the target pixel; if the two pixels on either side of the target pixel and the target pixel itself are within a specified range, and the other two pixels on the other side are not within the specified range, the target pixel is determined to be an edge pixel, and the output target pixel is determined to be the average of the two pixels on the matching side that fall within the specified range; if the five pixels are all increasing or decreasing, or are within a small range to account for ringing or pre-emphasis typically found in analog video signals, the target pixel is treated as being in the midst of a blurred edge, and the output target pixel is then determined to be the average of two pixels on whichever side of the target pixel is closest in value to the target pixel; and if the five pixels in the window do not fit into any of the prior cases, the output target pixel is unchanged.
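One possible reading of this five-pixel protocol is sketched below for a single component of a raster line. Several points are assumptions rather than statements from the text: "within the specified range" is taken to mean that the largest and smallest values differ by no more than `rng`; the impulse case is taken to output the average of all four neighbors; the allowance for ringing or pre-emphasis in the blurred-edge case is omitted; ties in the blurred-edge case favor the left side; and the two pixels at each end of a line are left unchanged. All function names are hypothetical.

```python
from typing import List, Sequence


def within(values: Sequence[float], rng: float) -> bool:
    """True if the values span no more than rng (assumed meaning of 'within range')."""
    return max(values) - min(values) <= rng


def filter_pixel(window: Sequence[float], rng: float) -> float:
    """Filter the middle pixel of a five-pixel window."""
    l2, l1, t, r1, r2 = window
    neighbors = (l2, l1, r1, r2)
    # Case 1: all five pixels agree -> average of the four neighbors.
    if within(window, rng):
        return sum(neighbors) / 4
    # Case 2: both sides agree with each other but not the target -> impulse noise.
    if within(neighbors, rng):
        return sum(neighbors) / 4
    # Case 3: target agrees with exactly one side -> edge pixel.
    left_match = within((l2, l1, t), rng)
    right_match = within((t, r1, r2), rng)
    if left_match and not right_match:
        return (l2 + l1) / 2
    if right_match and not left_match:
        return (r1 + r2) / 2
    # Case 4: monotonic ramp -> blurred edge; snap to the closer side.
    diffs = [b - a for a, b in zip(window, window[1:])]
    if all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs):
        left_avg, right_avg = (l2 + l1) / 2, (r1 + r2) / 2
        return left_avg if abs(left_avg - t) <= abs(right_avg - t) else right_avg
    # Case 5: no case applies -> leave the pixel unchanged.
    return t


def filter_line(line: Sequence[float], rng: float) -> List[float]:
    """Filter one raster line; border pixels are copied unchanged (assumption)."""
    out = list(line)
    for i in range(2, len(line) - 2):
        out[i] = filter_pixel(line[i - 2:i + 3], rng)
    return out
```

Each output pixel is computed from the unfiltered input line, so the result does not depend on the order in which pixels are visited.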
A currently preferred aspect of the method of the invention further comprises replacing the background in the color image being compressed with a solid color, in order to reduce noise in the image, and to increase the visual quality of the compressed image. A significant part of replacing background in the color image being compressed with a solid color comprises calibrating an initial chromakey value and color of the background color. In one presently preferred embodiment, the step of calibrating comprises capturing at least one calibration image of the background prior to capturing an image with a subject of interest in place, consisting substantially of background pixels, and determining the average and standard deviation of the at least one calibration image to set at least an initial chromakey color and range. In another presently preferred embodiment, the step of calibrating comprises capturing an image with a subject of interest in place, and beginning in the upper-left and upper-right corners of the image, collecting pixels down and towards the center of the image until an edge or image boundary is encountered, and determining the average and standard deviation of those pixels to set at least an initial chromakey value and range. In another presently preferred embodiment, the step of calibrating comprises manually specifying an initial chromakey value and range without respect to the properties of an individual image being captured prior to image capture.
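The first calibration embodiment, in which a background-only image sets the chromakey color and range, can be sketched as follows. This is an illustration under assumptions: pixels are taken to be (Y, Cr, Cb) tuples, the range is taken to be `k` standard deviations per component with a floor of `min_range`, and the function names and both parameters are hypothetical, since the text does not say how the range is derived from the standard deviation.

```python
from statistics import fmean, pstdev
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def calibrate_chromakey(background: Sequence[Pixel],
                        k: float = 3.0,          # hypothetical multiplier
                        min_range: float = 1.0   # hypothetical floor
                        ) -> Tuple[Pixel, Pixel]:
    """Return (key color, key range) from a calibration image of background pixels."""
    channels = list(zip(*background))
    key = tuple(fmean(c) for c in channels)
    key_range = tuple(max(k * pstdev(c), min_range) for c in channels)
    return key, key_range


def is_background(pixel: Pixel, key: Pixel, key_range: Pixel) -> bool:
    """True if every component lies within the key range of the key color."""
    return all(abs(p - c) <= r for p, c, r in zip(pixel, key, key_range))


def replace_background(pixels: Sequence[Pixel], key: Pixel,
                       key_range: Pixel, solid: Pixel) -> List[Pixel]:
    """Replace every background pixel with one solid color."""
    return [solid if is_background(p, key, key_range) else p for p in pixels]
```

The same two statistics could be computed from pixels collected at the upper corners of a subject image, as in the second embodiment; only the source of the sample pixels changes.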
Another preferred aspect of the method of the invention comprises converting digital color image data to the YCrCb (Luminance-Chrominance) color space. In a currently preferred embodiment, this involves converting the color image data from the RGB (Red-Green-Blue) color space. In a currently preferred aspect of the method, the conversion utilizes lookup tables of selected color values, and preferably utilizes nine 256-entry one-byte lookup tables containing the contribution that each of R, G and B makes towards the Y, Cr and Cb components.
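A lookup-table conversion of this kind can be sketched as follows. The table layout is an assumption: each table holds the rounded contribution of one 8-bit input component to one output component, with the Y tables stored as unsigned bytes and the Cr and Cb tables as signed bytes, and the chrominance coefficients are taken from the conventional forms 0.713(R−Y) and 0.564(B−Y). The names are hypothetical, and the text does not specify rounding or any offset applied to the chrominance values.

```python
from array import array
from typing import Tuple


def build_table(coefficient: float, typecode: str) -> array:
    """Build a 256-entry one-byte table of round(coefficient * value)."""
    return array(typecode, (round(coefficient * v) for v in range(256)))


# Contributions to Y (unsigned bytes, typecode 'B').
Y_R, Y_G, Y_B = (build_table(c, "B") for c in (0.299, 0.587, 0.114))
# Contributions to Cr = 0.713 * (R - Y) (signed bytes, typecode 'b').
CR_R = build_table(0.713 * (1 - 0.299), "b")
CR_G = build_table(-0.713 * 0.587, "b")
CR_B = build_table(-0.713 * 0.114, "b")
# Contributions to Cb = 0.564 * (B - Y) (signed bytes, typecode 'b').
CB_R = build_table(-0.564 * 0.299, "b")
CB_G = build_table(-0.564 * 0.587, "b")
CB_B = build_table(0.564 * (1 - 0.114), "b")


def rgb_to_ycrcb_lut(r: int, g: int, b: int) -> Tuple[int, int, int]:
    """Convert one 8-bit RGB pixel using three lookups and two additions per component."""
    y = Y_R[r] + Y_G[g] + Y_B[b]
    cr = CR_R[r] + CR_G[g] + CR_B[b]
    cb = CB_R[r] + CB_G[g] + CB_B[b]
    return y, cr, cb
```

With these coefficients every table entry fits in one byte, and the per-pixel work involves no multiplications or divisions. Because each entry is rounded separately, a result can differ from the floating point value by up to 1.5 per component.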
These and other aspects and advantages of the invention will become apparent from the following detailed description and the accompanying drawings, which illustrate by way of example the features of the invention.