1. Field of the Invention
The present invention relates generally to compression of images, and more particularly to methods and devices for compressing digitized fingerprint images to a specific compression ratio.
2. Description of the Related Art
Compressing computer information is frequently a desirable way to store more data in less space. If each file or record can be made smaller without loss of data, then the cost of storage is reduced. This is particularly true when the number of files or records is very large.
There are many possible ways in which files may be compressed, but generally speaking they may be divided into two broad categories:
First, there are information compression methods (or algorithms) that preserve the original information exactly. These may be called lossless compression methods (or algorithms or encoders). For example, in a compression method called “run length encoding,” one compresses a string of digits such as “1000000200500001000000000300” down to something similar to “1X2X5X1X3X” where each “X” represents a small packet of information containing the duplicated character (in all of these examples, the duplicated character is a “0”) and also the number of times the duplicated character occurred in the original string. Later, one may decompress this information by restoring the correct number of the duplicated values at each position. Following decompression, the original information is reconstructed precisely as it was before compression.
In this example, 28 data values were compressed down to about 20 data values depending on the specifics of the implementation. Run length encoding is particularly well-suited for use in compressing data sets that contain many consecutive duplicated values, such as text or technical drawings printed in black upon a uniform white background. Other types of information, containing other types of repetitive patterns that allow for compression, should preferably be compressed using other compression schemes optimized in accordance with the nature of the repetitive patterns in the information. More generally, lossless compression schemes can automatically identify repeated patterns in the information and replace such patterns with identification tags that occupy little space. Compression methods or encoders that first identify and then replace all types of repetitive patterns with tags, in general, are called entropy encoders, since they compress information in whatever way they can to maximize the entropy (or randomness) of the information after compression. A widely used lossless image compression method is named TIFF, or “Tagged Image File Format.” The best of the lossless compression schemes can, at times, reduce the volume of information by 50% or more, depending upon how many repetitive patterns the information contains. Lossless methods of compression are mandatory when one is compressing text, computer object programs, financial databases, and the like.
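The run length encoding example above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `rle_encode` and `rle_decode` are hypothetical, and every run, including a run of length one, is stored as a (character, count) pair, which is one of several possible packet layouts.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]

def rle_decode(pairs):
    """Restore the original string by repeating each character count times."""
    return "".join(ch * count for ch, count in pairs)

original = "1000000200500001000000000300"
encoded = rle_encode(original)

# Lossless: decoding restores the input exactly.
restored = rle_decode(encoded)

# Each pair holds two data values (the character and its count).
values_after = 2 * len(encoded)
```

With this particular layout, the 28 input values become 10 pairs, or 20 stored values. A layout that stores non-repeated characters without a count would use fewer.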
Second, there are information compression methods (or algorithms) that do not preserve the original information precisely, but that lose some information. These are called lossy compression methods (or algorithms or encoders).
Lossy compression methods are particularly well suited for digitized images because it is possible to lose substantial amounts of information and still have a recognizable image. These lossy methods achieve a much higher degree of compression than can be achieved using lossless methods. The difficulty is to choose a lossy method and degree of compression which loses enough information to make the effort worthwhile and yet retains enough detail to make the image useful. In other words, a method should be selected that causes the image to appear the same to a human observer both before and after compression.
Consider the fingerprint image shown in FIG. 1. Assume that this image is a grayscale image, where every pixel is represented within a computer by a numeric value between 0 and 255, with each possible value representing a different shade of gray. (Because of the process used to print this patent, the image shown in FIG. 1 is actually a bitmap image, where every image pixel is either pure black or pure white. For purposes of this discussion, the reader is asked to assume that the pixels in FIG. 1 appear in many different shades of gray ranging from very light gray to very dark gray.)
One can compress such a gray-scale fingerprint image using a lossless compression method, such as run length encoding or TIFF, and one may thereby preserve all of the image's information content. But lossless compression cannot achieve sufficient compression to minimize storage space and transmission times adequately.
As a very simple example of how one may increase compression by permitting some unimportant information to be lost, one may proceed as follows: Just prior to carrying out such a lossless compression, one may, as a preliminary step, transform all of the light gray to white pixels falling in the regions surrounding the central fingerprint image, including any darker smudges, into pure white pixels. The information representing this image now contains many consecutive values of pure white used to represent the brightness of regions surrounding the central fingerprint image. In this simple manner, and with very little (if any) loss of useful information (from the point of view of fingerprint analysis and comparison), one thereby transforms a normal fingerprint image into a slightly modified image that is now more compressible than the original.
The image is then compressed using, for example, run length encoding (as was explained above). The run length encoding compresses all of the values representing pixel brightness in the regions surrounding the central fingerprint image down into a very small amount of information. Adopting this very simple lossy compression scheme can double the amount of compression that is achievable with virtually no loss of useful information.
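The simple lossy scheme just described might be sketched as follows. The sketch makes several assumptions that are not part of the description above: the background is identified by a plain brightness threshold (the hypothetical value 200), which would not remove darker smudges, and the sample row of pixel values is invented purely for illustration.

```python
from itertools import groupby

WHITE = 255  # grayscale value for pure white

def whiten_background(pixels, threshold=200):
    """Lossy step: force every light-gray pixel to pure white."""
    return [WHITE if p >= threshold else p for p in pixels]

def run_lengths(pixels):
    """Lossless step: collapse runs of equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(pixels)]

# One invented row: noisy light background, a few dark ridge pixels, background again.
row = [250, 247, 252, 249, 251, 90, 40, 120, 35, 248, 253, 250, 246]

runs_before = run_lengths(row)                     # no two neighbors are equal
runs_after = run_lengths(whiten_background(row))   # background collapses into two runs
```

Because the slightly varying background values no longer differ from one another, the run length encoder has far fewer runs to store, while the dark ridge pixels are left unchanged.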
Practical lossy image compression methods are in widespread use today. They work in essentially the same manner as the method just described, but they are considerably more sophisticated in the way in which they choose which information to discard.
JPEG (See ISO International Standard 10918-1, “Information Technology—Digital Compression and Coding of Continuous Tone Still Images,” “Part 1—Requirements and Guidelines”) is a lossy compression method (or algorithm or encoder) that is used by the UK's Home Office to compress fingerprint images. It is typical of such methods (or algorithms or encoders).
JPEG begins by breaking up an image into small square arrays of pixels. Each square array of pixels is represented within JPEG by a square array of numeric values each representing the grayscale values of a pixel. JPEG next performs what is called a two-dimensional cosine transformation. This is performed on all the pixel grayscale values within each square array, thus transforming each square array of pixel grayscale values into an identically-sized, corresponding square array of two-dimensional frequency values. This process is essentially analogous to performing a two-dimensional Fourier transformation upon the values.
Then, using an array of integers identical in size to the array of frequency values, and using array (or matrix) division, JPEG divides each individual frequency component by a corresponding integer value. This is done using integer division—any remainder is discarded. JPEG thereby achieves a carefully-controlled truncation of each frequency value. This is the point at which some information is lost. The truncation process forces many of the frequency values to go to zero, permitting them to be compressed through run length or entropy encoding applied as a later step. This truncation process also permits many of the remaining truncated frequency values to be represented by fewer than eight data bits, also at a later step, thereby reducing substantially the number of data bits that are needed to represent the remaining non-zero frequency components.
After arranging all of the truncated frequency values in order by frequency, to form a linear string of data values, JPEG finally carries out the compression step. First, it compresses the strings of zero values through run length encoding (or entropy encoding). Then, it uses as few data bits as possible to represent each of the truncated, non-zero data values that remain in this linear string. In this manner, JPEG in general often achieves “compression ratios” (or, more generally speaking, magnitudes of compression, however expressed) approaching or exceeding 90-to-1 without noticeable loss of image quality visible to the human eye (when an image is later decompressed). Since the truncation process is applied only to the image information when it is represented as spatial frequency values (and not when it is represented as grayscale values), the human eye has great difficulty discerning any loss of information in the decompressed image.
JPEG allows one to adjust the compression ratio, or magnitude of compression, that is achieved by JPEG. To do this, one varies the values of the integers that are contained within the integer array which JPEG uses as a divisor when truncating the frequency component values. In general, larger integers achieve more compression and less accuracy, and smaller integers achieve less compression and greater accuracy.
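The transformation and truncation steps described above can be illustrated with a simplified sketch. It is not an implementation of the JPEG standard: it assumes an 8-by-8 block, uses uniform divisor arrays (real divisor arrays vary with frequency), follows the truncating division described above, and omits the later ordering and encoding steps. All names and sample values are hypothetical.

```python
import math

N = 8  # assumed block size

def dct_2d(block):
    """Orthonormal two-dimensional cosine transform (DCT-II) of an N-by-N block."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def truncate(coeffs, divisors):
    """Divide each frequency value by its divisor and discard the remainder."""
    return [[int(c / d) for c, d in zip(c_row, d_row)]
            for c_row, d_row in zip(coeffs, divisors)]

def count_zeros(values):
    return sum(v == 0 for row in values for v in row)

# Invented block: a smooth grayscale gradient.
block = [[100 + 4 * x + 2 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)

small_divisors = [[4] * N for _ in range(N)]
large_divisors = [[32] * N for _ in range(N)]

zeros_small = count_zeros(truncate(coeffs, small_divisors))
zeros_large = count_zeros(truncate(coeffs, large_divisors))
```

Any frequency value that truncates to zero under the small divisors also truncates to zero under the large ones, so larger divisors can only increase the number of zeros available to the later run length or entropy encoding step.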
As an example of compression ratio (or magnitude of compression) adjustment, consider an MPEG digital television system, such as domestic satellite television, that transmits compressed digital representations of video images through channels that have fixed information transfer rates per second. MPEG digitized video encoders, which are widely used to compress and encode motion-picture and video images prior to their transmission, encode many individual frames of video using what amounts to JPEG. Thus, JPEG-style compression is utilized within MPEG video encoders.
MPEG video encoders constantly monitor and adjust the average amount of compression that they are achieving. Whenever the video information per second generated by an MPEG encoder increases and threatens to exceed the available bandwidth of an information channel, the MPEG encoder adjusts upwards the integer values within the array of integers used for truncation. This increases the degree of compression and reduces the amount of information generated per second. When the information generated per second later decreases, the encoder adjusts these same values downwards again and thereby increases the quality of the transmitted video. In this manner, the video images transmitted are kept as accurate as possible consistent with preventing the encoder's outflow of information from exceeding the channel's capacity in information per second.
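The feedback behavior described above might be sketched as a single adjustment step. This is a minimal sketch and not the rate control of any actual MPEG encoder: the function name, the single scalar `scale` standing in for the whole divisor array, the step size, and the bounds of 1 and 31 are all illustrative assumptions.

```python
def adjust_scale(scale, bits_produced, bits_budget, step=1, lowest=1, highest=31):
    """Raise the divisor scale when output exceeds the channel budget, lower it otherwise."""
    if bits_produced > bits_budget:
        scale += step   # more truncation: more compression, less accuracy
    elif bits_produced < bits_budget:
        scale -= step   # less truncation: less compression, more accuracy
    return max(lowest, min(highest, scale))

# Invented example: output per interval is modeled as scene complexity / scale.
budget = 1000
scale = 8
history = []
for complexity in [8000, 12000, 12000, 12000, 6000, 6000]:
    produced = complexity / scale
    scale = adjust_scale(scale, produced, budget)
    history.append(scale)
```

In this toy model the scale climbs while the busier scene overshoots the budget and falls again once the scene becomes simpler.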
As this MPEG example illustrates, most lossy compression methods (or algorithms or encoders) accept as a control input some parameter or value or array. By varying this parameter, one may vary, up or down, the amount of overall compression that is achieved. This enables one actively to adjust the tradeoff that occurs between the degree of compression that is achieved and the degree of information loss and image degradation that occurs.
In the discussion which follows, any such control parameter, regardless of its form, is called a “compression parameter.” Such a parameter can be a single numeric value that, in some cases, may be expressed as a requested or desired compression ratio, such as 9-to-1; or such a parameter can be an array of several numeric values, as it is in the case of JPEG and MPEG, that permits one, by varying different values in the array, to adjust independently the amount of information loss that occurs at different spatial frequencies, for example. Regardless of whether the compression parameter is expressed as a single value or as an array, in the discussion which follows, “variation” or “adjustment” of this “compression parameter” means any variation or adjustment that affects generally the amount of information that is lost during compression and that can be varied reasonably continuously (possibly in discrete steps) over a controllable range of adjustment.
In the field of fingerprint analysis, it is clearly desirable to compress fingerprint images both to achieve compact long-term storage and also to reduce transmission time and required channel capacity. However, a person accused of a crime may be set free or sentenced to a lengthy prison term based upon fingerprint images. Accordingly, it is essential to preserve the details of each fingerprint image to ensure the accuracy of the process that is carried out by a fingerprint examiner when comparing two fingerprints and by computers when searching through large databases of fingerprints.
The Federal Bureau of Investigation, in designing its databases that hold tens of millions of fingerprint images, has addressed these issues and has made decisions that have now become de facto standards for fingerprint image storage and transmission. The FBI digitizes the national fingerprint database by taking 500 pixel samples per linear inch of image, with a pixel grayscale resolution of 8 data bits (or one byte) per pixel, thus assigning a grayscale value of between 0 and 255 to each of 500 pixels in every linear inch of pixels. Each fingerprint is thus represented by roughly 5,000,000 to 10,000,000 bits of information. A rolled finger may be represented by an array of pixels 750 high by 800 wide (1.5 by 1.6 inches); a plain thumb impression may be represented by an array of pixels 1000 high by 500 wide (2.0 by 1.0 inches). (These are the normally recommended maximum sizes for such arrays. See Section 3.9.4 of CJIS-RS-0010(V7) Electronic Fingerprint Transmission specification (Jan. 29, 1999)).
Over the years, the FBI has investigated several different lossy compression methods and tested their use with fingerprint images. The FBI finally selected a lossy compression method named Wavelet Scalar Quantization (or WSQ). The FBI has also determined that a suitable balance (or compromise) between the cost of storage and transmission on the one hand and the preservation of image quality on the other hand is achieved by compressing each fingerprint image by 15-to-1 using the WSQ method of compression. Accordingly, an FBI standard now mandates that all criminal justice agencies throughout the country compress their fingerprint images using WSQ in such a manner as to achieve a compression ratio of 15-to-1. (For a detailed technical description of WSQ, see IAFIS-IC-0110(V3) Wavelet Scalar Quantization (WSQ) Grayscale Fingerprint Image Compression Specification (Dec. 19, 1997)). Programs for performing WSQ compression are widely available commercially. In the discussion which follows, a target compression ratio, such as the FBI's designated ratio of 15-to-1, is called a “desired compression ratio.”
The problem that remains is how to achieve this desired or mandated goal of 15-to-1 compression. In practice, very few fingerprint images are actually compressed by the ratio of 15-to-1. Even the FBI itself does not achieve this compression ratio in most cases.
This is because the degree of compression achieved depends to a large measure upon the nature of the fingerprint image undergoing compression. The more repeated patterns such an image contains, the more the image may be compressed by any given compression algorithm. Fingerprint images that contain larger areas of uniformity or larger areas of repeated spatial frequency patterns are thus more compressible than are fingerprint images that are more random.
Computer implementations of the WSQ method typically permit one to specify a compression parameter that may be thought of as a desired or requested compression ratio. However, since the compression ratio actually achieved always depends upon the particular fingerprint image data supplied, the compression ratio requested is not normally the ratio achieved when the WSQ method is carried out on a real fingerprint image.
As an illustration of this, FIG. 2 presents a columnar table that shows what typically happens when one uses the WSQ method to compress 140 different fingerprint images, each time providing the same compression parameter to the WSQ encoder, requesting a 9-to-1 compression ratio. FIG. 2 reveals that 67 of the 140 images, those represented by the chart's two central bars, were compressed such that their achieved compression ratios fall within the range that extends from 14-to-1 up to just below 16-to-1. The left-most bar indicates that eight of the 140 images were compressed by about 11-to-1, while the right-most bar indicates that four of the images were compressed by about 18-to-1. This chart illustrates that there are wide variations in the compressibility of actual fingerprint images when compressed using the WSQ method, even when the compression parameter is kept fixed at the value corresponding to a requested compression ratio of 9-to-1.
Even though the majority of the 140 fingerprint images were compressed somewhat close to the FBI's standard of 15-to-1, one cannot, in general, predict how much any specific fingerprint image is going to be compressed with the WSQ method. In fact, if one were to select another 140 images and compress them in this same manner, again setting the compression parameter to 9-to-1, one would very likely achieve a different result that might not center around 15-to-1.
Given this degree of variability, the vendors of automated fingerprint storage systems have to address the problem of achieving the FBI's desired goal of a 15-to-1 compression ratio very carefully. Using a sample, representative fingerprint image database, they must compress all the fingerprint images in the database using the WSQ algorithm with the compression parameter set to different values. They must then perform some type of statistical analysis of the results of these tests to select a compression parameter setting that maximizes the number of images compressed near to 15-to-1.
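The vendor calibration procedure described above might be sketched as follows. This is only an illustration of the general approach: the callable `achieved_ratio` is a hypothetical stand-in for running a WSQ encoder on an image and dividing the original size by the compressed size, the tolerance of plus or minus 1 around 15-to-1 is an assumed definition of "near," and the toy data are invented.

```python
def count_near_target(ratios, target=15.0, tolerance=1.0):
    """Count the achieved ratios that fall within tolerance of the target ratio."""
    return sum(abs(r - target) <= tolerance for r in ratios)

def pick_parameter(images, candidates, achieved_ratio, target=15.0, tolerance=1.0):
    """Return the candidate setting that puts the most sample images near the target."""
    def score(parameter):
        ratios = [achieved_ratio(image, parameter) for image in images]
        return count_near_target(ratios, target, tolerance)
    return max(candidates, key=score)

# Toy stand-in (assumption): each "image" is just a compressibility factor, and the
# achieved ratio is the requested ratio scaled by that factor.
def toy_achieved_ratio(image, parameter):
    return parameter * image

sample_images = [0.8, 1.0, 1.0, 1.1, 1.5]
best = pick_parameter(sample_images, [9, 12, 15, 18], toy_achieved_ratio)
```

Even the best setting in this toy sample leaves most images outside the tolerance band, which is the limitation of tuning a single fixed parameter against a sample database.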
The problem with this general approach, however, is that the results achieved later on, with actual images captured by the staff of a particular criminal justice agency, may vary from the results achieved by the vendor. Accordingly, computer code must normally be built into a fingerprint storage system using WSQ compression that retains some statistical history of the compression ratios actually achieved during production use of the system. This then enables a technician, as part of regular system maintenance, to adjust the compression parameter's factory setting, thereby complicating system maintenance.
In actual field experience working with images from the equipment of various vendors, the present inventor has found that the compression parameter is frequently adjusted so that the WSQ method achieves, on the average, a compression ratio of 20-to-1 or higher. Thus, two-thirds or more of the fingerprint images may wind up having compression ratios above the FBI's mandated 15-to-1 ratio.
The above paragraphs describe what the present inventor has found to be the generally accepted practice throughout the criminal justice community. The present practice of adjusting a compression parameter to achieve average results permits many fingerprints to be over-compressed, with possible loss of essential information, while many others are under-compressed, placing increased and unnecessary demands upon storage facilities and increasing transmission times.