The present invention relates generally to methods and apparatus for image processing, and specifically to methods for binarization of gray-level images.
Methods of image binarization are well-known in the art. Generally speaking, these methods take a gray-level image, in which each pixel has a corresponding multi-bit gray-level value, and convert it into a binary image, in which each pixel has a binary value, either black (foreground) or white (background). Binarization is used particularly in simplifying document images, in order to process and store information that is printed or written on the document.
The fastest and simplest binarization method is to fix a single threshold and to classify all pixels having a gray-level value above the threshold as white, and those below the threshold as black. This method, however, frequently results in loss or confusion of the information contained in the gray-level image. This information is embodied mainly in edges that appear in the image, and depends not so much on the absolute brightness of the pixels as on their relative brightness in relation to their neighbors. Thus, depending on the choice of threshold, a meaningful edge in the gray-level image will disappear in the binary image if the pixels on both sides of the edge are binarized to the same value. On the other hand, artifacts in the binary image with the appearance of edges may occur in an area of continuous transition in the gray-level image, when pixels with very similar gray-level values fall on opposite sides of the chosen threshold.
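The two failure modes of a fixed threshold can be illustrated with a minimal sketch. The function name `binarize_fixed`, the pixel values in `patch`, and the treatment of a pixel exactly equal to the threshold as black are all assumptions made for illustration; none of them is taken from the tables of the present application.

```python
def binarize_fixed(image, threshold):
    """Map each pixel to 1 (white) if above the threshold, else 0 (black).

    Assumption: a pixel exactly equal to the threshold is treated as black.
    """
    return [[1 if v > threshold else 0 for v in row] for row in image]


# Hypothetical 2x4 patch, invented for this illustration.
patch = [
    [98, 99, 101, 102],  # smooth ramp whose values straddle the threshold
    [40, 40, 60, 60],    # genuine step of 20, entirely below the threshold
]

binary = binarize_fixed(patch, 100)
for row in binary:
    print(row)
# Row 0 becomes [0, 0, 1, 1]: an artifact edge appears inside the ramp.
# Row 1 becomes [0, 0, 0, 0]: the genuine step of 20 is lost.
```

In this sketch the gap of 2 between the values 99 and 101 generates an edge in the binary output, while the gap of 20 in the second row leaves no trace.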
These problems are exemplified by the following tables. Table I represents pixel values in a 5×5 image, wherein higher values represent brighter pixels:
If this image is binarized using a threshold of 85, the result will be as shown in Table II:
The large gaps surrounding the pixel in the lower right corner are represented in the binarized image, but all of the other gaps are lost. (The term “gap” is used in the context of the present patent application and in the claims to denote the absolute difference in gray level between a pair of neighboring pixels.)
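The definition of a gap given above can be sketched in a few lines. This is only an illustration: the choice of horizontal and vertical neighbors (a 4-connected neighborhood) and the names `neighbor_pairs` and `gap_sizes` are assumptions, since the definition itself does not say which pixels count as neighbors.

```python
def neighbor_pairs(image):
    """Yield (a, b) gray-level pairs for each pair of adjacent pixels.

    Assumption: only horizontal and vertical neighbors are paired.
    """
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                yield image[r][c], image[r][c + 1]  # right-hand neighbor
            if r + 1 < rows:
                yield image[r][c], image[r + 1][c]  # neighbor below


def gap_sizes(image):
    """Return the absolute gray-level difference for every neighbor pair."""
    return [abs(a - b) for a, b in neighbor_pairs(image)]


# Hypothetical 2x2 image: one bright pixel in the lower right corner.
print(gap_sizes([[10, 12], [10, 90]]))  # [2, 0, 78, 80]
```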
On the other hand, if the threshold is set to 15, the resulting binary image will be as shown in Table III:
The gap of size 6 between rows 2 and 3, which probably corresponds to a real edge in the image, is represented in the binary image. The large gaps in the lower right corner are lost, however. At the same time, small gaps (of size 2) between rows 4 and 5, which could be due to noise, are represented in the binary image. Thus, significant edges in the gray-level image are lost, while insignificant gaps are allowed to generate artifacts.
For the reasons exemplified by these tables, practical binarization algorithms allow the binarization threshold to vary. These algorithms generally make assumptions about image content in determining the best threshold to use over the whole image or in specific areas of the image. The assumptions may relate to the sizes of objects in the image, histogram properties, noise levels or other image properties. Because they are dependent on such assumptions, binarization algorithms tend to work well on the specific type of images or objects for which they are designed, but to fail on others. For example, a text-oriented binarization algorithm can work well on a document image that contains text on a plain background, but may fail when the background is textured. Furthermore, document images frequently contain salient features other than simple text, such as symbols, lines and boxes, which are important to preserve in the binary image and are lost when text-oriented binarization is used.
Image “trinarization” has been suggested as a method for processing gray-level images, although not in the context of document imaging. Typically, a range of “gray” pixel values is defined intermediate the low values of the black range and the high values of the white range. The resultant trinary image has been found to be useful in a number of image recognition and image correlation applications.
For example, U.S. Pat. No. 5,067,162, whose disclosure is incorporated herein by reference, describes a method and apparatus for verifying identity using image correlation, typically based on fingerprint analysis. In order to eliminate uncertainty and variability of edge determinations in the fingerprint image, a trinarization technique is used to divide all pixels into one of three levels: black, gray or white. A histogram of gray values of the gray-scale image is determined, and black-gray and gray-white threshold values are established according to equal one-third distributions. All pixels having gray values darker than the black-gray threshold value are converted into black pixels; all pixels having gray values lighter than the gray-white threshold value are converted into white pixels; and all other pixels are ignored in subsequent correlation calculations. Thus, the black and white pixels represent with high confidence ridge and valley regions of the fingerprint image, while the gray pixels represent the transition regions between the ridges and valleys.
As another example, U.S. Pat. No. 5,715,325, whose disclosure is incorporated herein by reference, describes apparatus and methods for detecting a face in a video image. Face images are processed to eliminate fine detail and provide a hard contrast, resulting in an image that is nearly binarized (having dark blocks and light blocks) but still contains some blocks that cannot be clearly categorized. To promote simplicity in processing, the image is treated as a trinary image, wherein dark regions are identified with negative ones (−1's), light regions are identified with ones (1's), and undefinable regions are identified with zeros (0's). The trinary image is then compared with different face templates to find an optimal match.
It is an object of the present invention to provide improved methods and apparatus for image processing, and particularly for processing of document images.
It is a further object of some aspects of the present invention to provide improved methods for image binarization.
It is still a further object of some aspects of the present invention to provide a method for trinarization of an image.
In preferred embodiments of the present invention, a gray-level input image is trinarized, generally as a preparatory step in generating a binary output image. The input image is first analyzed in order to characterize variations among the gray-level values of the pixels in the image, such as gaps between the values of neighboring pixels. Based on these variations, upper and lower binarization thresholds are determined, such that pixels having gray-level values above the upper threshold are classified as white, and those below the lower threshold are classified as black. The pixels having gray-level values between the lower and upper thresholds, referred to hereinafter as intermediate or gray pixels, are then preferably processed so as to determine an optimal classification of these pixels as black or white.
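The three-way classification described above can be sketched as follows. The label constants, the function name `trinarize`, and the treatment of a pixel exactly equal to either threshold as intermediate are assumptions made for illustration.

```python
BLACK, GRAY, WHITE = 0, 1, 2  # hypothetical labels for the three classes


def trinarize(image, lower, upper):
    """Label each pixel as BLACK, GRAY (intermediate) or WHITE.

    Assumption: values equal to either threshold count as intermediate.
    """
    def label(value):
        if value > upper:
            return WHITE
        if value < lower:
            return BLACK
        return GRAY

    return [[label(v) for v in row] for row in image]


print(trinarize([[5, 50, 95]], lower=20, upper=80))  # [[0, 1, 2]]
```

Only the pixels labeled `GRAY` need any further processing; the others already have their final binary value.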
Preferably, the upper and lower binarization thresholds are chosen in a manner designed to increase the number of significant edges in the input image that are preserved in the output binary image, while decreasing the number of artifact edges that occur. Generating the binary image in this manner conveys the salient features of the input image clearly, substantially without dependence on the type of image content. A range of different threshold values is evaluated against the gray-level variations among the pixels, so as to choose optimal upper and lower thresholds. Preferably, the evaluation is based on a statistical analysis of the gray-level gaps between the pixels. Alternatively or additionally, other statistical analyses and information cues, such as actual edges found by edge detection algorithms, may be used in choosing the thresholds.
In some preferred embodiments of the present invention, the intermediate pixels are classified based on their relation to other neighboring pixels. Preferably, pixels that are significantly brighter than an average of their neighbors are classified as white, while those significantly darker than the average are classified as black. This classification need not depend on the chosen upper and lower thresholds. Pixels that do not differ significantly from the average of their neighbors are typically classified using a threshold, such as an average of the upper and lower thresholds.
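A minimal sketch of this neighborhood-based classification is given below. Several details are assumptions made for illustration and are not specified above: the neighborhood is taken to be the up to eight surrounding pixels, "significantly" is modeled by a hypothetical `margin` parameter, and the names `neighbor_mean` and `classify_intermediate` are invented.

```python
def neighbor_mean(image, r, c):
    """Average gray level of the pixels surrounding (r, c), excluding itself.

    Assumption: the neighborhood is the surrounding 3x3 window, clipped at
    the image border.
    """
    rows, cols = len(image), len(image[0])
    values = [
        image[rr][cc]
        for rr in range(max(0, r - 1), min(rows, r + 2))
        for cc in range(max(0, c - 1), min(cols, c + 2))
        if (rr, cc) != (r, c)
    ]
    # A 1x1 image has no neighbors; fall back to the pixel's own value.
    return sum(values) / len(values) if values else image[r][c]


def classify_intermediate(image, r, c, lower, upper, margin):
    """Return 1 (white) or 0 (black) for an intermediate pixel at (r, c)."""
    value = image[r][c]
    mean = neighbor_mean(image, r, c)
    if value - mean > margin:   # significantly brighter than its neighbors
        return 1
    if mean - value > margin:   # significantly darker than its neighbors
        return 0
    # Otherwise fall back to the average of the two thresholds.
    return 1 if value > (lower + upper) / 2 else 0


def make_patch(center):
    """Hypothetical 3x3 patch: uniform value 50 around a chosen center."""
    return [[50, 50, 50], [50, center, 50], [50, 50, 50]]


print(classify_intermediate(make_patch(70), 1, 1, 40, 80, margin=10))  # 1
print(classify_intermediate(make_patch(55), 1, 1, 40, 80, margin=10))  # 0
```

In the first call the center pixel is 20 levels brighter than its surroundings and is classified as white on that basis alone. In the second call the difference of 5 is within the margin, so the pixel is compared with the threshold average of 60 and classified as black.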
Alternatively, other methods may be applied to classify or otherwise process the intermediate pixels. In one preferred embodiment, a text-oriented binarization algorithm is applied to the gray-level image, and the intermediate pixels are classified using the results of this algorithm. In another preferred embodiment, the gray-level values of the intermediate pixels are stored along with the binary values of the other pixels. Storing the image in this manner requires far less memory than the full gray-level image, but nearly all of the significant information in the image is preserved for use when the image is recalled for later processing or viewing by a human operator.
There is therefore provided, in accordance with a preferred embodiment of the present invention, a method for image binarization, including:
receiving a gray-level input image including a plurality of pixels having respective gray-level values;
determining a lower threshold and an upper threshold, which is greater than the lower threshold by a selected difference;
assigning a first binary value to the pixels in the gray-level image having gray-level values above the upper threshold and a second binary value to the pixels in the gray-level image having gray-level values below the lower threshold; and
processing the pixels in an intermediate group having gray-level values between the lower and upper thresholds so as to determine optimal assignments of the pixels in the intermediate group to the first and second binary values.
Preferably, determining the lower and upper thresholds includes analyzing variations among the gray-level values of the pixels in the input image and determining the thresholds responsive to the analyzed variations. Most preferably, analyzing the variations among the gray-level values includes finding edges in the input image, and determining the thresholds includes selecting the thresholds so as to preserve the edges in an output image made up of the assigned binary values.
Additionally or alternatively, analyzing the variations among the gray-level values includes finding gaps between the gray levels of neighboring pixels, and determining the thresholds includes selecting the thresholds so as to preserve the gaps that are significant in preference to the gaps that are not significant in an output image made up of the assigned binary values. Preferably, selecting the thresholds includes defining the gaps that are significant as those whose absolute magnitude is greater than the selected difference between the upper and lower thresholds. Most preferably, selecting the thresholds includes selecting the upper and lower thresholds so as to maximize a merit score computed for multiple different pairs of upper and lower thresholds, wherein the score correlates positively with the number of significant gaps preserved in the output image by the selected thresholds, and correlates negatively both with the number of gaps that are not significant that are preserved and with the number of significant gaps that are not preserved in the output image by the selected thresholds.
Preferably, determining the thresholds includes selecting the thresholds so as to preserve edge information in an output image made up of the assigned binary values. Most preferably, selecting the thresholds includes choosing the thresholds substantially without dependence on the type of image feature to which the information belongs. Additionally or alternatively, selecting the thresholds includes finding an optimal average value of the upper and lower thresholds and finding an optimal value of the selected difference between the thresholds.
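One possible form of the merit score, and of the search over the average value and the difference of the thresholds, is sketched below. It is an illustration under stated assumptions rather than the scoring actually used: each term is given unit weight, a gap is counted as "preserved" when its two pixels fall on opposite sides of the midpoint between the thresholds (a stand-in for the final binary decision), and the names `merit_score`, `search_thresholds`, `centers` and `diffs` are hypothetical.

```python
def merit_score(pairs, lower, upper):
    """Score a (lower, upper) threshold pair against neighboring-pixel pairs.

    pairs: iterable of (a, b) gray levels of neighboring pixels.
    Assumptions: unit weights; a gap is "preserved" when the midpoint of
    the two thresholds separates its two pixels.
    """
    diff = upper - lower
    mid = (lower + upper) / 2
    score = 0
    for a, b in pairs:
        significant = abs(a - b) > diff
        preserved = (a > mid) != (b > mid)
        if significant and preserved:
            score += 1  # significant gap kept
        elif significant or preserved:
            score -= 1  # significant gap lost, or insignificant gap kept
    return score


def search_thresholds(pairs, centers, diffs):
    """Try every (average, difference) combination; return the best
    (score, lower, upper) triple, keeping the first one found on ties."""
    pairs = list(pairs)
    best = None
    for center in centers:
        for diff in diffs:
            lower, upper = center - diff / 2, center + diff / 2
            score = merit_score(pairs, lower, upper)
            if best is None or score > best[0]:
                best = (score, lower, upper)
    return best


# Hypothetical neighbor pairs: two large gaps and two noise-sized gaps.
pairs = [(10, 90), (12, 88), (50, 52), (49, 51)]
print(search_thresholds(pairs, centers=[50, 70], diffs=[20, 100]))
# (2, 60.0, 80.0)
```

With an average of 50 the two noise-sized gaps straddle the midpoint and are penalized, so the search in this example settles on an average of 70 and a difference of 20.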
Further preferably, processing the pixels in the intermediate group includes analyzing variations among the gray-level values of the pixels in the input image and determining the assignments of the pixels to the first and second binary values responsive to the analyzed variations. Most preferably, determining the assignments responsive to the analyzed variations includes finding a significant difference between the gray-level value of one of the pixels and the gray-level values of other pixels in its neighborhood, and assigning the pixel to the first or second binary value responsive to the difference.
In a preferred embodiment, processing the pixels in the intermediate group includes applying a binarization method optimized for text to determine the optimal assignments of the pixels in the intermediate group.
Preferably, the method includes outputting a binary image made up of the assigned binary values of the pixels.
There is also provided, in accordance with a preferred embodiment of the present invention, a method for processing a gray-level input image, which includes a plurality of pixels having respective gray-level values, the method including:
analyzing variations among the gray-level values of the pixels in the input image;
responsive to the analyzed variations, determining a lower threshold and an upper threshold, which is greater than the lower threshold by a selected gap size;
assigning a first binary value to the pixels in the gray-level image having gray-level values above the upper threshold and a second binary value to the pixels in the gray-level image having gray-level values below the lower threshold; and
generating a trinary output image, in which the pixels assigned the first and second binary values are represented by their respective binary values, and the pixels in an intermediate group having gray-level values between the lower and upper thresholds are represented by their respective gray-level values.
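The trinary output image described in the steps above might be represented as sketched below: a plane of small labels plus a list holding the gray levels of the intermediate pixels only. The storage layout, the label constants, the names `encode_trinary` and `decode_trinary`, and the display values 0 and 255 are assumptions made for illustration.

```python
BLACK, GRAY, WHITE = 0, 1, 2  # hypothetical labels


def encode_trinary(image, lower, upper):
    """Return (labels, gray_values).

    labels has one label per pixel; gray_values keeps, in raster order,
    the original gray level of each intermediate pixel only.
    Assumption: values equal to either threshold count as intermediate.
    """
    labels, gray_values = [], []
    for row in image:
        label_row = []
        for value in row:
            if value > upper:
                label_row.append(WHITE)
            elif value < lower:
                label_row.append(BLACK)
            else:
                label_row.append(GRAY)
                gray_values.append(value)
        labels.append(label_row)
    return labels, gray_values


def decode_trinary(labels, gray_values, black=0, white=255):
    """Rebuild a displayable image; intermediate pixels regain their
    stored gray levels, the others take the given black or white value."""
    grays = iter(gray_values)
    return [
        [black if lab == BLACK else white if lab == WHITE else next(grays)
         for lab in row]
        for row in labels
    ]


image = [[5, 50, 200], [90, 10, 60]]  # hypothetical gray levels
labels, grays = encode_trinary(image, lower=20, upper=80)
print(labels)                         # [[0, 1, 2], [2, 0, 1]]
print(grays)                          # [50, 60]
print(decode_trinary(labels, grays))  # [[0, 50, 255], [255, 0, 60]]
```

In this layout a full gray-level value is kept only for the intermediate pixels, so the storage needed shrinks as the intermediate group shrinks.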
In a preferred embodiment, generating the trinary output image includes displaying the output image. In another preferred embodiment, generating the trinary output image includes storing the output image in a memory.
There is additionally provided, in accordance with a preferred embodiment of the present invention, apparatus for image binarization, including an image processor, which is coupled to receive a gray-level input image including a plurality of pixels having respective gray-level values, and which is adapted to determine a lower threshold and an upper threshold, which is greater than the lower threshold by a selected difference, to assign a first binary value to the pixels in the gray-level image having gray-level values above the upper threshold and a second binary value to the pixels in the gray-level image having gray-level values below the lower threshold, and to process the pixels in an intermediate group having gray-level values between the lower and upper thresholds so as to determine optimal assignments of the pixels in the intermediate group to the first and second binary values.
There is further provided, in accordance with a preferred embodiment of the present invention, apparatus for processing a gray-level input image, which includes a plurality of pixels having respective gray-level values, the apparatus including an image processor, which is adapted to analyze variations among the gray-level values of the pixels in the input image and, responsive to the analyzed variations, to determine a lower threshold and an upper threshold, which is greater than the lower threshold by a selected gap size, and to assign a first binary value to the pixels in the gray-level image having gray-level values above the upper threshold and a second binary value to the pixels in the gray-level image having gray-level values below the lower threshold, thus to generate a trinary output image, in which the pixels assigned the first and second binary values are represented by their respective binary values, and the pixels in an intermediate group having gray-level values between the lower and upper thresholds are represented by their respective gray-level values.
In a preferred embodiment, the apparatus includes a display, which is coupled to the processor so as to receive and display the trinary output image. In another preferred embodiment, the apparatus includes a storage memory, which is coupled to the processor so as to receive and store the trinary output image.
There is moreover provided, in accordance with a preferred embodiment of the present invention, a computer software product for processing an input image, including a computer-readable medium having program instructions stored therein, which instructions, when read by a computer, cause the computer to receive a gray-level input image including a plurality of pixels having respective gray-level values, to determine a lower threshold and an upper threshold, which is greater than the lower threshold by a selected difference, to assign a first binary value to the pixels in the gray-level image having gray-level values above the upper threshold and a second binary value to the pixels in the gray-level image having gray-level values below the lower threshold, and to process the pixels in an intermediate group having gray-level values between the lower and upper thresholds so as to determine optimal assignments of the pixels in the intermediate group to the first and second binary values.
There is furthermore provided, in accordance with a preferred embodiment of the present invention, a computer software product for processing a gray-level input image, which includes a plurality of pixels having respective gray-level values, the product including a computer-readable medium having program instructions stored therein, which instructions, when read by a computer, cause the computer to analyze variations among the gray-level values of the pixels in the input image and, responsive to the analyzed variations, to determine a lower threshold and an upper threshold, which is greater than the lower threshold by a selected gap size, to assign a first binary value to the pixels in the gray-level image having gray-level values above the upper threshold and a second binary value to the pixels in the gray-level image having gray-level values below the lower threshold, and to generate a trinary output image, in which the pixels assigned the first and second binary values are represented by their respective binary values, and the pixels in an intermediate group having gray-level values between the lower and upper thresholds are represented by their respective gray-level values.
The present invention will be more fully understood from the following detailed description of the preferred embodiments thereof, taken together with the drawings in which: