1. Field of the Invention
The present invention relates in general to a method and system for converting gray-scale images to binary images that employs fuzzy reasoning to calculate an optimal binarization threshold value.
2. Description of the Background Art
Conversion of gray-scale digital images to binary images is of special interest because an image in binary format, in which a binary value is assigned to each of the image's pixels, can be processed with very fast logical (Boolean) operators. A binary one value indicates that the pixel belongs to the image foreground, which may represent an object in the image, while a binary zero value indicates that the pixel is darker and belongs to the image's background. Since most image display systems and software employ gray-scale images of 8 or more bits per pixel, the binarization of these images usually uses the two extreme gray tones, black and white, which are ordinarily represented by 0 and 255, respectively, in an 8-bit gray-scale display environment.
Image thresholding is the simplest image segmentation approach for converting a gray-scale image to a binary image. It is in effect a pattern classification procedure in which only one input feature is involved, this being the pixel intensity value. Usually a binary image is obtained from an 8-bit gray-scale image by thresholding the image and assigning either the low binary value (0) or the high value (255) to each gray level, depending on which side of the chosen threshold it falls. The chosen threshold is therefore critically important, since it controls the binary pattern classification that is obtained from the gray-scale image. The key issue is to choose an optimal threshold so that the number of misclassified image pixels is kept as low as possible. Since images can differ substantially from one another depending on the objects contained therein, the optimal threshold value can vary considerably from one image to the next. Thus, merely setting the threshold at, for example, the average pixel intensity value of the gray-scale image will probably not provide the optimal threshold. If the threshold is selected incorrectly, substantial image information will likely be lost in the conversion to binary.
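The thresholding step described above can be illustrated with a minimal sketch. The function name `binarize`, the sample image, and the convention that a pixel strictly greater than the threshold is mapped to the high value are assumptions made for illustration; the text does not specify how a pixel exactly equal to the threshold is treated.

```python
def binarize(pixels, threshold, low=0, high=255):
    """Map each gray level to `high` if it exceeds `threshold`, else to `low`.

    `pixels` is a list of rows of 8-bit gray levels (0..255).
    The strict '>' comparison is an illustrative assumption.
    """
    return [[high if p > threshold else low for p in row] for row in pixels]


# Hypothetical 3x3 gray-scale image: dark background, bright object on the right.
image = [
    [12, 40, 200],
    [35, 220, 210],
    [18, 25, 190],
]

binary = binarize(image, threshold=128)
for row in binary:
    print(row)
```

With a threshold of 128 the bright pixels are assigned 255 and the rest 0. A poorly chosen threshold (for example, one below 12 or above 220 for this image) would assign every pixel to a single class and lose the object entirely.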
Numerous techniques have been employed to address the foregoing issue. The most accurate of these are non-interactive techniques that do not require selection of any process parameters to identify the optimal threshold. Such techniques automatically select the appropriate threshold based on an analysis of each image to be converted. An example of such a technique is disclosed by N. Otsu in A Threshold Selection Method From Gray-Level Histograms, IEEE Transactions on Systems, Man, and Cybernetics, 9(1):62-66 (1979) (hereinafter referred to as the Otsu method). In the Otsu method, the optimal threshold is determined by minimizing the within-class variance of the two classes, which is equivalent to maximizing the between-class variance because the total variance of the image is fixed. In other words, the means of the two classes (background and foreground) should be as well separated as possible and the variances within both classes should be as small as possible. In effect, for a histogram with two distinct peaks, the Otsu method selects a threshold near the lowest point between the two classes.
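As a rough illustration of this kind of automatic selection, the following sketch searches a gray-level histogram for the threshold that best separates the two class means. It is written in the form of maximizing the between-class variance, which is equivalent to minimizing the within-class variance because the total variance is fixed. The function name `otsu_threshold`, the convention that levels less than or equal to the threshold form the lower class, and the sample histogram are assumptions for illustration, not details taken from the cited paper.

```python
def otsu_threshold(hist):
    """Return the gray level t that maximizes the between-class variance.

    `hist[g]` is the number of pixels with gray level g.
    Levels <= t form the lower class, levels > t the upper class
    (an illustrative convention).
    """
    total = sum(hist)
    sum_all = sum(g * h for g, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # pixel count of the lower class
    sum0 = 0  # sum of gray levels in the lower class
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so there is no separation to score
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Hypothetical bimodal histogram: 100 dark pixels at level 50, 100 bright at 200.
hist = [0] * 256
hist[50] = 100
hist[200] = 100
t = otsu_threshold(hist)
print(t)
```

For this idealized histogram every threshold from 50 up to 199 separates the two peaks equally well, and the sketch returns the first such level it encounters.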
One particularly promising non-interactive approach is to employ fuzzy reasoning to determine the optimal threshold for binarization. Fuzzy reasoning is a logical reasoning technique that attempts to mimic more accurately how the human brain reasons. Under the fuzzy reasoning approach, a logic problem becomes more than deciding whether to assign a binary one or zero to a particular bit, pixel or parameter. Fuzzy reasoning goes one step further and recognizes that there is information contained in the degree to which a given value possesses a particular characteristic. For example, there is much less certainty that a particular pixel is in the background or foreground of the image if the pixel's intensity is very near a selected intensity threshold than if it were far below or above the threshold. In a fuzzy reasoning approach, a multiple-pixel digital image is defined as an array of fuzzy singletons, each having a membership value somewhere between 0.0 and 1.0 that denotes the degree to which it possesses some property (e.g., brightness, darkness, edginess, blurredness, texture, etc.). For image binarization, the membership function is defined in terms of the degree to which a pixel having a particular gray level value in the image belongs to one of the two binary classes, background and foreground.
Once the membership function is formed, the function can be employed to determine the optimum threshold value that defines the boundary between background and foreground gray levels. This is accomplished by identifying the threshold value for which the membership function yields the minimum fuzzy entropy for the image. The concept of entropy is generally defined in information theory as a measure of information. In the context of fuzzy reasoning, entropy is a measure of the degree of fuzziness. Thus, in the image binarization application, the goal is to select a threshold value that results in the minimum fuzziness or uncertainty.
An example of the use of fuzzy reasoning in image binarization is the method disclosed by Huang and Wang in Image Thresholding by Minimizing the Measures of Fuzziness, Pattern Recognition, Vol. 28, No. 1, pp. 41-51 (1995) (hereinafter referred to as the Huang-Wang method). In the Huang-Wang method, a triangular membership function for the foreground and background classes is employed in which the graph of the function appears as two adjacent triangles that join at a selected threshold value. The peak values of the triangles occur at the average pixel intensity level for each class, where the membership value is 1.0. To identify the optimal threshold, an iterative trial-and-error technique is employed in which candidate thresholds are evaluated in turn and the fuzzy entropy of the resulting membership function is calculated for each. Shannon's entropy function, which is a logarithmic function with roughly the shape of an inverted parabola, is used as an entropy factor or cost function to calculate the entropy measure for a selected threshold. The threshold value that results in the minimum fuzzy entropy is then selected as the optimal threshold for binarization of the image.
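The search just described can be sketched as follows. This is an illustration in the spirit of the description above and not the published formulation: the linear membership form `1 - |g - m| / (2 * span)`, which peaks at 1.0 at the class mean `m` and never falls below 0.5, is an assumed stand-in for the triangular function, and the names `shannon` and `min_fuzziness_threshold` are hypothetical.

```python
import math


def shannon(mu):
    """Shannon's function S(mu) = -mu*ln(mu) - (1-mu)*ln(1-mu); zero at mu = 0 or 1."""
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -mu * math.log(mu) - (1.0 - mu) * math.log(1.0 - mu)


def min_fuzziness_threshold(hist):
    """Try every candidate threshold and return the one with minimum fuzzy entropy.

    `hist[g]` is the number of pixels with gray level g. Levels <= t are treated
    as one class and levels > t as the other (an illustrative convention).
    """
    levels = [g for g, h in enumerate(hist) if h > 0]
    g_min, g_max = levels[0], levels[-1]
    span = g_max - g_min
    if span == 0:
        return g_min  # single gray level: nothing to separate
    total = sum(hist)

    best_t, best_entropy = g_min, float("inf")
    for t in range(g_min, g_max):  # this range keeps both classes non-empty
        n0 = sum(hist[g_min:t + 1])
        s0 = sum(g * hist[g] for g in range(g_min, t + 1))
        n1 = total - n0
        s1 = sum(g * hist[g] for g in range(t + 1, g_max + 1))
        m0, m1 = s0 / n0, s1 / n1  # class means, where membership peaks at 1.0

        entropy = 0.0
        for g in levels:
            m = m0 if g <= t else m1
            mu = 1.0 - abs(g - m) / (2.0 * span)  # assumed form, stays in [0.5, 1.0]
            entropy += shannon(mu) * hist[g]
        entropy /= total * math.log(2)  # normalize to the range 0..1

        if entropy < best_entropy:
            best_entropy, best_t = entropy, t
    return best_t


# Hypothetical bimodal histogram: 100 dark pixels at level 50, 100 bright at 200.
hist = [0] * 256
hist[50] = 100
hist[200] = 100
t = min_fuzziness_threshold(hist)
print(t)
```

The sketch recomputes the class sums for every candidate threshold for clarity; running sums would avoid that repeated work. Each candidate still requires logarithm evaluations across the occupied gray levels.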
Although the Huang-Wang method is fairly accurate and selects image thresholds that in general preserve more image information than more conventional techniques, this increased conversion accuracy comes at the expense of substantially more computational power and execution time. For example, in tests comparing the Huang-Wang method to the Otsu method, the Huang-Wang method typically required approximately three times the execution time of the Otsu method. The extended execution time is primarily due to the logarithmic nature of Shannon's entropy function, which complicates the necessary calculations. In addition, use of Shannon's function restricts the values of the membership function to a range of 0.5 to 1.0, which limits accuracy. The limited range is necessary because the parabola-like shape of Shannon's function has increasing values between membership values of 0.0 and 0.5, and decreasing values between membership values of 0.5 and 1.0. However, because the cost or entropy should decrease as the membership function value increases (that is, as the fuzziness becomes smaller), membership values below 0.5 cannot be employed when Shannon's function is selected as the entropy measure function. As a result of the foregoing, there is a need for a fuzzy reasoning based binarization technique that can operate effectively with reduced execution times.
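The range restriction noted above follows directly from the form of Shannon's function. Written with the symbol \mu for the membership value (a notation assumed here for illustration), the function and its derivative are:

```latex
S(\mu) = -\mu \ln \mu - (1 - \mu)\ln(1 - \mu), \qquad 0 < \mu < 1,
\qquad
\frac{dS}{d\mu} = \ln\!\left(\frac{1 - \mu}{\mu}\right).
```

The derivative is positive for \mu < 0.5, zero at \mu = 0.5, and negative for \mu > 0.5, and S(\mu) = S(1 - \mu). The function therefore decreases with increasing membership only on the interval from 0.5 to 1.0, and each evaluation requires two logarithms.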