The present invention generally relates to binarization methods, and more particularly to a binarization method which is suited for use in pattern recognition apparatuses such as a character recognition apparatus.
Generally, image data processed by a pattern recognition apparatus such as a character recognition apparatus is obtained, for example, by binarizing an output of a charge coupled device (CCD) of a scanner using a threshold value. In order to carry out an optimum binarization even with respect to a document image having a poor printing quality, an optimum threshold value for the binarization must be generated for each of the different tones of the document image.
Various binarization methods have been proposed. For example, the mode method, the differential histogram method and the p-Tile method are explained in H. Tamura, "Introduction to Computer Image Processing", Soken Shuppan (publisher), 1985, pp. 66-68. The mode method obtains a histogram of tones of the given image, and when the histogram has a distribution with two peaks, the threshold value is set to the valley between the two peaks. The differential histogram method determines the threshold value by using a differentiated value of the tone of the image (that is, the rate of change of the tone) instead of directly using the tone of the image, because a boundary between an object and a background in the image can be regarded as a portion where the tone suddenly changes. The p-Tile method sets the threshold value with reference to the total area of the image, so that the object occupies a given proportion p of that area.
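The mode method and the p-Tile method may be sketched as follows. This is only an illustrative sketch and is not taken from the cited reference. It assumes 256 tone levels and dark objects on a light background, and all function names are hypothetical. The way the second peak is located (weighting each count by the squared distance from the first peak) is one common heuristic chosen for the illustration.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each tone value 0..levels-1."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


def mode_threshold(hist):
    """Mode method: return the tone at the valley between two peaks.

    Assumption: the second peak is found by weighting each count by the
    squared distance from the first peak, so that a nearby shoulder of
    the first peak is not mistaken for the second peak.
    """
    tones = range(len(hist))
    first = max(tones, key=lambda t: hist[t])
    second = max(tones, key=lambda t: hist[t] * (t - first) ** 2)
    lo, hi = sorted((first, second))
    # The valley is the least populated tone between the two peaks.
    return min(range(lo, hi + 1), key=lambda t: hist[t])


def p_tile_threshold(hist, p):
    """p-Tile method: return the smallest tone t such that at least a
    fraction p of all pixels have a tone less than or equal to t."""
    target = p * sum(hist)
    running = 0
    for tone, count in enumerate(hist):
        running += count
        if running >= target:
            return tone
    return len(hist) - 1


# Hypothetical data: a dark cluster near tone 10, a light cluster near 200.
pixels = [10] * 50 + [11] * 20 + [100] * 2 + [200] * 40 + [201] * 10
hist = histogram(pixels)
valley = mode_threshold(hist)
tile = p_tile_threshold(hist, 0.5)
```

When the histogram has no clear valley, the tone returned by mode_threshold is essentially arbitrary, and p_tile_threshold is only as good as the assumed proportion p.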
On the other hand, N. Ohtsu, "Method of Determining Threshold Value from Tone Distribution", Article No. 145, National Conference of Information Group of the Electronic Communication Society, 1977 proposes a method of determining the threshold value from a tone distribution. This method only uses the zero order and first order moments of the tone distribution and determines the optimum threshold value based on an integration.
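A threshold selection of this kind may be sketched as follows, under the assumption that the criterion is the commonly described one of maximizing the between-class variance of the two classes separated by the threshold. The sketch uses only the cumulative zero order moment (pixel count) and first order moment (sum of tones) of the histogram. The function name is hypothetical, and the code is an illustration, not a reproduction of the cited article.

```python
def otsu_threshold(hist):
    """Return the tone t that maximizes the between-class variance when
    pixels with tone <= t form one class and the rest form the other."""
    total = sum(hist)
    sum_total = sum(t * c for t, c in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0  # zero order cumulative moment: pixel count at or below t
    s0 = 0  # first order cumulative moment: sum of tones at or below t
    for t, c in enumerate(hist):
        w0 += c
        s0 += t * c
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        # Proportional to the between-class variance; the constant factor
        # 1 / total**2 is dropped because it does not affect the maximum.
        score = (sum_total * w0 - s0 * total) ** 2 / (w0 * w1)
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical histogram with two equal clusters at tones 10 and 200.
hist = [0] * 256
hist[10] = 50
hist[200] = 50
threshold = otsu_threshold(hist)
```

Integer accumulation is used for the moments so that rounding errors do not build up across the 256 tone levels.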
Furthermore, an optimum binarization method is proposed in a Japanese Published Patent Application No. 60-37952. According to this system, a multi-level video signal is stored in a video buffer, and a video signal which is read out from the video buffer is binarized by a slicing circuit which has a variable slicing level. The multi-level video signal is sliced at different slicing levels and is converted into binarized (bi-level) video signals, and a line width amplification is obtained for each of the bi-level video signals. The line width amplification is a ratio defined as (number of black picture elements)/(number of surrounding picture elements), where the number of black picture elements is the number of black picture elements making up the character and the number of surrounding picture elements is the number of white picture elements surrounding the character. The slicing level of the slicing circuit is set based on the obtained line width amplifications and a reference line width amplification.
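The ratio defined above may be illustrated by the following sketch. Two points are assumptions made only for the illustration and are not stated in the cited application: a white picture element is counted as "surrounding" when it is 4-adjacent to a black picture element, and the slicing level is chosen as the candidate whose ratio is closest to the reference. All names are hypothetical.

```python
def line_width_amplification(gray, level):
    """Return (black picture elements) / (surrounding white picture
    elements) for the image sliced at `level`, or None if no white
    picture element touches a black one.

    gray is a list of rows of tones; a tone below `level` is black.
    """
    rows, cols = len(gray), len(gray[0])
    black = [[gray[r][c] < level for c in range(cols)] for r in range(rows)]
    n_black = sum(sum(row) for row in black)
    n_surround = 0
    for r in range(rows):
        for c in range(cols):
            if black[r][c]:
                continue
            # Assumption: "surrounding" means 4-adjacent to a black element.
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if any(0 <= i < rows and 0 <= j < cols and black[i][j]
                   for i, j in neighbours):
                n_surround += 1
    return n_black / n_surround if n_surround else None


def choose_slicing_level(gray, levels, reference):
    """Assumed selection rule: pick the candidate level whose ratio is
    closest to the reference ratio. Returns None if no ratio is defined."""
    scored = [(lvl, line_width_amplification(gray, lvl)) for lvl in levels]
    valid = [(lvl, ratio) for lvl, ratio in scored if ratio is not None]
    if not valid:
        return None
    return min(valid, key=lambda item: abs(item[1] - reference))[0]


# Hypothetical 3 x 5 image: a dark vertical stroke with a grey fringe.
gray = [[250, 120, 20, 120, 250] for _ in range(3)]
narrow = line_width_amplification(gray, 50)   # only the dark core is black
wide = line_width_amplification(gray, 150)    # the grey fringe is black too
chosen = choose_slicing_level(gray, [50, 150, 255], reference=0.6)
```

In this example, raising the slicing level thickens the stroke and increases the ratio, which is the behaviour that a comparison against a reference ratio relies on.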
However, the mode method cannot be applied to a document image having a poor printing quality, because no clear valley exists in the histogram of such an image. In addition, the differential histogram method is ineffective when the tone undergoes a complex change in a vicinity of the boundary between the object and the background of the image. Furthermore, the p-Tile method may fail to obtain an optimum threshold value depending on the number of characters in the image, the size of each character, the complexity of the characters and the like, since the p-Tile method uses the total area of the image as the reference.
The method of determining the threshold value from the tone distribution is not effective with respect to a smeared or thinned "line" in an image which is processed during pattern recognition such as character recognition.
In addition, it was found from experiments that the optimum binarization method proposed in the Japanese Published Patent Application No. 60-37952 cannot stably obtain the optimum threshold value depending on the tone of the document image.
In actual document images, the tone in many cases changes from one part of the document image to another. For example, this change in the tone occurs when the printing quality of the document image is poor and when shading is generated in the document image due to characteristics of an input device. According to the conventional methods, it is extremely difficult to generate an optimum bi-level image which can satisfactorily cope with a change in tone which occurs locally in the image.