1. Field of the Invention
The present invention generally relates to feature-extraction methods and devices for character recognition or document-image analysis of gray-scale document images, and also relates to feature-extraction schemes designed for character recognition or document-image analysis of images having poor quality or low resolution. The present invention particularly relates to feature-extraction methods and devices which extract boundary features directly from gray-scale data without binarizing gray-scale images obtained from scanners or the like, and generate binary images based on the extracted boundaries.
2. Description of the Related Art
Prior-art techniques for character recognition or document-image analysis typically extract various features from binary images after binarizing gray-scale images obtained from image scanners by using a proper threshold value.
Gray-scale images obtained from image reading devices such as scanners sustain various distortions during the image reading process because of the optical characteristics of the image reading devices, as characterized by their point-spread functions. This is pointed out by Wang and Pavlidis (L. Wang and T. Pavlidis, "Direct Gray-Scale Extraction of Features for Character Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.15, no.10, pp.1053-1067, October 1993).
Because of these distortions, binary images obtained from gray-scale images in a straightforward manner end up losing some important information, which is a main reason for errors in recognition processes. Wang and Pavlidis maintain that importance should be placed on analysis of curved-surface structures by treating a gray-scale image as a curved surface when extracting features for the purpose of recognition.
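The loss of information described above can be illustrated with a minimal sketch. It is not taken from the cited publication. It assumes a one-dimensional scan line in which ink is 1 and paper is 0, a five-tap binomial kernel standing in for the point-spread function, and a fixed threshold of 0.5. The function names blur, binarize, and count_runs are hypothetical and serve only this illustration.

```python
def blur(signal, kernel):
    """Convolve a 1-D signal with a symmetric kernel.

    Samples outside the signal are treated as paper (0).
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def binarize(signal, threshold):
    """Mark a sample as ink (1) when it reaches the threshold."""
    return [1 if value >= threshold else 0 for value in signal]


def count_runs(bits):
    """Count maximal runs of consecutive ink samples (strokes)."""
    runs = 0
    previous = 0
    for bit in bits:
        if bit == 1 and previous == 0:
            runs += 1
        previous = bit
    return runs


# Ideal bilevel scan line: two strokes separated by a one-sample gap.
ideal = [0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0]

# Assumed point-spread function (binomial approximation of a Gaussian).
psf = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]

scanned = blur(ideal, psf)        # gray-scale data as read by the device
binary = binarize(scanned, 0.5)   # straightforward binarization

print(count_runs(ideal), count_runs(binary))
```

In this sketch the gap between the two strokes is filled by the blur to a gray-scale value above the threshold, so the binarized scan line contains a single merged stroke, whereas the gray-scale data still shows a dip at the gap position.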
The above publication discloses a method of extracting various "geographical" features such as ridges, ravines, and saddle points from a curved surface of a gray-scale image by employing differential-geometry operations. Ridges are lines along which gray-scale values are at a local maximum, and ravines are lines along which gray-scale values are at a local minimum. A saddle point is a position that is a local maximum along a line in one direction but a local minimum along the perpendicular direction.
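The distinction between these features can be sketched by examining the second derivatives of the gray-scale surface at a pixel. The following is a simplified illustration only and is not the procedure of Wang and Pavlidis. It assumes central finite differences for the derivatives, classifies a pixel solely by the signs of the two principal curvatures (the eigenvalues of the Hessian), and ignores the gradient conditions that a complete definition would require. The names hessian_eigenvalues and classify and the tolerance eps are hypothetical.

```python
import math


def hessian_eigenvalues(img, x, y):
    """Return (larger, smaller) principal curvature at an interior pixel.

    img is a list of rows; derivatives use central finite differences.
    """
    fxx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    fyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    fxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    mean = (fxx + fyy) / 2.0
    spread = math.hypot((fxx - fyy) / 2.0, fxy)
    return mean + spread, mean - spread


def classify(img, x, y, eps=1e-6):
    """Label a pixel by the signs of its principal curvatures."""
    high, low = hessian_eigenvalues(img, x, y)
    if high > eps and low < -eps:
        return "saddle"   # maximum in one direction, minimum in the other
    if low < -eps and abs(high) <= eps:
        return "ridge"    # maximum across the line, flat along it
    if high > eps and abs(low) <= eps:
        return "ravine"   # minimum across the line, flat along it
    if high < -eps:
        return "peak"     # both curvatures negative
    if low > eps:
        return "pit"      # both curvatures positive
    return "flat"


# Synthetic 5x5 surfaces centered on pixel (2, 2).
ridge = [[-(y - 2) ** 2 for x in range(5)] for y in range(5)]
ravine = [[(y - 2) ** 2 for x in range(5)] for y in range(5)]
saddle = [[(x - 2) ** 2 - (y - 2) ** 2 for x in range(5)] for y in range(5)]

for name, surface in (("ridge", ridge), ("ravine", ravine), ("saddle", saddle)):
    print(name, classify(surface, 2, 2))
```

Even in this reduced form, the result depends on the choice of the tolerance eps and of the finite-difference scheme, both of which would have to be tuned for real scanned images.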
The above method assumes that skeleton lines of character strokes correspond to ridges of the curved surface. Based on this assumption, this method extracts features of skeleton lines of strokes directly from gray-scale images, and the extracted features are supposed to be comparable to those obtained by thinning lines in binary images.
This method, however, has several disadvantages. There are vast resources of effective algorithms for character recognition which employ structural and statistical features of character boundaries (contours), as well as algorithms which use stroke skeletons. Unfortunately, the method of Wang and Pavlidis is not suitable for use in boundary-based recognition algorithms. Namely, this method cannot draw on valuable resources of technology which have accumulated over the years with regard to boundary-based recognition algorithms, and, thus, cannot be used as a general-purpose algorithm.
When lines are sufficiently thin, it is easy to find ridges. When lines are thick, however, the task of finding accurate positions of ridges is no longer easy. Further, differential operations are needed for detecting ridges, ravines, and saddle points, but some difficulties are expected when the differential operations are actually applied to digital images. In particular, a large number of parameters need to be determined in practical applications as in the case of solving differential equations in numerical analysis. Adjustment of these parameters is known to be a rather daunting task.
Many general-purpose algorithms are known with regard to edge detection and boundary (contour) extraction. It should be noted, however, that document images are supposed to be binary in their original form but obtained as gray-scale data because of the intervening optical reading process, and that boundaries of document elements such as characters and other symbols are always found to form a closed loop. Further, accuracy of detected edge positions is not needed as much as in analysis of architecture images or aerial photographs. Rather, boundaries need to be extracted such that features valuable for recognition are retained.
Moreover, Pavlidis, Chen, and Joseph disclose that a trade-off exists between the number of gray-scale levels of each pixel and the resolution of an image (T. Pavlidis, M. Chen, and E. Joseph, "Sampling and Quantization of Bilevel Signals," Pattern Recognition Letters, vol.14, no.7, pp.559-562, July 1993). In particular, document images do not require a large number of gray-scale levels to be properly represented when resolution is high, whereas a larger number of gray-scale levels is needed as image resolution is lowered.
Many binarization methods determine a threshold value based on statistical characteristics of a gray-scale-value distribution. If an image has a low resolution, the available data points are limited in number, and the statistical estimation becomes less accurate. In consideration of this, a method of extracting features is required such that the extracted features prove to be valuable for recognition purposes even when images are provided at a low resolution.
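As one well-known example of such a statistical method, Otsu's method selects the threshold that maximizes the between-class variance of the gray-scale histogram. The text above does not name a specific method, so the choice of Otsu's method here is an assumption made for illustration. The sketch further assumes integer gray-scale values in the range 0 to 255, and the function name otsu_threshold is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form one class and pixels with value > t
    form the other. pixels is a flat list of integers in [0, levels).
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t = 0
    best_between = -1.0
    weight0 = 0   # number of pixels with value <= t
    sum0 = 0      # sum of gray-scale values of those pixels
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue          # lower class still empty
        weight1 = total - weight0
        if weight1 == 0:
            break             # upper class empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_between = between
            best_t = t
    return best_t


# Synthetic bimodal data: 50 dark pixels and 50 bright pixels.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
bright = [1 if value > t else 0 for value in pixels]
print(t, sum(bright))
```

Because the threshold is estimated from the histogram, the estimate rests on the number of pixels that contribute to it, which is the quantity that shrinks when the resolution is lowered.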
Accordingly, there is a need for a feature-extraction method and a feature-extraction device which extract features from gray-scale document images, and extract boundaries of document elements such as characters and symbols based on the extracted features, thereby helping to utilize the vast resources of boundary-based recognition algorithms.
Further, there is a need for a feature-extraction method and a feature-extraction device which extract boundaries of document elements without use of differential operations that would bring about unstable results in digital images.
Moreover, there is a need for a feature-extraction method and a feature-extraction device which reliably extract boundaries of document elements even when gray-scale images have a low resolution.