The present invention relates to character recognizing systems, character recognizing methods and recording media, in which control programs for the same are recorded, and more particularly to optical character recognizing systems for reading characters written on paper or the like with an optical sensor.
In prior art character recognizing systems of this type, some preprocessing is executed on the inputted image for the purpose of correcting variations in the size, skew, etc. of the inputted image.
Well-known examples of such preprocessing include character size normalization and skew correction. Among these preprocessing operations, reference line detection and correction are applied in particular to character strings of English words or similar alphabetic characters.
FIG. 5 is a view illustrating the definition of reference lines of a character row. The reference lines are of two different kinds, i.e., an upper and a lower reference line. The dashed and broken lines shown in superposition on the word “good” in FIG. 5 are the lower and upper reference lines, respectively. The position of the upper reference line is determined such that the upper end of lowercase characters without ascender or descender (such as a, c, e, m, n, o, u, v, w, x and z) is found on or in the vicinity of the line. The position of the lower reference line is determined such that the lower end of lowercase characters without ascender or descender is found on or in the vicinity of the line.
Considering rectangular areas inscribing and circumscribing a character row, the area under the lower reference line is referred to as descender area. The area over the upper reference line is referred to as ascender area. And the area intervening between the upper and lower reference lines is referred to as body area.
One purpose of reference line detection and correction is as follows. Usually, the area ratios of the body, descender and ascender areas of a hand-written character row are not fixed. The descender and ascender sizes depend on the writer. In other words, the area ratios of the body, descender and ascender areas vary from writer to writer. Therefore, with size normalization of the entire character string image alone, the body area size variations remain, so that it is difficult to read a character row with high accuracy in the succeeding character recognizing process stage.
By detecting the reference lines and correcting the image to obtain constant area ratios (or height ratios) of the body, descender and ascender areas (for instance 1:1:1), normalized body, descender and ascender areas are obtainable, so that it is possible to expect an accurate character recognizing process in the succeeding stage.
The reference line detection and correction have the following second purpose. Usually, characters in a character row are rarely written in an accurate horizontal direction. In many cases, as a hand-written character row proceeds rightward, the character position is deviated vertically, and also the character size is increased and decreased. Consequently, the upper and lower reference lines fail to be horizontal and parallel. (FIG. 6 shows such an example.)
By detecting the reference lines from the inputted character row image and correcting the image so that the reference lines become horizontal, variations of the character row skew and character size in the row can be absorbed, so that it is possible to expect accurate character recognition in the succeeding stage.
For the above purposes, the prior art character recognizing system has a character row reference line detecting and correcting means. The prior art described above is disclosed in Bozinovic et al, “Off-Line Cursive Script Word Recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 1, pp. 68-83, 1989.
In a reference line detecting process disclosed in this literature, a histogram of a horizontally written character row image is obtained by projecting the image horizontally and counting black pixels in each pixel row. Then, the differences between the black pixel numbers in adjacent pixel rows are calculated, positions corresponding to the maximum and minimum differences are then selected, and horizontal straight lines containing the selected positions are made to be the reference lines. This method utilizes the fact that many black pixels are present in the body part of the image.
Another method is disclosed in Caesar et al, “Estimating the Baseline for Written Material”, Proceeding of Third International Conference on Document Analysis and Recognition, 1995.
In a reference line detection process disclosed in this literature, the contour lines of a horizontally hand-written character row image are vertically divided into two parts, and the locally maximal points of the upper contour parts and the locally minimal points of the lower contour parts are all extracted. Then, by adopting the least square method, straight lines are applied as the upper and lower reference lines to the maximal points of the upper contour parts and the minimal points of the lower contour parts. This method utilizes the fact that the majority of the contour lines are located in the neighborhood of the reference lines.
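The contour-based fitting described above can be illustrated with a minimal sketch. It is not the implementation of the cited literature: the function names `local_peaks` and `fit_line`, the representation of the upper contour as one y-coordinate per pixel column, and the strict-inequality test for a local extremum are all assumptions made for illustration. Because the y-coordinate grows downward, a locally maximal point of the upper contour has a locally smallest y value.

```python
def local_peaks(contour):
    """Return (x, y) for contour points strictly higher than both neighbours.

    `contour` holds one y-coordinate per pixel column (hypothetical format).
    y grows downward, so "higher" means a smaller y value.
    """
    peaks = []
    for x in range(1, len(contour) - 1):
        if contour[x] < contour[x - 1] and contour[x] < contour[x + 1]:
            peaks.append((x, contour[x]))
    return peaks


def fit_line(points):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two points with distinct x")
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


# Upper contour of a toy character row: two peaks, the second one higher.
contour = [5, 3, 5, 4, 2, 4, 5]
peaks = local_peaks(contour)   # [(1, 3), (4, 2)]
slope, intercept = fit_line(peaks)
```

The lower reference line would be fitted in the same way to the locally minimal points of the lower contour. With only a handful of extremal points per row, as in this toy contour, the fit rests on very little data.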
As described above, the prior art techniques mostly utilize such geometric data as image projection and contour directions to detect reference lines by outputting straight lines, which are best applicable as the upper and lower reference lines.
In the above prior art character recognizing method, only positions (y-coordinates) of the upper and lower reference lines, or only positions (y-coordinates) and skews of the reference lines, are estimated under the assumption that the reference lines are straight lines which are horizontal or have a given skew.
FIG. 7 is a block diagram showing the functional constitution of a prior art example of character recognizing system. This example of character recognizing system comprises an image recording means 1, a preprocessing means 2, a character row reading means 3, a reference line detecting means 6 and a reference line correcting means 5. The reference line detecting means 6 includes a reference line position estimating means 41 and a reference line skew estimating means 42.
The image recording means 1 stores an inputted character row image. The preprocessing means 2 executes a preprocessing, such as size normalization or character skew correction of the character row image reproduced from the image recording means 1. The character row reading means 3 reads out character rows in the image preprocessed in the preprocessing means 2 by adequately executing character segmentation, character recognition, language processing, etc.
The reference line detecting means 6 receives the character row image as the subject of preprocessing from the preprocessing means 2, and estimates the positions and skews of reference lines of the character rows. In the reference line detecting means 6, the reference line position estimating means 41 estimates reference line positions (y-coordinates), and the reference line skew estimating means 42 estimates reference line skews.
The reference line correcting means 5 receives the estimated values of the reference line positions and skews from the reference line detecting means 6, and shapes the character row image by affine transformation to obtain horizontal reference lines and predetermined area ratios of the body, descender and ascender areas.
The reference line detecting means 6 projects the character row image horizontally, and produces a histogram by calculating the black pixel number for each pixel row. Specifically, the means 6 obtains the total black pixel number h(j) (j=1, . . . , N) of the pixel row corresponding to y-coordinate j (i.e., the vertical coordinate, which is positive in the downward direction of the image), where N is the height of the character row image.
The reference line position estimating means 41 calculates the histogram difference (h(j) − h(j−1)), (j=1, . . . , N−1) between adjacent coordinates, and stores the value of j corresponding to the maximum difference as the position of the upper reference line. The means 41 also stores the value of j corresponding to the minimum difference as the position of the lower reference line.
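The histogram-difference rule just described can be sketched as follows. This is an illustrative sketch under stated assumptions, not the prior art implementation: the image is assumed to be a list of pixel rows with 1 for black and 0 for white, the function names are hypothetical, and row indices are 0-based rather than the 1-based j used in the text.

```python
def row_histogram(image):
    """Count black pixels (value 1) in each pixel row: h(j)."""
    return [sum(row) for row in image]


def estimate_reference_positions(image):
    """Return (upper, lower) row indices from adjacent-row differences."""
    h = row_histogram(image)
    # diffs[k] holds h(j) - h(j-1) for j = k + 1.
    diffs = [h[j] - h[j - 1] for j in range(1, len(h))]
    upper = diffs.index(max(diffs)) + 1  # sharpest increase in black pixels
    lower = diffs.index(min(diffs)) + 1  # sharpest decrease in black pixels
    return upper, lower


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # ascender stroke
    [0, 1, 1, 1, 1, 0],  # body
    [0, 1, 1, 1, 1, 0],  # body
    [0, 0, 0, 1, 0, 0],  # descender stroke
    [0, 0, 0, 0, 0, 0],
]
print(estimate_reference_positions(image))  # (2, 4)
```

In this toy image the upper position is the first body row and the lower position is the first row below the body, because those are the rows where the black pixel count jumps up and down most sharply.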
The reference line skew estimating means 42 projects the character row image not only horizontally but also in some other directions such as rightwardly upward and rightwardly downward directions, and produces a histogram for each direction. Denoting the projection histogram deviating by θ in direction from the horizontal direction by h(j, θ), the means 42 stores the value of θ corresponding to the maximum value of the histogram difference h(j, θ) − h(j−1, θ) as the skew of the upper reference line. The means 42 also stores the value of θ corresponding to the minimum value of the histogram difference h(j, θ) − h(j−1, θ) as the skew of the lower reference line.
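The skew search over several projection directions can be sketched as below. The details are assumptions for illustration only: θ is taken as the angle of a line y = j + x·tan θ in image coordinates (y positive downward, so a positive θ descends to the right), each black pixel is assigned to the nearest such line, and the candidate angles are supplied by the caller. The function names are hypothetical, and the image is assumed to contain at least one black pixel.

```python
import math


def sheared_histogram(image, theta):
    """Count black pixels along lines y = j + x * tan(theta): h(j, theta)."""
    slope = math.tan(theta)
    counts = {}
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                j = math.floor(y - x * slope + 0.5)  # nearest line offset
                counts[j] = counts.get(j, 0) + 1
    lo, hi = min(counts), max(counts)
    # Pad one empty bin on each side so edge transitions are visible.
    return [counts.get(j, 0) for j in range(lo - 1, hi + 2)]


def estimate_skews(image, thetas):
    """Return (upper_skew, lower_skew) chosen among the candidate angles."""
    best_upper = best_lower = None
    for theta in thetas:
        h = sheared_histogram(image, theta)
        diffs = [h[k] - h[k - 1] for k in range(1, len(h))]
        if best_upper is None or max(diffs) > best_upper[0]:
            best_upper = (max(diffs), theta)
        if best_lower is None or min(diffs) < best_lower[0]:
            best_lower = (min(diffs), theta)
    return best_upper[1], best_lower[1]


# Toy image: a two-pixel-thick stroke descending to the right at 45 degrees.
image = [[0] * 5 for _ in range(6)]
for x in range(5):
    image[x][x] = 1
    image[x + 1][x] = 1

thetas = [-math.pi / 4, 0.0, math.pi / 4]
upper_skew, lower_skew = estimate_skews(image, thetas)
```

For this toy image both skews come out as the 45-degree candidate, since only that projection direction concentrates the stroke into two full histogram bins.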
The reference line correcting means 5 receives the results of the reference line detection, i.e., the upper and lower reference line position and skew data, and shapes the image to obtain horizontal upper and lower reference lines and predetermined area ratios (for instance 1:1:1) of the body, descender and ascender areas. The means 5 feeds the shaped image data back to the preprocessing means 2.
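The area-ratio part of this correction can be illustrated with a simplified sketch. It assumes that the reference lines are already horizontal, and it uses nearest-neighbour resampling of pixel rows rather than the affine transformation mentioned in the text. The function name `normalize_areas`, the parameter `band` (the output height of each area) and the 0-based row indices are hypothetical.

```python
def normalize_areas(image, upper, lower, band):
    """Resample rows so that each of the three areas spans `band` rows.

    upper: index of the first body row (rows above it form the ascender area).
    lower: index of the first row below the body (descender area).
    """
    height = len(image)
    width = len(image[0])
    out = []
    for start, end in [(0, upper), (upper, lower), (lower, height)]:
        size = end - start
        for k in range(band):
            if size == 0:
                out.append([0] * width)  # an empty area stays blank
            else:
                # Nearest-neighbour mapping from output row k to a source row.
                src = start + min(size - 1, k * size // band)
                out.append(list(image[src]))
    return out


image = [
    [1, 0, 0],  # ascender area (1 row)
    [1, 1, 0],  # body (3 rows)
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 1],  # descender area (2 rows)
    [0, 1, 0],
]
shaped = normalize_areas(image, upper=1, lower=4, band=2)
```

After the call, the ascender, body and descender areas each occupy two rows of `shaped`, which corresponds to the 1:1:1 ratio given as an example in the text.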
The preprocessing means 2, receiving the reference line corrected image, executes other preprocessing if necessary, and feeds the resultant shaped image data to the character row reading means 3. The character row reading means 3 reads out character rows from the received character row image by executing character segmentation, feature extraction, character recognition, language processing, etc., and outputs the result.
However, the positions and sizes of characters constituting hand-written character rows are not always regular. In other words, characters are not always written with their positions set on a straight line from the first to the last one of them, and some characters may be written in curved rows irrespective of the writer's will. Also, characters in a character row may fluctuate in size. Consequently, it is often inadequate to estimate only the position and skews of the reference lines of a character row by assuming that the reference lines are straight.
FIG. 6 shows an example of a character row image to which it is difficult to apply straight lines as the reference lines. In the illustrated character row image (“Wilmington”), slightly rightwardly upward reference lines are seemingly applicable to the first half part of the character row. However, for the second half part of the row it is adequate to apply substantially horizontal reference lines. When such an image is inputted, it is difficult with the prior art reference line detection and correction to obtain adequate image shaping. Such non-linear character position and size variation components in a character row will be hereinafter referred to as “undulations”.
When adequate shaping is not obtained as a result of the reference line detection and correction in the preprocessing stage, variations of the reference lines remain without being absorbed in the succeeding stage character reading means. In such a case, stable image features cannot be extracted by a feature extracting process, which is executed under the assumption that adequate reference line correction has been executed, thus giving rise to erroneous recognition in the character recognition. Therefore, it is necessary in the character row reference line detection and estimation in the preprocessing stage to estimate reference lines accurately and at a practical processing rate by applying curved lines to the reference lines instead of limiting the reference lines to straight lines.
Hitherto, a reference line detecting means in which curved lines are applied has not been realized. In the second prior art example described before, the reference lines are determined by extracting locally maximal and locally minimal points of contours and applying straight lines to these points using the least square method. In terms of the formulation, it is possible to use curved lines instead of straight lines for calculating curved reference lines in the least square method. However, the locally maximal and locally minimal points of contours are unstable in position and not present in a large number. Therefore, it is practically impossible to utilize this method for accurately applying curved lines, although the method may be utilized for applying straight lines.
An object of the present invention, accordingly, is to solve the above problems by the provision of a character recognizing system, which is robust and operable at a practical processing rate with respect to the recognition of character row images with non-linear undulations in the position and size of character rows, as well as a character recognizing method for the same and a recording medium, in which a control program for the same is recorded.
According to an aspect of the present invention, there is provided a character recognizing system comprising image storing means for storing an inputted character row image, a preprocessing means for shaping the character row image stored in the image storing means by correcting deformations of the character row image in size, skew, etc., a reference line detecting means for detecting two, i.e., upper and lower, reference lines characterizing the positions of the characters in a character row in the character row image received from the preprocessing means, a reference line correcting means for shaping the character row image such as to correct the upper and lower reference lines on the basis of the reference line detection result from the reference line detecting means and the character row image, and outputting the shaped character row image back to the preprocessing means, and a character row reading means for reading a character row from the character row image having been shaped in the reference line correcting means and received from the preprocessing means, the reference line detecting means including a reference line position estimating means for obtaining the positions of the reference lines, a reference line skew estimating means for obtaining the skews of the reference lines from the horizontal direction, and a reference line curvature radius estimating means for obtaining the radii of curvature of the reference lines.
According to another aspect of the present invention, there is provided a character recognizing method comprising image storing step for storing an inputted character row image, a preprocessing step for shaping the character row image stored in the image storing step by correcting deformations of the character row image in size, skew, etc., a reference line detecting step for detecting two, i.e., upper and lower, reference lines characterizing the positions of the characters in a character row in the character row image received from the preprocessing step, a reference line correcting step for shaping the character row image such as to correct the upper and lower reference lines on the basis of the reference line detection result from the reference line detecting step and the character row image, and outputting the shaped character row image back to the preprocessing step, and a character row reading step for reading a character row from the character row image having been shaped in the reference line correcting step and received from the preprocessing step, the reference line detecting step including a reference line position estimating step for obtaining the positions of the reference lines, a reference line skew estimating step for obtaining the skews of the reference lines from the horizontal direction, and a reference line curvature radius estimating step for obtaining the radii of curvature of the reference lines.
According to another aspect of the present invention, there is provided a character recognizing method comprising steps of: a first shaping step for shaping an input character row image by correcting deformations of the character row image in size and skew; a detecting step for detecting upper and lower reference lines characterizing positions of the characters in a character row in the character row image; a second shaping step for shaping the character row such as to correct the upper and lower reference lines on the basis of the reference line detection result; a reference line detecting step for obtaining the positions of the reference lines; a reference line skew estimating step for obtaining the skews of the reference lines from the horizontal direction; and a reference line curvature radius estimating step for obtaining the radii of curvature of the reference lines.
According to still another aspect of the present invention, there is provided a recording medium with a character recognition control program recorded therein for causing a character recognizing system to execute character recognition by causing a reference line detecting step to obtain positions of reference lines, obtain skews of the reference lines from the horizontal direction and obtain radii of curvature of the reference lines.
In summary, in the character recognizing system according to the present invention, the reference line detecting means for detecting the reference lines of a character row includes the means for estimating the radii of curvature of the reference lines in addition to the reference line position and skew estimating means.
For estimating the positions, skews and radii of curvature with less processing effort, the character recognizing system according to the present invention uses horizontal white runs of the image. More specifically, the reference lines are determined such that the variance of white run length is small in each of three character row image areas, which are defined by dividing the character row image with two, i.e., upper and lower, reference lines, and that the average run length is as great as possible in the upper and lower areas and as small as possible in the central area.
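The white-run quantities referred to above can be sketched as follows for the simplest case of horizontal reference lines. This only computes the statistics. How they are combined into a criterion and how curved lines are handled are not covered here. The function names, the 0/1 pixel encoding (0 for white), and the choice to count runs that touch the image border are assumptions for illustration.

```python
def white_runs(row):
    """Lengths of maximal horizontal runs of white (0) pixels in one row."""
    runs, length = [], 0
    for pixel in row:
        if pixel == 0:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)
    return runs


def run_stats(image, upper, lower):
    """(mean, variance) of white run length in upper, central, lower areas."""
    stats = []
    for area in (image[:upper], image[upper:lower], image[lower:]):
        runs = [r for row in area for r in white_runs(row)]
        if not runs:
            stats.append((0.0, 0.0))
            continue
        mean = sum(runs) / len(runs)
        variance = sum((r - mean) ** 2 for r in runs) / len(runs)
        stats.append((mean, variance))
    return stats


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # sparse ascender stroke: long white runs
    [1, 0, 1, 0, 1, 0],  # dense body: short white runs
    [0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 0],  # sparse descender stroke
    [0, 0, 0, 0, 0, 0],
]
stats = run_stats(image, upper=2, lower=4)
```

With the reference lines placed at rows 2 and 4, the central area of this toy image has a mean run length of 1 with zero variance, while the upper and lower areas have longer average runs. This is the pattern the criterion in the text favors.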
Thus, it is possible to realize character row read-out, which is robust with respect to character row deformations of hand-written character rows stemming from irregularities of the character positions and sizes in the character rows, particularly robust with respect to character row deformations with non-linear character position and size variations and producing undulations.