The present invention claims priority from Japanese Patent Application No. 11-165358 filed Jun. 11, 1999, the contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a character recognition apparatus for automatically reading a character string from input image data. More specifically, the present invention relates to an apparatus for reading items such as a residence address, name, or product number written on mail or a document, and to a pen-input apparatus for inputting a character string with a stylus pen.
2. Description of Related Art
A conventional character recognition apparatus will first be explained with reference to FIGS. 9 and 10. FIG. 9 is a block diagram of the essential parts of the conventional character recognition apparatus. FIG. 10 is a view for explaining how a character is recognized by the conventional technique. The description here focuses on the conventional technique for a character recognition apparatus aimed at reading a character string constituted of a plurality of characters, i.e., a word or words.
As shown in FIG. 9, consider reading a character string constituted of a plurality of characters, such as a word in a residence address or name, or a product code, by means of a character segmentation part 301 serving as character segmentation means for segmenting character by character and a single-character recognizing part 401 serving as character recognizing means for recognizing character by character. In this case, it is unlikely that all of the character patterns segmented at the character segmentation part 301 are precisely recognized at the single-character recognizing part 401. Accordingly, a “word” dictionary 602 is provided as word data storing means for storing words to be recognized. A word verifying part 601 searches the word dictionary 602 for the word having the largest number of characters matching the recognition result, which improves “word-wise” recognition performance even if some characters of the recognized character string are not correctly recognized at the single-character recognizing part 401. Examples of character recognition apparatus having such a constitution are disclosed in Japanese Patent Application Laid-Open No. HEI-2-109187 titled “Post-Processing Method of Closely Written Addresses” (hereinafter called “reference 1”), and Japanese Patent Application Laid-Open No. HEI-5-114053 titled “Post-Character-Recognition Processing Method” (hereinafter called “reference 2”).
However, in verification-based methods such as those disclosed in reference 1 and reference 2, erroneous correction occurs more readily upon verification as inter-character contacts increase in the character string to be recognized or as the number of words to be recognized increases. This is because the character segmentation part 301 does not hypothesize in advance the number of characters to be segmented. In particular, when the original character string includes many inter-character contacts, many candidates for the word to be recognized are enumerated from the recognition results for the character candidate patterns obtained by segmentation, which makes it difficult to narrow the candidates down to the correct answer.
In contrast, as shown in F. Kimura et al., “A Lexicon Directed Algorithm for Recognition of Unconstrained Handwritten Words”, IEICE Trans. INF. and SYST., Vol. E77-D, No. 7 (1994.7) (hereinafter called “reference 3”), there exists a recognition method in which several words are hypothesized in advance for the original word to be recognized; character candidate patterns are generated for the respective hypothetical words by segmenting characters from the original word based on the number of characters included in each hypothetical word; and character recognition is performed individually for the respective character candidate patterns. How close the recognition result is to each hypothetical word is then decided from the magnitude of a word certainty level expressed as a sum or product of the reliability levels of the individual character recognition for each of the character candidate patterns. However, even this method has a defect: when there exist, for the word to be recognized, two similar hypothetical words that differ from each other in only a “single character”, these hypothetical words may not be distinguished from each other. This is because the decision is based on an evaluation value for the whole of the character string, so that the occurrence of lower certainty levels within the recognition result for the character string is not suitably considered.
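The word certainty level described for reference 3 can be illustrated with a minimal sketch. The function name `word_certainty` and the reliability values below are hypothetical and are not taken from reference 3; they only show how a whole-string score can hide one weak character.

```python
import math


def word_certainty(levels, mode="sum"):
    """Aggregate per-character reliability levels into one word-level score."""
    if mode == "sum":
        return sum(levels)
    if mode == "product":
        return math.prod(levels)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical reliability levels for two similar hypothetical words that
# differ only in the third character. Both are poorly supported there.
levels_a = [0.9, 0.9, 0.4, 0.9, 0.9]
levels_b = [0.9, 0.9, 0.3, 0.9, 0.9]

# The whole-string scores differ only slightly ...
margin = word_certainty(levels_a) - word_certainty(levels_b)
# ... although the weakest character of the "winning" word is itself unreliable.
weakest_a = min(levels_a)
```

Under these assumed values the two sums differ by about 0.1 out of roughly 4, while the weakest character of the higher-scoring word still has a level of only 0.4, which the whole-string score alone does not reveal.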
In each of the word recognition methods disclosed in references 1 through 3, those portions of the character candidate patterns obtained by character segmentation that are unrecognizable or have lower certainty levels are supplemented by word information derived from the suitably recognized portions. At this time, those portions that are unrecognizable or have a lower certainty level, i.e., that have been read-wise skipped, are not checked as to whether the characters supplemented by the word verification are really correct, resulting in the aforementioned misrecognition of similar words.
Meanwhile, as disclosed in Japanese Patent No. 2734386 titled “Character String Recognition Apparatus” (hereinafter called “reference 4”), in order to avoid misrecognition due to the aforementioned read-wise skipping, the read-wise skipped portions of the character candidate patterns are re-recognized by a method different from the initially utilized character recognition method. In this way, character recognition results can be prepared for all of the characters constituting the word, thereby eliminating ambiguous portions and reducing misreading. In this method, however, since the check for the read-wise skipped portions is performed by individual character recognition, it is necessary that the characters have been segmented character by character. For example, when the aforementioned re-recognition is to be performed on a portion where two characters are in contact, the portion must have been correctly segmented into two character candidate patterns. Otherwise, i.e., when the image corresponding to two characters has not been correctly divided into two patterns, misrecognition will occur.
In the conventional examples described above with respect to references 1 through 4, re-recognition as a check cannot be performed when the portion left ambiguous upon jointly using the word information, i.e., the skipped portion, includes a contact of two or more characters and has not been correctly segmented; the word verification then simply yields the most “similar” word. As a result, misrecognition is caused, for example when there exists another word that differs only in the portion that happened to be skipped. Similar problems arise in the character recognition techniques disclosed in Japanese Patent Application Laid-Open Nos. HEI-3-48379, HEI-3-154985, HEI-5-290217, HEI-7-192094 and Japanese Patent No. 2619499, in addition to the conventional examples explained with respect to references 1 through 4.
FIG. 10 shows a concrete example thereof. In this figure, the correct answer is the word “Hundred”. However, before the word verification is performed, the recognition result obtained using only the character segmentation means and the character recognizing means is “Th????d”. When the word verification is performed based on this result and the recognition target is a numerical word, the word “Thousand” is selected as the closest word, resulting in misreading. In this respect, the aforementioned reference 4 further checks, by another individual character recognition, the partial patterns corresponding to the portion “ousan” that was read-wise skipped by the verification. However, when the character segmentation is not correct, as in this example, the read-wise skipped portions are not correctly recognized.
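The misreading in the FIG. 10 example can be reproduced with a minimal sketch of word verification. It assumes that the "number of matched characters" is measured as the length of the longest common subsequence; this measure, the function names, and the dictionary entry "Million" are illustrative assumptions and are not specified in references 1 through 4.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def verify_word(recognized: str, dictionary: list[str]) -> list[tuple[str, int]]:
    """Rank dictionary words by matched-character count, best first.

    Rejected characters are written as "?" and never match a dictionary
    character, so they simply do not contribute to the count.
    """
    scored = [(word, lcs_length(recognized, word)) for word in dictionary]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = verify_word("Th????d", ["Hundred", "Thousand", "Million"])
```

With this measure, "Thousand" matches three characters ("T", "h", "d") and "Hundred" only one ("d"), so the verification places the wrong word first, as described for FIG. 10.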
The present invention has been made in view of the circumstances described above. It is therefore an object of the present invention to provide a character recognition apparatus, and a recording medium recorded with a character recognition program, capable of realizing word recognition with higher recognition accuracy and without ambiguity by newly realizing re-recognizing means independent of the instability of character segmentation, even when the portion read-wise skipped by the word verification includes two or more characters. It is another object of the present invention to provide a character recognition apparatus and a recording medium recorded with a character recognition program capable of accelerating processing speed.
According to the present invention, the character recognition apparatus is constituted to newly include: an n-fold-character recognizing part for collectively recognizing an unmatched portion without segmenting character candidate patterns character by character for an image of a read-wise skipped portion, i.e., the unmatched portion upon word verification; and an n-fold-character recognizing dictionary referred to by the n-fold-character recognizing part upon recognition; in order to realize verification independent of instability of character segmentation even when the portion read-wise skipped upon word verification includes two or more characters.
Namely, the first aspect of the present invention is a character recognition apparatus which is characterized by comprising: image storing means for inputting and storing a character string image; character segmentation means for producing character candidate patterns character by character for the character string image inputted into the image storing means, and for detecting a character-contacting portion in the character string image to thereby estimate the number of characters in the character-contacting portion; single-character recognition means for deciding character codes for the character candidate patterns character by character produced by the character segmentation means, and for outputting certainty levels of the character codes; a single-character recognizing dictionary to be used by the single-character recognition means for decision; n-fold-character recognition means for deciding character codes corresponding to n pieces of characters, when the character-contacting portion detected by the character segmentation means is estimated to include the n pieces of characters; an n-fold-character recognizing dictionary to be used by the n-fold-character recognition means for decision; and controlling means for controlling the image storing means, the character segmentation means, the single-character recognition means and the n-fold-character recognition means.
The character recognition apparatus may be constituted to further comprise: word verification means for outputting verified words in a descending order of a matching number of characters for the registered candidate words, in accordance with the single-character recognition results obtained by the single-character recognition means and the n-fold-character recognition results obtained by the n-fold-character recognition means; and a word dictionary to be used by the word verification means upon verification.
It is preferable that: the controlling means includes conducting/adopting means; and when the reliability levels outputted by said single-character recognition means for character candidate patterns character by character obtained by the character segmentation means are low and the character candidate patterns have been obtained from the character-contacting portion detected by the character segmentation means, the conducting/adopting means conducts an n-fold-character recognition operation for the character-contacting portion including the character candidate patterns, and adopts the recognition result of the thus conducted n-fold-character recognition operation.
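The behavior of the conducting/adopting means can be sketched as follows. The `Candidate` structure, the threshold value, the use of the group size as the estimated number of characters n, and the `nfold_recognize` callback are all hypothetical stand-ins introduced for illustration; the actual means are defined only functionally in the text above.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Optional


@dataclass
class Candidate:
    code: str                  # character code decided by single-character recognition
    certainty: float           # certainty level of that decision
    contact_id: Optional[int]  # id of the character-contacting portion, None if isolated


def recognize_string(candidates, nfold_recognize, threshold=0.5):
    """Adopt single-character results, except for low-certainty contacting portions.

    nfold_recognize(contact_id, n) is assumed to return a (codes, certainty)
    pair for the whole contacting portion estimated to hold n characters.
    """
    results = []
    # Consecutive candidates from the same contacting portion form one group.
    for contact_id, group in groupby(candidates, key=lambda c: c.contact_id):
        members = list(group)
        low = any(m.certainty <= threshold for m in members)
        if contact_id is not None and low:
            # Re-recognize the whole portion collectively and adopt that result.
            results.append(nfold_recognize(contact_id, len(members)))
        else:
            results.extend((m.code, m.certainty) for m in members)
    return results
```

Isolated characters and contacting portions whose single-character results are all reliable keep their single-character results, so the n-fold recognizer is called only where it is needed.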
It is also possible that the controlling means includes adopting/stopping means, sending means, and means for conducting, as follows. The adopting/stopping means operates when there exists, among the verification results obtained from the word verification means, a verification result completely corresponding to the recognition result outputted from the single-character recognition means for the character candidate patterns obtained character by character by the character segmentation means: it examines the certainty levels of all of the character-by-character recognition results for the character string image, adopts the recognition results outputted from the single-character recognition means if all the certainty levels are larger than a predetermined value, and otherwise stops the recognition for the inputted character string image. The sending means operates when there exists, among the verification results obtained from the word verification means, no verification result completely corresponding to that recognition result: it sends the partial character string images corresponding to the unmatched portions in the single-character recognition results upon word verification to the n-fold-character recognition means, together with the estimated number of characters and the estimated character codes of the unmatched portions, in descending order of the number of matching characters of the tentatively verified words. The means for conducting operates such that: the n-fold-character recognition means re-recognizes the partial character string images sent from the sending means and outputs the recognition results and certainty levels therefor; when each of the certainty levels for the pertinent recognition results is larger than a predetermined value, the pertinent word being verified is kept as a final candidate, and an evaluation value is calculated for the whole character string being verified, making use of the certainty level outputted by the single-character recognition means for the portion for which the single-character recognition means has been activated and of the certainty level outputted by the n-fold-character recognition means for the portion for which the n-fold-character recognition means has been activated; and the word having the largest evaluation value among the words kept as final candidates is adopted as the recognition result for the inputted character string.
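The branch in which no verified word completely corresponds to the single-character result can be sketched as below. The text does not state how the evaluation value combines the certainty levels, so a plain sum is assumed here; the function name, the threshold, and the representation of an unmatched portion by its estimated character codes are likewise hypothetical.

```python
def select_word(verified, nfold_certainty, threshold=0.5):
    """Pick the final word among verified candidates, or None if none survives.

    verified: list of (word, matched_levels, unmatched_portions), assumed to be
        in descending order of matched-character count. matched_levels are the
        single-character certainty levels of the matched characters, and each
        unmatched portion is represented by its estimated character codes.
    nfold_certainty(portion): assumed to return the certainty level obtained
        when the n-fold recognizer re-recognizes the corresponding partial image.
    """
    finalists = []
    for word, matched_levels, unmatched_portions in verified:
        nfold_levels = [nfold_certainty(p) for p in unmatched_portions]
        # Keep the word only if every re-recognized portion is reliable.
        if all(level > threshold for level in nfold_levels):
            evaluation = sum(matched_levels) + sum(nfold_levels)  # assumed: sum
            finalists.append((evaluation, word))
    if not finalists:
        return None
    return max(finalists)[1]
```

For the "Th????d" example, a candidate "Thousand" would be dropped if the image of its unmatched portion does not re-recognize reliably as "ousan", while "Hundred" survives if the image re-recognizes reliably as "Hundre", even though "Hundred" has fewer matched characters.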
Moreover, it is preferable that the character segmentation means is constituted to include means for detecting a character boundary between contiguous characters upon producing the character candidate patterns character by character from the character-contacting portion, making use of a transition of run lengths of contiguous runs upon tracing black pixels in a vertical direction relative to a character string direction for black pixels of the character-contacting portion.
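One possible reading of the run-length criterion is sketched below, purely as an illustration. It assumes a binary image given as a list of rows (1 = black), takes the longest vertical black run in each column, and proposes as boundary candidates the columns where that run length is a strict local minimum. The function names and the local-minimum rule are assumptions; the text only states that a transition of run lengths is used.

```python
def column_run_lengths(image):
    """Longest vertical run of black pixels (value 1) in each column."""
    height = len(image)
    width = len(image[0]) if image else 0
    lengths = []
    for x in range(width):
        longest = current = 0
        for y in range(height):
            if image[y][x]:
                current += 1
                longest = max(longest, current)
            else:
                current = 0
        lengths.append(longest)
    return lengths


def boundary_candidates(lengths):
    """Columns that still contain ink but whose run is shorter than both neighbors."""
    return [
        x
        for x in range(1, len(lengths) - 1)
        if 0 < lengths[x] < lengths[x - 1] and lengths[x] < lengths[x + 1]
    ]


# Two thick strokes joined by a thin horizontal bridge in the middle column.
image = [
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
]
lengths = column_run_lengths(image)
candidates = boundary_candidates(lengths)
```

In this toy image the run length drops from 3 to 1 at the bridge and returns to 3, so the middle column is the only boundary candidate.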
Alternatively, it is preferable that the character segmentation means is constituted to include means for detecting a character boundary between contiguous characters upon producing the character candidate patterns character by character from the character-contacting portion, making use of a maximum point and a minimum point in a vertical direction relative to a character string direction for a character stroke of the character-contacting portion.
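The use of maximum and minimum points can likewise be illustrated with a small sketch. It assumes a non-empty binary image (1 = black), measures the stroke height of each column from the topmost black pixel, and lists strict local maxima and minima of that profile; treating a minimum lying between two maxima as a boundary candidate is an assumption made for illustration only.

```python
def top_profile(image):
    """Stroke height per column: image height minus row of the topmost black pixel."""
    height = len(image)
    profile = []
    for x in range(len(image[0])):
        top = next((y for y in range(height) if image[y][x]), height)
        profile.append(height - top)  # 0 for a blank column
    return profile


def local_extrema(profile):
    """Return (maxima, minima) as lists of column indices of strict local extrema."""
    maxima, minima = [], []
    for x in range(1, len(profile) - 1):
        left, mid, right = profile[x - 1], profile[x], profile[x + 1]
        if mid > left and mid > right:
            maxima.append(x)
        elif mid < left and mid < right:
            minima.append(x)
    return maxima, minima


# Two peaks with a valley between them.
image = [
    [0, 1, 0, 1, 0],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]
profile = top_profile(image)
maxima, minima = local_extrema(profile)
```

Here the profile is [2, 3, 1, 3, 2], giving maxima at columns 1 and 3 and a minimum at column 2, which would be the candidate boundary under the stated assumption.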
Further, it is preferable that the n-fold-character recognition means is constituted to include: means: for conducting recognition for the whole of the character-contacting portion without dividing the character-contacting portion into character candidate patterns character by character; and for outputting a certainty level of the recognition result thereof; while storing a recognizing dictionary for the whole recognition, in the n-fold-character recognizing dictionary.
It is preferable that the n-fold-character recognition means is constituted to include: means: for assuming a rectangle surrounding a partial character string image corresponding to the character-contacting portion upon conducting recognition for the whole of the character-contacting portion without dividing the character-contacting portion into character candidate patterns character by character; for setting an area having a height corresponding to the height of the rectangle, and a width equal to or smaller than a single character width; for conducting recognition making use of a transition of a feature within the area, the transition being obtained for each movement of the area from one end of the rectangle toward the other end thereof; and for outputting a certainty level of the thus obtained recognition result; while storing a recognizing dictionary therefor, in the n-fold-character recognizing dictionary.
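The moving-area scheme can be sketched as follows. The feature used here, black-pixel density inside a full-height window, is an assumed stand-in, since the text does not specify the feature; the function name and the `step` parameter are also hypothetical. Matching the resulting feature sequence against the n-fold-character recognizing dictionary is not shown.

```python
def window_features(image, window_width, step=1):
    """Feature transition obtained by sliding a full-height window left to right.

    image: binary rows (1 = black) cropped to the rectangle surrounding the
        character-contacting portion.
    window_width: width of the area, at most a single character width.
    Returns one black-pixel density per window position.
    """
    height = len(image)
    width = len(image[0])
    features = []
    for left in range(0, width - window_width + 1, step):
        black = sum(
            image[y][x]
            for y in range(height)
            for x in range(left, left + window_width)
        )
        features.append(black / (height * window_width))
    return features


image = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]
sequence = window_features(image, window_width=2)
```

Because the sequence is computed over the whole rectangle, no character boundary inside the contacting portion has to be decided before recognition.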
In this way, it becomes possible to realize word recognition with higher recognition accuracy and without ambiguity, by newly realizing re-recognizing means independent of the instability of character segmentation even when the portion read-wise skipped by the word verification includes two or more characters, and to accelerate processing speed.
Namely, as a first effect of the present invention, recognition performance for a character string including character contiguity or character agglomeration is remarkably improved. This is because the unmatched portion upon word verification is recognized by the n-fold-character recognizing part, which is a feature of the present invention, so that correct recognition is achieved even when the unmatched portion includes contiguous characters and the ambiguity can be resolved. As a result, misreading of character strings due to misrecognition of character candidate patterns obtained by erroneous character segmentation, as mentioned in relation to the prior art, can be drastically suppressed.
Next, as a second effect of the present invention, processing speed can be accelerated. Although n-fold-character recognition is a feature of the character recognition apparatus of the present invention, recognizing the whole of the character string by n-fold-character recognition alone would require an extremely long time compared to a procedure in which the character string is first divided into character candidate patterns character by character by character segmentation and character recognition is then conducted character by character. In the character recognition apparatus of the present invention, however, the n-fold-character recognition is applied only to the applicable portions, namely when single-character recognition yields a recognition result having a lower reliability level (first embodiment) or when an ambiguous portion remains as a result of word verification, in both cases after character segmentation has been conducted. Thus, the amount of processing can be drastically reduced compared to a situation where the n-fold-character recognition is applied to the whole of the inputted character string.
As a result, the character recognition apparatus of the present invention compares favorably with an integrating method in which a recognition procedure based on a character segmenting operation and a recognition procedure conducted collectively for the whole of the character string are executed independently for an inputted character string, and the recognition results of both procedures are later integrated to decide a final recognition result. In such an integrating method, the latter procedure, conducted collectively for the whole of the character string, becomes a bottleneck even if both procedures are processed in parallel. By contrast, the character recognition apparatus of the present invention performs the n-fold-character recognition by presuming a required minimum number of characters, and also by presuming character codes in the case of word recognition, so that the recognition result can be obtained at an extremely high speed.
The second aspect of the present invention is a machine readable recording medium recorded with a character string recognition program to be executed on a computer, the character string recognition program being characterized by comprising the steps of: inputting and storing a character string image; producing character candidate patterns character by character for the inputted character string image, and detecting a character-contacting portion in the character string image to thereby estimate the number of characters in the character-contacting portion; deciding character codes for the produced character candidate patterns character by character, and outputting certainty levels of the character codes; when each of these certainty levels is smaller than or equal to a predetermined value and the corresponding character candidate patterns are involved in the character-contacting portion, deciding character codes corresponding to n pieces of characters for the character-contacting portion, and outputting a certainty level therefor; and when each of the certainty levels of the single-character recognition results and the n-fold-character recognition results is larger than a predetermined value, outputting the recognition results as a recognition result for the inputted character string.
Alternatively, the machine readable recording medium is recorded with a character string recognition program to be executed on a computer, the character string recognition program comprising: a step for inputting and storing a character string image; a step for producing character candidate patterns character by character for the inputted character string image, and detecting a character-contacting portion in the character string image to thereby estimate the number of characters in the character-contacting portion; a step for deciding character codes for the produced character candidate patterns character by character, and outputting certainty levels of the character codes; a word verification step for outputting verified words in a descending order of a matching number of characters for the registered candidate words, and outputting the unmatched portions for the respective verified words; an n-fold-character recognition step: for obtaining, from the inputted image, partial images corresponding to the unmatched portions for the respective verified words obtained from the word verification step; for deciding character codes corresponding to a plurality of characters, for which a number of characters and character codes are estimated; and for outputting certainty levels of the decided character codes; and a step for outputting, when each of the certainty levels of the single-character recognition results and the n-fold-character recognition results is larger than a predetermined value, the recognition results as a recognition result for the inputted character string.