1. Field of the Invention
This invention relates to a character recognizing system which recognizes characters by making use of character arrangement types in order to effectively and reliably recognize handwritten characters (numerals) which include attached symbols such as a bar and continuous numerals contacting each other.
2. Description of the Prior Art
One conventional method of character recognition is disclosed, for example, in Japanese Laid-Open Patent No. 233585/1989, proposed by the present applicant. That method proceeds as follows. First, the character data are subjected to pre-processing for smoothing and thinning and to line drawing structure analysis, and the processed data are then subjected to whole-view structural recognition in order to discriminate and extract a plurality of recognition candidate blocks, each containing a character. Character recognition is then carried out for each of the blocks. For blocks in which the data cannot be recognized as a character, segmentation, line drawing structure analysis or restructuring of the data is repeated until recognition of the data is completed.
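The control flow of such a block-wise method can be illustrated by the following minimal sketch. It is not the procedure of the cited publication: the callbacks `recognize` and `resegment`, the retry limit `max_rounds` and the use of text strings in place of image blocks are all assumptions introduced for illustration, and the pre-processing steps (smoothing, thinning) are omitted.

```python
def recognize_blocks(blocks, recognize, resegment, max_rounds=3):
    """Recognize each block in order; re-segment and retry blocks that fail.

    recognize(block) -> a character, or None on failure (hypothetical callback).
    resegment(block) -> a list of smaller blocks (hypothetical callback).
    max_rounds is an assumed safety limit so that the loop always terminates.
    """
    results = []
    pending = [(block, 0) for block in blocks]
    while pending:
        block, rounds = pending.pop(0)
        char = recognize(block)
        if char is not None:
            results.append(char)
        elif rounds < max_rounds:
            # Put the re-segmented pieces back at the front to keep reading order.
            pieces = resegment(block)
            pending = [(piece, rounds + 1) for piece in pieces] + pending
        else:
            results.append("?")  # give up on this block
    return results


# Toy stand-ins: a "block" is a string, and only a single digit is recognizable.
def toy_recognize(block):
    return block if len(block) == 1 and block.isdigit() else None


def toy_resegment(block):
    return list(block)


print(recognize_blocks(["1", "23", "4"], toy_recognize, toy_resegment))
```

In this toy run the block "23" fails at first, is split into "2" and "3", and both pieces are then recognized in their original order.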
In the conventional method of recognizing handwritten characters described above, however, it has been impossible to deal correctly with a case such as the U.S. check shown in FIG. 1. FIG. 1 shows only the write-in portion of the check. As the handwritten characters shown by way of example in FIG. 1 are progressively recognized, they are finally divided into the blocks BL1 to BL8 as shown in FIG. 3. If each block is then subjected to character recognition, blocks BL1, BL2, BL3, BL5, BL7 and BL8 can be recognized, but blocks BL4 and BL6 cannot be recognized as they should be and are misrecognized. More specifically, block BL4 is erroneously recognized as the numeral "1", and block BL6 is likewise recognized as the numeral "1".
In addition, even if information for recognizing symbols such as the cent-bar, the cent-point and the like is added to the aforementioned conventional method, it is impossible to discriminate between the numeral "1" and the cent-bar in the case of block BL4. Likewise, it is impossible to discriminate between the numeral "1" and the cent-bar in the case of block BL6. As a result, even when the blocks are properly extracted, character recognition cannot be carried out if the character of a selected block admits two or more possible readings.
In the case of the U.S. check shown in FIG. 1, the format in which the handwritten characters are filled in consists of the dollar-order, the cent-order, the cent-bar, the cent-mark and so on and, although it varies between individuals, can mostly be classified into several types. In fact, the handwritten characters on a U.S. check generally belong to one of the types A to F as shown in FIGS. 4A to 4G. It should further be noted that this feature applies not only to the U.S. check but also to other cases in which the formats are limited to a definite number of kinds.
A method shown in FIGS. 5A and 5B has conventionally been known for detecting a bar included in character information in order to recognize that information. As shown in FIG. 5A, a horizontal, straight bar 100 is written in advance as a reference line, and characters 101 are to be written above the line. In recognizing the characters 101 written above the bar 100, the number of dots along each horizontal line is first counted at successive positions in the vertical direction, a histogram of the dot counts over the vertical direction is prepared as shown in FIG. 5B, and the part with an extremely large number of dots is recognized as the bar 100 of the reference line. The characters 101 above the line are then discriminated with reference to it.
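The histogram approach can be illustrated by the following minimal sketch. It assumes a binary image held as a list of rows of 0/1 values. The function names and the threshold `ratio` (the fraction of the image width that a row must fill to be taken as the bar) are assumptions introduced for illustration and do not appear in FIGS. 5A and 5B.

```python
def row_histogram(image):
    """Count the black dots (1s) in each horizontal row of a binary image."""
    return [sum(row) for row in image]


def find_reference_bar(image, ratio=0.8):
    """Return the indices of rows whose dot count is at least `ratio` of the width.

    `ratio` is an assumed threshold standing in for "extremely large number of dots".
    """
    width = len(image[0])
    histogram = row_histogram(image)
    return [y for y, count in enumerate(histogram) if count >= ratio * width]


# Two crude character shapes written above a full-width horizontal bar (row 6).
image = [
    [0, 0, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]

print(row_histogram(image))       # [3, 4, 2, 2, 5, 0, 10]
print(find_reference_bar(image))  # [6]
```

Only the row containing the pre-written bar exceeds the threshold, so the rows above it can be treated as the character region.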
Further, a block in which a bar is in contact with a numeral commonly occurs, as shown in FIG. 17, which shows a write-in portion of a U.S. check. In order to recognize the numeral information, it is necessary to detect the bar in such blocks and extract the numerals. According to the separating method described above, since a straight bar is written horizontally in advance as the reference line, the numeral can always be detected by forming the histogram of horizontal dot counts. However, a handwritten bar is not always horizontal, and moreover the length of the bar varies in accordance with the size of the character, so that it is difficult to discriminate between the character and the bar from the histogram.
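The effect of a non-horizontal bar on the histogram can be illustrated by the following minimal sketch. The helper `draw_bar`, the image size and the slope are assumptions chosen for illustration. The sketch draws a bar of the same horizontal extent twice, once horizontal and once slanted, and compares the row histograms.

```python
def row_histogram(image):
    """Count the black dots (1s) in each horizontal row of a binary image."""
    return [sum(row) for row in image]


def draw_bar(width, height, y_start, cols_per_step):
    """Draw a one-dot-thick bar across the full width of an empty image.

    The bar rises by one row every `cols_per_step` columns;
    cols_per_step == 0 gives a perfectly horizontal bar (hypothetical helper).
    """
    image = [[0] * width for _ in range(height)]
    for x in range(width):
        rise = x // cols_per_step if cols_per_step else 0
        image[y_start - rise][x] = 1
    return image


flat = draw_bar(width=12, height=6, y_start=4, cols_per_step=0)
slanted = draw_bar(width=12, height=6, y_start=4, cols_per_step=3)

print(row_histogram(flat))     # [0, 0, 0, 0, 12, 0]
print(row_histogram(slanted))  # [0, 3, 3, 3, 3, 0]
```

The horizontal bar produces a single peak of 12 dots, whereas the slanted bar spreads the same 12 dots over four rows of 3 dots each, a count no larger than that of an ordinary character stroke.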
Even if the method disclosed in Japanese Laid-Open Patent No. 233585/1989, as shown in FIGS. 1 to 3, is used to recognize the characters on the check shown in FIG. 17, it is impossible to recognize a block comprising two characters. However, it is possible to find features for classifying written forms even if a block in which the bar and the numeral are in contact is included in the numeral information.
Prior art methods for recognizing numeral patterns include the systems disclosed by the present inventor in, for example, Japanese Laid-Open Patent No. 116781/1989 and Japanese Laid-Open Patent No. 116782/1989. With these systems, there may be cases where continuous numerals cannot be segmented when they are read by structural analysis. The alternative of preparing and registering two-continuous patterns in advance, so that two continuous numerals are read and recognized collectively, involves an increase in the number of patterns, and registering so many patterns beforehand does not necessarily lead to ready recognition in many cases.
As a method of recognizing continuous numerals, the inventor has proposed the method for separating contact characters disclosed in Japanese Laid-Open Patent No. 121988/1989. However, that method has an unavoidable defect: since the procedure shifts immediately to recognition of the remaining block once a numeral has been recognized, the segmentation of blocks according to this method does not prevent an erroneous recognition result, if any, from being output immediately.