In character-recognition processing, individual characters are generally cut out from a grayscale image generated by imaging a character string, and matching processing (model matching) is performed on each cut-out character using various character models to recognize the content of the character string. In the character cutout processing, binarized data or grayscale data of the processing target image is projected in the x- and y-axis directions, and the portion corresponding to a character is extracted from the projection pattern generated on each axis, thereby specifying the region (hereinafter referred to as a “character region”) corresponding to each individual character.
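By way of illustration only, the projection-based cutout described above can be sketched as follows. The sketch assumes a binarized image given as a list of rows in which 1 denotes a character pixel and 0 denotes background; the function names `projection`, `runs`, and `character_regions` and the `threshold` parameter are hypothetical and are not taken from any cited document.

```python
from typing import List, Tuple


def projection(binary: List[List[int]], axis: str) -> List[int]:
    """Count character pixels projected onto one axis.

    axis == "x": one value per column (projection onto the x axis).
    axis == "y": one value per row (projection onto the y axis).
    """
    if axis == "x":
        return [sum(col) for col in zip(*binary)]
    return [sum(row) for row in binary]


def runs(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return half-open intervals [start, end) where profile > threshold."""
    spans = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of character pixels begins
        elif value <= threshold and start is not None:
            spans.append((start, i))       # the run ends at a gap
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def character_regions(binary: List[List[int]]) -> List[Tuple[int, int, int, int]]:
    """Return (x0, x1, y0, y1) for each region found by projection."""
    regions = []
    for y0, y1 in runs(projection(binary, "y")):      # extent of each text line
        line = binary[y0:y1]
        for x0, x1 in runs(projection(line, "x")):    # extent of each character
            regions.append((x0, x1, y0, y1))
    return regions


image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
regions = character_regions(image)
```

In this example the two narrow strokes at columns 4 and 6 are returned as separate regions, because any empty column in the x-axis projection is treated as a boundary between characters.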
In order to ensure the accuracy of the matching processing, it is necessary to specify the character region for each recognition target character. However, for a composite character in which independent character elements are arrayed in the width direction of the character string, the character elements are sometimes cut out individually, and the matching processing is then performed falsely on the separated elements.
To address this, Patent Document 1 describes that, in the case that a character candidate having high reliability as a match for the left-hand side of a previously-learned Chinese character is extracted, it is determined that a false cutout has been performed, and the character candidate suitable for the left-hand side of the Chinese character and the next character candidate are newly cut out as one character (see Paragraph No. 0033 and the like).
According to Patent Document 2, after the characters are cut out, tentative matching processing is performed to calculate matching reliability, a standard character length of a full-width character is decided based on the character candidates satisfying the condition that the matching reliability is greater than or equal to a predetermined reference value, and all the recognition target characters are then cut out based on the decided standard character length to perform final matching processing (see claim 1 and the like). Additionally, in the case that a character produced by a combination of two component characters (for example, the Chinese characters “” produced by a combination of “” and “”) is extracted in the tentative matching processing, that character is not used to decide the standard character length (see Paragraph Nos. 0015 to 0022, 0061 to 0068, 0089, and the like).
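By way of illustration only, the selection of candidates for deciding the standard character length can be sketched as follows. The `Candidate` fields, the `is_composite` flag, the reference value of 0.8, and the use of the median are all assumptions made for this sketch; the passage above does not state how the standard character length is calculated from the selected candidates.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Candidate:
    width: int           # width of the cut-out region, in pixels
    reliability: float   # matching reliability from the tentative matching
    is_composite: bool   # hypothetical flag: producible from two component characters


def standard_character_length(
    candidates: List[Candidate], reference: float = 0.8
) -> Optional[float]:
    """Estimate a standard character length from reliable candidates.

    Candidates below the reference value and candidates flagged as
    composite are excluded. The median is an assumed estimator.
    """
    widths = [
        c.width
        for c in candidates
        if c.reliability >= reference and not c.is_composite
    ]
    return median(widths) if widths else None


candidates = [
    Candidate(width=20, reliability=0.90, is_composite=False),
    Candidate(width=22, reliability=0.95, is_composite=False),
    Candidate(width=10, reliability=0.50, is_composite=False),  # below the reference value
    Candidate(width=21, reliability=0.92, is_composite=True),   # excluded as composite
]
length = standard_character_length(candidates)
```

Returning `None` when no candidate qualifies leaves it to the caller to decide how to proceed, which is a design choice of this sketch.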
Patent Document 1: Japanese Unexamined Patent Publication No. 1997-282417
Patent Document 2: Japanese Unexamined Patent Publication No. 2010-44485