Characters carried on a medium such as a document are recognized by sequentially performing the steps of: extracting a character string to be subjected to character recognition; detecting the orientation of the extracted character string, to thereby determine whether the character string is in an upright state or in an inverted state; and segmenting the character string into individual characters so as to perform character recognition. In order to enable recognition of such characters at high speed and with high accuracy, there has been demanded a technique which enables the above-described orientation detection processing to be performed at high speed and with high accuracy.
Conventionally, the following method has been used in order to detect the orientation (upright or inverted) of a character string. That is, first, character recognition is performed on the assumption that the character string is in an upright state; an evaluation value (number of points) in relation to the recognition result of each character is obtained, and an average or a like value of the evaluation values of the respective characters is calculated in order to obtain an overall evaluation value. Subsequently, character recognition is performed on the assumption that the character string is in an inverted state (rotated by 180 degrees); an evaluation value in relation to the recognition result of each character is obtained, and an average or a like value of the evaluation values of the respective characters is calculated in order to obtain a second overall evaluation value. Finally, the two overall evaluation values are compared, and the assumption which provides the higher recognition rate is adopted, whereby the character string is judged to be in an upright state or in an inverted state.
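The two-pass comparison described above can be illustrated by the following minimal sketch. It is not taken from any actual implementation. The names `Glyph`, `Recognizer`, `rotate_180`, `overall_value`, `detect_orientation_conventional`, `TEMPLATES`, and `toy_recognize` are hypothetical, the recognizer is a toy template matcher used only to make the sketch runnable, and the choice of the upright state when the two values are equal is an assumption.

```python
from statistics import mean
from typing import Callable, List, Tuple

Glyph = List[List[int]]                            # binary character image (rows of 0/1)
Recognizer = Callable[[Glyph], Tuple[str, float]]  # returns (label, evaluation value)


def rotate_180(glyph: Glyph) -> Glyph:
    """Rotate a character image by 180 degrees."""
    return [row[::-1] for row in reversed(glyph)]


def overall_value(glyphs: List[Glyph], recognize: Recognizer) -> float:
    """Average the per-character evaluation values into an overall value."""
    return mean(recognize(g)[1] for g in glyphs)


def detect_orientation_conventional(glyphs: List[Glyph], recognize: Recognizer) -> str:
    # Pass 1: recognize on the assumption that the string is upright.
    upright_value = overall_value(glyphs, recognize)
    # Pass 2: recognize on the assumption that the string is inverted,
    # i.e. undo a 180-degree rotation (reverse the order, rotate each glyph).
    restored = [rotate_180(g) for g in reversed(glyphs)]
    inverted_value = overall_value(restored, recognize)
    # Assumption: a tie is resolved in favor of the upright state.
    return "upright" if upright_value >= inverted_value else "inverted"


# Hypothetical toy recognizer: nearest template by fraction of matching pixels.
TEMPLATES = {
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}


def toy_recognize(glyph: Glyph) -> Tuple[str, float]:
    def similarity(a: Glyph, b: Glyph) -> float:
        cells = [pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
        return sum(cells) / len(cells)

    scored = ((label, similarity(glyph, t)) for label, t in TEMPLATES.items())
    return max(scored, key=lambda pair: pair[1])


word = [TEMPLATES["T"], TEMPLATES["L"]]
flipped = [rotate_180(g) for g in reversed(word)]
print(detect_orientation_conventional(word, toy_recognize))     # upright
print(detect_orientation_conventional(flipped, toy_recognize))  # inverted
```

In this sketch the recognizer is invoked once per character in each pass, so a string of N characters requires 2N recognition operations before the orientation is known.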
As described above, the conventional technique employs a configuration such that the orientation (upright or inverted) of a character string is detected through character recognition. Therefore, when the conventional technique is employed, character recognition must be performed on the entire character string twice, once for each assumed orientation, and a problem of an increased load imposed on a CPU (Central Processing Unit) arises.
Further, the conventional technique involves a problem such that detection accuracy decreases when characters to undergo character recognition are English characters (letters) or like characters. That is, English characters include characters which assume the same (or substantially the same) shape even when rotated by 180 degrees, such as “H,” “I,” “N,” “O,” “S,” “X,” and “Z.” Further, as can be seen in the case of “M” and “W,” English characters include characters which, when rotated by 180 degrees, assume shapes very similar to those of other characters. Accordingly, due to the above-described properties, English characters provide a relatively high recognition rate even when the assumed orientation of the character string is erroneous, so that the two overall evaluation values differ only slightly. Therefore, the conventional technique encounters a problem in that the accuracy in detecting the orientation of a character string decreases when characters to be recognized are English characters or other characters having the above-described properties.
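The ambiguity described above can be illustrated by the following minimal sketch, which uses only the letters named in the preceding paragraph. The names `ROTATED`, `read_inverted`, and `ambiguous_fraction` are hypothetical, and the sketch idealizes "substantially the same shape" as an exact match, which holds only approximately in actual fonts.

```python
from typing import Optional

# Letters which, when rotated by 180 degrees, read as themselves
# or as another letter (idealized; taken from the examples above).
ROTATED = {
    "H": "H", "I": "I", "N": "N", "O": "O",
    "S": "S", "X": "X", "Z": "Z",
    "M": "W", "W": "M",
}


def read_inverted(text: str) -> Optional[str]:
    """Return the string that would be read if `text` were rotated by
    180 degrees, or None if some letter has no rotated counterpart."""
    if any(ch not in ROTATED for ch in text):
        return None
    # Rotation reverses the character order and rotates each character.
    return "".join(ROTATED[ch] for ch in reversed(text))


def ambiguous_fraction(text: str) -> float:
    """Fraction of characters that remain legible letters when inverted."""
    if not text:
        return 0.0
    return sum(ch in ROTATED for ch in text) / len(text)


print(read_inverted("NOON"))        # NOON
print(read_inverted("SWIM"))        # WIMS
print(read_inverted("HELLO"))       # None
print(ambiguous_fraction("HELLO"))  # 0.4
```

A string such as "NOON" or "SWIM" yields plausible letters under both assumed orientations, so the per-character evaluation values give little basis for distinguishing the upright state from the inverted state.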
The present invention was accomplished in view of the foregoing, and an object of the present invention is to provide a character-recognition pre-processing apparatus which can detect the orientation (upright or inverted) of a character string at high speed and with high accuracy.
Another object of the present invention is to provide a character-recognition pre-processing method which can detect the orientation (upright or inverted) of a character string at high speed and with high accuracy.
Still another object of the present invention is to provide a program recording medium which stores a program used for realization of a character-recognition pre-processing apparatus which can detect the orientation (upright or inverted) of a character string at high speed and with high accuracy.