The present invention generally relates to character recognition methods, and more particularly to a character recognition method for recognizing a character even when the character is rotated by an arbitrary angle from a regular position.
In the present specification, the regular position or orientation of a character refers to a customary upright position of the character so that it can be read in a normal manner.
Generally, when recognizing a character, a document image is scanned and character recognition is performed on each row of characters extracted from the scanned document image. For example, in the case of Japanese characters, the characters may be written from left to right of a document 1A as shown in FIG. 1A or from top to bottom of a document 1B as shown in FIG. 1B. In the case of FIG. 1B, each character is rotated 90 degrees counterclockwise compared to the corresponding character in FIG. 1A. The extracted characters may thus be oriented at any of several angles relative to the regular position, such as 0, 90, 180 and 270 degrees.
In other words, the document image which is scanned may contain a portion written along one direction and another portion written along another direction. FIG. 2 shows such a document 2 wherein the characters are written from left to right and top to bottom in a portion 2a and the characters are written from bottom to top and left to right in a portion 2b. Furthermore, a skew may be introduced when the document image is scanned in a state where the document is not correctly positioned, whereby the characters in the document image are rotated by a certain angle with reference to a predetermined regular position of the characters.
Conventionally, the dictionary which is used when recognizing the scanned character contains features of characters in the regular position, and recognition is performed by comparing a feature extracted from the scanned character with the features contained in the dictionary. For this reason, when the scanned character is not in the predefined regular position and is rotated by an angle from that regular position, it is impossible to recognize the scanned character by use of the dictionary, because the dictionary does not contain features of characters in positions rotated from the regular position.
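The limitation described above may be illustrated by the following minimal sketch. The feature used here (black-pixel counts per row and per column), the 4x4 glyphs, the distance measure and the rejection threshold `max_distance` are all assumptions made for illustration; the present specification does not prescribe any of them.

```python
def extract_feature(glyph):
    """Hypothetical feature: black-pixel counts per row, then per column."""
    row_counts = [sum(row) for row in glyph]
    col_counts = [sum(col) for col in zip(*glyph)]
    return tuple(row_counts + col_counts)


def rotate90_ccw(glyph):
    """Rotate a binary image 90 degrees counterclockwise."""
    return [list(row) for row in zip(*glyph)][::-1]


def recognize(glyph, dictionary, max_distance=2):
    """Return the label of the nearest dictionary feature, or None if too far."""
    feature = extract_feature(glyph)
    best_label, best_distance = None, None
    for label, reference in dictionary.items():
        # City-block distance between the scanned and the stored feature.
        distance = sum(abs(a - b) for a, b in zip(feature, reference))
        if best_distance is None or distance < best_distance:
            best_label, best_distance = label, distance
    if best_distance is None or best_distance > max_distance:
        return None  # no dictionary entry is close enough
    return best_label


# Illustrative glyphs in the regular (upright) position.
GLYPH_L = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
GLYPH_T = [
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

# The dictionary holds features of upright characters only.
DICTIONARY = {"L": extract_feature(GLYPH_L), "T": extract_feature(GLYPH_T)}

upright_result = recognize(GLYPH_L, DICTIONARY)
rotated_result = recognize(rotate90_ccw(GLYPH_L), DICTIONARY)
```

In this sketch the upright glyph matches its dictionary entry exactly, whereas the same glyph rotated by 90 degrees yields a feature that lies beyond the rejection threshold of every entry and is therefore not recognized.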
Therefore, it is conceivable to provide a plurality of dictionaries and store in each dictionary the features of the characters in a position rotated by a predetermined angle from the regular position. For example, when the scanned character is rotated by 90 degrees from the regular position, the character recognition can be accomplished by use of a dictionary which contains the features of characters rotated by 90 degrees from the predefined regular position. However, in this case, a dictionary must be provided for each possible rotation angle of the characters from the regular position, and there is a problem in that a large number of dictionaries must be provided for the same characters at different rotation angles. As a result, an extremely large memory capacity is required by the dictionaries, rendering this method impractical.
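The growth in memory capacity may be estimated with simple arithmetic, as in the following sketch. The character count and the feature size are hypothetical figures chosen only for illustration and are not taken from the present specification; the sketch assumes one stored feature per character per rotation angle.

```python
def dictionary_bytes(num_characters, bytes_per_feature, num_angles=1):
    """Total storage for one feature per character per rotation angle."""
    return num_characters * bytes_per_feature * num_angles


# Hypothetical figures, for illustration only.
NUM_CHARACTERS = 3000
BYTES_PER_FEATURE = 64

single = dictionary_bytes(NUM_CHARACTERS, BYTES_PER_FEATURE)
four_way = dictionary_bytes(NUM_CHARACTERS, BYTES_PER_FEATURE, num_angles=4)
one_degree_steps = dictionary_bytes(NUM_CHARACTERS, BYTES_PER_FEATURE, num_angles=360)
```

Under these assumptions the storage scales linearly with the number of supported angles, so that covering arbitrary skew in one-degree steps would require 360 times the capacity of the single upright dictionary.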
On the other hand, it is also conceivable to rotate the scanned character back to the predefined regular position before extracting the feature. In other words, when first image data is obtained by scanning the character which is not in the regular position, the first image data is subjected to a rotation process so as to obtain second image data which is representative of the character oriented in the regular position. Then, the feature of the second image data is extracted and character recognition is performed by use of the dictionary which contains the features of the characters in the predefined regular position. In this case, it is unnecessary to provide a plurality of dictionaries for the same characters. However, the rotation process needed to convert the first image data into the second image data is very complex, and there is a further problem in that the speed of character recognition is significantly reduced because the rotation process takes a considerable time.
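The rotation process referred to above may be sketched as follows, assuming nearest-neighbor resampling about the image center. The function `rotate_image`, its sign convention and the 4x4 test glyphs are hypothetical and serve only to show that every pixel of the second image data requires its own coordinate transformation.

```python
import math


def rotate_image(image, angle_deg):
    """Rotate a binary image counterclockwise by angle_deg about its center.

    Each output pixel is mapped back to its source pixel (inverse mapping)
    and filled by nearest-neighbor lookup; pixels that fall outside the
    source image are left white (0).
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Image rows grow downward, hence this sign pattern.
            sx = round(cos_t * dx - sin_t * dy + cx)
            sy = round(sin_t * dx + cos_t * dy + cy)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out


# First image data: a character scanned 90 degrees counterclockwise
# from the regular position (illustrative glyph).
ROTATED_L = [
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
]
# The same character in the regular position.
UPRIGHT_L = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

# Second image data: the character rotated back to the regular position.
second_image = rotate_image(ROTATED_L, -90)
```

In this sketch the work is proportional to the number of pixels in the image, with several multiplications and a rounding step per pixel, which is consistent with the observation that the rotation process takes a considerable time.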