1. Field of the Invention
The present invention relates to a character recognition apparatus for recognizing character categories of a number of input characters written by the same writer, and more specifically to a character recognition apparatus for recognizing a handwritten character with high precision without collecting character samples.
2. Description of the Related Art
There has been a conventional character recognition apparatus for optically reading a handwritten document using an image scanner, generating image data, and recognizing a handwritten character from the image data. Recently, an increasing number of handwritten character recognition apparatuses, that is, optical character readers (OCRs), have been demanded as peripherals for inputting handwritten characters. A considerably high character recognition ratio is required to put the handwritten character recognition apparatus into practical use.
A character recognition apparatus for recognizing a handwritten character requires a configuration with which the character recognition ratio can be enhanced by taking into account the unique features of characters handwritten by each writer.
A character recognition apparatus recognizes the character category of an input character by matching the feature of the input character against the features of the character categories entered in a dictionary.
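The matching step described above can be illustrated with a minimal sketch. It assumes that each feature is a short numeric vector and that matching means choosing the category whose dictionary entry is nearest in Euclidean distance; the text does not specify the feature type or the distance measure. The names `DICTIONARY` and `recognize` and the feature values are hypothetical.

```python
import math

# Hypothetical dictionary: one reference feature vector per character category.
# Real features would be extracted from character images; these are made up.
DICTIONARY = {
    "0": [0.9, 0.1, 0.9],
    "1": [0.1, 0.9, 0.1],
    "7": [0.8, 0.7, 0.1],
}


def recognize(feature, dictionary=DICTIONARY):
    """Return the category whose reference feature is closest to `feature`.

    Assumption: Euclidean distance stands in for whatever matching
    measure an actual apparatus would use.
    """
    return min(dictionary, key=lambda cat: math.dist(feature, dictionary[cat]))


# An input whose feature lies nearest the entry for "1".
category = recognize([0.2, 0.8, 0.2])
```

Because the dictionary entries are fixed and general-purpose, an input written in an unusual personal style can fall nearer to the wrong entry, which is the weakness discussed next.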
However, since the character dictionary is prepared for a general purpose, it is difficult to completely cover unique personal variations of handwritten characters, thereby lowering the character recognition ratio.
Conventionally, the correct recognition ratio has been enhanced in two ways when handwritten characters are recognized: by generating the feature of each character category entered in the dictionary from character samples collected in advance for each writer, and by amending, for each writer and based on the collected samples, the feature of an input character when that feature is extracted.
However, if a dictionary is prepared and amended for each writer as in the conventional method, there arises the problem that character samples must be collected before a recognizing process can be performed.
Another problem is that, because the features of each writer's handwritten characters change little by little over time, the character samples must be re-collected at predetermined time intervals so that the dictionary can be re-prepared and the amendment parameters re-generated to maintain a high character recognition ratio.
Furthermore, when there are a number of writers whose handwritten characters are recognized, the dictionary should be prepared for each writer or amendment parameters should be appropriately managed, thereby requiring an excessively large memory capacity.
The present invention aims at providing a character recognition apparatus capable of amending a wrong character-recognition result to a correct character-recognition result.
According to a feature of the present invention, the character category to which each input character having a predetermined state belongs is discriminated.
Based on the similarity between input characters, clusters each containing input characters as elements are generated. If there are clusters that differ in character category and the distance between the clusters is short, then the character category of the cluster containing the smaller number of elements is amended to the character category of the cluster containing the larger number of elements.
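The clustering and amendment described above can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: features are numeric vectors, clusters are formed greedily within each recognized category, inter-cluster distance is the Euclidean distance between centroids, and "short" means at most a threshold `merge_dist`. The function names and both thresholds are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def build_clusters(samples, radius):
    """Greedily cluster (feature, category) pairs within each category.

    A sample joins the first cluster of its own category whose centroid
    is within `radius`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"category": str, "members": [sample indices]}
    for idx, (feat, cat) in enumerate(samples):
        for c in clusters:
            center = centroid([samples[i][0] for i in c["members"]])
            if c["category"] == cat and math.dist(feat, center) <= radius:
                c["members"].append(idx)
                break
        else:
            clusters.append({"category": cat, "members": [idx]})
    return clusters


def amend_categories(samples, radius, merge_dist):
    """Relabel each small cluster with the category of the nearest larger
    cluster of a different category, if that cluster is within `merge_dist`."""
    clusters = build_clusters(samples, radius)
    centers = [centroid([samples[i][0] for i in c["members"]]) for c in clusters]
    labels = [cat for _, cat in samples]
    for a, small in enumerate(clusters):
        best = None  # (distance, category) of the nearest qualifying cluster
        for b, large in enumerate(clusters):
            if large["category"] == small["category"]:
                continue
            if len(large["members"]) <= len(small["members"]):
                continue
            d = math.dist(centers[a], centers[b])
            if d <= merge_dist and (best is None or d < best[0]):
                best = (d, large["category"])
        if best is not None:
            for i in small["members"]:
                labels[i] = best[1]
    return labels


# Made-up data: three inputs recognized as "2", one nearby input
# recognized as "Z", and two distant inputs recognized as "7".
samples = [
    ([1.0, 1.0], "2"), ([1.1, 0.9], "2"), ([0.9, 1.1], "2"),
    ([1.05, 1.0], "Z"),
    ([5.0, 5.0], "7"), ([5.1, 4.9], "7"),
]
amended = amend_categories(samples, radius=0.5, merge_dist=0.5)
```

In this toy data the single input recognized as "Z" lies close to the larger "2" cluster, so its category is amended to "2", while the distant "7" cluster is left unchanged.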
A predetermined state of an input character refers to, for example, a character handwritten by the same writer as the other input characters, an obscure character, or a deformed character.
Thus, a character recognizing process can be correctly performed on input characters in various states without preparing a recognition dictionary for each writer or for each character quality.