With the recent widespread use of camera-equipped mobile terminals (hereinafter, “camera phones”), there are growing demands for recognizing or translating characters in an image taken by a camera, or for retrieving information based on the result of character recognition.
To meet such demands, a camera phone generally incorporates an optical character reader (OCR).
Meanwhile, portability is important for a mobile terminal equipped with the OCR. Unlike a fixed terminal such as a personal computer (PC), the mobile terminal must therefore be downsized, which in turn requires downsizing the printed circuit board that carries the memory and the central processing unit (CPU). Accordingly, the hardware performance available to the OCR is limited.
Because of this limitation, a simple character recognition system is used in the OCR for the mobile terminal. In a typical character recognition system, an average vector of each character is stored in a character recognition dictionary, and a distance is calculated between a feature vector of a character input as a recognition target and the average vector of each character stored in the dictionary. The character whose average vector has the smallest distance from the feature vector of the input character is then regarded as the recognition result (for example, see Japanese Laid-open Patent Publication No. 05-46812).
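The average-vector scheme described above can be sketched as follows. This is only a minimal illustration: the Euclidean distance, the three-dimensional feature vectors, and the names `average_vector`, `euclidean_distance`, and `recognize` are assumptions introduced here and are not taken from the cited publication.

```python
import math


def average_vector(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def euclidean_distance(a, b):
    """Euclidean distance between two vectors (an assumed choice of distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(feature, dictionary):
    """Return the character whose stored average vector is nearest to feature."""
    return min(dictionary, key=lambda ch: euclidean_distance(feature, dictionary[ch]))


# Hypothetical 3-dimensional training features for two characters.
training = {
    "A": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "B": [[0.0, 1.0, 1.0], [0.2, 0.8, 1.0]],
}

# The dictionary stores only one average vector per character.
dictionary = {ch: average_vector(vectors) for ch, vectors in training.items()}

result = recognize([0.9, 0.1, 0.1], dictionary)
```

Because the dictionary holds a single vector per character, it stays small, but nothing about the spread of each character's samples is retained.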
However, the character recognition system of Japanese Laid-open Patent Publication No. 05-46812 naturally has limited character recognition accuracy.
That is, the font of a character input as a recognition target is not always one of a fixed set of fonts. While fonts learned in advance can be recognized with a certain accuracy, satisfactory character recognition accuracy cannot be achieved when a character in a font that has not been learned is input.
It is also possible to perform character recognition using an eigenvalue and an eigenvector defined by a covariance matrix, in addition to the average vector of the character, to realize high-accuracy character recognition. In this case, a character recognition dictionary that stores therein an eigenvalue and an eigenvector of each character is required. The size of the dictionary becomes very large, and therefore the dictionary is difficult to install in the mobile terminal.
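One way such a dictionary entry might be used is sketched below. The text does not name a specific discriminant function, so the squared Mahalanobis distance is assumed here purely for illustration; the entry layout and the name `mahalanobis_sq` are likewise hypothetical.

```python
def mahalanobis_sq(feature, entry):
    """Squared Mahalanobis distance computed from a stored eigen-decomposition.

    entry holds the average vector, the eigenvalues of the covariance matrix,
    and the matching unit-length eigenvectors (one list per eigenvalue).
    """
    diff = [x - m for x, m in zip(feature, entry["mean"])]
    total = 0.0
    for eigenvalue, eigenvector in zip(entry["eigenvalues"], entry["eigenvectors"]):
        # Project the deviation onto the eigenvector, then scale by the variance.
        projection = sum(d * v for d, v in zip(diff, eigenvector))
        total += projection ** 2 / eigenvalue
    return total


# Hypothetical 2-dimensional entry: variance 4 along x, variance 1 along y.
entry = {
    "mean": [0.0, 0.0],
    "eigenvalues": [4.0, 1.0],
    "eigenvectors": [[1.0, 0.0], [0.0, 1.0]],
}

along_x = mahalanobis_sq([2.0, 0.0], entry)
along_y = mahalanobis_sq([0.0, 2.0], entry)
```

The two inputs lie at the same Euclidean distance from the average vector, yet the one along the high-variance axis scores as closer. That extra discrimination is paid for by storing the eigenvalues and eigenvectors for every character.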
Particularly, when the characters are Kanji (Chinese characters), the total number of characters to be registered in the dictionary is about 4,000, and thus it is impractical to register eigenvalues and eigenvectors of so many characters in the dictionary of the OCR for the mobile terminal.
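A rough storage estimate shows the scale of the problem. Only the figure of about 4,000 characters comes from the text; the 256-dimensional feature vector, the 4-byte values, and the retention of every eigenvector are assumptions made for this sketch, as is the helper name `dictionary_bytes`.

```python
def dictionary_bytes(num_chars, dim, num_eigen, bytes_per_value=4):
    """Estimated dictionary size in bytes.

    Each character stores one average vector (dim values), num_eigen
    eigenvalues, and num_eigen eigenvectors of dim values each.
    """
    values_per_char = dim + num_eigen + num_eigen * dim
    return num_chars * values_per_char * bytes_per_value


NUM_CHARS = 4000  # approximate number of Kanji, as stated in the text
DIM = 256         # hypothetical feature dimension

mean_only = dictionary_bytes(NUM_CHARS, DIM, 0)      # average vectors only
with_eigen = dictionary_bytes(NUM_CHARS, DIM, DIM)   # plus all eigen data

print(mean_only)                # 4096000 bytes, about 4 MB
print(with_eigen)               # 1056768000 bytes, about 1 GB
print(with_eigen // mean_only)  # 258
```

Under these assumptions the dictionary grows by a factor of DIM + 2, from a few megabytes to roughly a gigabyte.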
For this reason, when the character recognizing apparatus is incorporated in a mobile terminal, how to reduce the size of the dictionary while achieving a high-accuracy character recognition capability becomes an issue. This issue is not specific to character recognition: it arises equally in other pattern recognition that uses a probability distribution of each category (for example, facial image recognition).