Camera-based pattern recognition has received considerable attention because of its wide variety of possible applications. One compelling application is a “translation camera,” which is a translating device integrated with a camera and a character recognition apparatus (see Non-Patent Documents 1 and 2). Another is to recognize characters captured by a camera and convert them into speech so that they can be read aloud to visually impaired people. A further possibility is to recognize all patterns captured by a camera and to present to the user only the information, among those patterns, that has been registered beforehand and that the user requires. This application is also useful to visually impaired persons, some of whom have difficulty finding characters in the first place. Such an application, which can be described as “machine vision,” is therefore extremely useful.
In order to achieve the applications above, a practical camera-based character recognition technique is required that is (1) ready for real-time processing, (2) robust to geometric distortion, and (3) free from layout constraints.
Firstly, real-time processing is indispensable so as not to impair the user's convenience. As for geometric distortion, known techniques address it when the subject is limited to characters (for example, see Non-Patent Documents 3 and 4). In particular, the technique in Non-Patent Document 4 has been reported to operate in real time. In these techniques, text lines are extracted from an image captured with a camera, affine distortion is corrected as an approximation of projective distortion (the distortion having the highest degree of freedom), and finally the extracted characters are recognized. However, in the technique in Non-Patent Document 4, for example, the projective distortion is corrected on a text-line basis, so a character that does not form part of a text line cannot be recognized. This technique also cannot cope with rotated characters. Therefore, the subject illustrated in FIG. 1 cannot be recognized, which means the technique does not satisfy requirement (3); that is, it cannot recognize patterns in the various layouts described above.
On the other hand, as techniques that satisfy requirements (2) and (3) described above, Kusachi et al. and Li et al. have each proposed a technique of recognizing characters one by one (for example, see Non-Patent Documents 5 and 6). Because the techniques described in Non-Patent Documents 5 and 6 recognize characters individually, the problem associated with text lines does not arise; however, the processing takes a long time, so these techniques cannot be said to achieve the real-time processing of requirement (1). A technique that satisfies requirements (1) to (3) simultaneously has therefore been desired.