The present invention generally relates to a method and an apparatus for recognizing characters, and more particularly to a method and an apparatus for recognizing characters that make it easy for users to appraise recognition results and to revise them.
In a conventional character recognition system, recognition results are obtained by a pattern matching process. In the pattern matching process, a character pattern input from an input unit is compared with each reference pattern stored in a dictionary. A reference pattern having a low degree of difference from the input character pattern, or a high degree of similarity to it, is then output as the recognition result for the input character pattern. In practice, however, no character recognition system can guarantee that every recognition result is correct. Thus, it is necessary for a user to appraise the recognition results and to revise them where they are wrong.
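The pattern matching process described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each pattern is a flattened binary bitmap, that the degree of difference is the number of differing pixels (Hamming distance), and that the reference with the lowest degree of difference is output. The names `degree_of_difference` and `recognize` are hypothetical and do not come from any cited system.

```python
from typing import Dict, List, Tuple

Pattern = List[int]  # flattened binary bitmap (assumed representation)


def degree_of_difference(a: Pattern, b: Pattern) -> int:
    """Count differing pixels between two patterns (Hamming distance)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same size")
    return sum(x != y for x, y in zip(a, b))


def recognize(input_pattern: Pattern,
              dictionary: Dict[str, Pattern]) -> Tuple[str, int]:
    """Compare the input with every reference pattern in the dictionary
    and return the label with the lowest degree of difference."""
    if not dictionary:
        raise ValueError("dictionary is empty")
    label, reference = min(
        dictionary.items(),
        key=lambda item: degree_of_difference(input_pattern, item[1]),
    )
    return label, degree_of_difference(input_pattern, reference)


# Illustrative 3x3 reference patterns (made up for this sketch).
dictionary = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}

# An "I" with one noisy pixel in the bottom-right corner.
noisy_input = [0, 1, 0,
               0, 1, 0,
               0, 1, 1]

result = recognize(noisy_input, dictionary)
```

Even in this toy case the returned degree of difference is not zero, which is why the result still has to be appraised by the user.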
Conventionally, an optical character recognition apparatus has been proposed, for example, in Japanese Patent Publication No. 61-6430, in which each recognition result is displayed on a display unit and is colored in accordance with its degree of similarity to the input character pattern. With this apparatus, the user can appraise each recognition result based on the color given to it on the display.
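The idea of coloring a displayed result according to its degree of similarity can be sketched as follows. The similarity scale (0.0 to 1.0), the two thresholds, and the color names are assumptions made for illustration only; the cited publication's actual color scheme is not described here, and `color_for_similarity` is a hypothetical name.

```python
def color_for_similarity(similarity: float,
                         high: float = 0.9,
                         low: float = 0.7) -> str:
    """Map a degree of similarity in [0.0, 1.0] to a display color.

    The thresholds and colors are illustrative assumptions.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0.0 and 1.0")
    if similarity >= high:
        return "black"   # likely correct, no special attention needed
    if similarity >= low:
        return "yellow"  # doubtful, worth a second look
    return "red"         # probably wrong, should be checked


# Hypothetical recognition results: (character, degree of similarity).
results = [("A", 0.95), ("8", 0.80), ("O", 0.50)]
colored = [(char, color_for_similarity(sim)) for char, sim in results]
```

A display routine would then draw each character in its assigned color, so that doubtful results stand out to the user.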
Further, the following technique has also been proposed.
In a process for recognizing characters, pattern data and feature data for each input character are stored in a memory. Then, when a recognition result obtained by the process is revised, a learning process for the dictionary used for recognizing characters is carried out based on the pattern data and/or the feature data stored in the memory. That is, the contents of the dictionary are updated based on the pattern data and/or the feature data for each input character.
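A minimal sketch of such a learning process is given below. It assumes that the feature data is a fixed-length numeric vector and that the dictionary entry for the corrected character is updated as a running mean of the samples seen so far. Both the update rule and the names `Reference` and `learn` are assumptions for illustration, not the method of the technique described above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reference:
    """Dictionary entry: mean feature vector and number of samples."""
    features: List[float]
    count: int = 1


def learn(dictionary: Dict[str, Reference],
          stored_features: List[float],
          corrected_label: str) -> None:
    """Update the dictionary after the user revises a recognition result.

    stored_features is the feature data kept in memory for the input
    character; corrected_label is the character the user selected.
    """
    ref = dictionary.get(corrected_label)
    if ref is None:
        # Character not yet in the dictionary: register it as a new entry.
        dictionary[corrected_label] = Reference(list(stored_features))
        return
    if len(ref.features) != len(stored_features):
        raise ValueError("feature vectors must have the same length")
    # Incremental running-mean update (assumed learning rule).
    ref.count += 1
    ref.features = [
        old + (new - old) / ref.count
        for old, new in zip(ref.features, stored_features)
    ]


dictionary = {"O": Reference([1.0, 0.0])}
learn(dictionary, [0.0, 1.0], "O")  # user corrects a result to "O"
learn(dictionary, [2.0, 2.0], "Q")  # user corrects a result to a new "Q"
```

After these two corrections the entry for "O" has moved toward the newly observed features, and "Q" has been added as a new entry.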
Recently, the number of types and styles of characters which can appear on a document has increased, and the number of cases where characters on a low quality copy must be recognized has also increased. Thus, it is difficult to increase the recognition rate only by improving the pattern matching process described above. As a result, post processes to be carried out after the pattern matching process have been studied, so that the number of processes which must be carried out before the recognition result is obtained has increased, and the processing for obtaining the recognition result has become complex. The process for recognizing characters includes, for example, a getting out process (character segmentation) for extracting the image of one character from a string of characters, the pattern matching process, a language process for analyzing the input character pattern based on language rules, and the like.
In a conventional character recognition apparatus in which these processes are combined in such a complicated manner, it is difficult to accurately appraise the recognition result based only on the degree of similarity obtained by the pattern matching process.