(1) Field of the Invention
The present invention generally relates to a method for classifying characters in a dictionary, and more particularly to a method for classifying characters each of which is described by structural features in the dictionary.
(2) Description of Related Art
There are two types of character recognition methods. In the first type, the class to which an input character belongs is determined based on the distance between a measurement vector, which is the feature vector of the input character, and a pattern class vector, which is the feature vector of a standard pattern in a dictionary. In the second type, the character is recognized based on its structural features, an approach that may accord with human intuition. The first type has the advantage that the dictionary can be made automatically based on statistical decision theory and neural network theory. However, it has the disadvantages that the dimension of the feature vector is generally large and that the dictionary is not always made in a way that accords with human intuition, so that it is difficult to analyze recognition errors and the reasons for a rejection. The second type uses structural feature information regarding character shape and, in this respect, accords with human intuition. It has the advantages that a character can be recognized in a manner similar to that of human beings and that it is easy to analyze both recognition errors and the reasons for a rejection. However, in the second type it is difficult to make the dictionary automatically.
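As an illustration of the first type of method described above, the following is a minimal sketch of distance-based classification. It assumes a Euclidean distance and a dictionary holding one pattern class vector per class; the function name `classify_by_distance`, the toy three-dimensional vectors, and the class labels are hypothetical and are not taken from any particular recognition system.

```python
import math


def classify_by_distance(measurement, dictionary):
    """Return (label, distance) of the pattern class vector nearest to the measurement vector.

    Euclidean distance is an assumption; the text only says "a distance".
    """
    best_label, best_dist = None, math.inf
    for label, pattern in dictionary.items():
        dist = math.dist(measurement, pattern)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical dictionary: one standard-pattern feature vector per class.
toy_dictionary = {
    "A": (0.9, 0.1, 0.4),
    "B": (0.2, 0.8, 0.5),
}

# Hypothetical measurement vector extracted from an input character.
label, dist = classify_by_distance((0.8, 0.2, 0.5), toy_dictionary)
print(label, round(dist, 4))
```

The result is only a label and a number, which gives little indication of which part of the character shape caused a misrecognition or a rejection. This may be one way to see why errors are hard to analyze in this type of method.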
Conventionally, to eliminate the disadvantage of the second type of character recognition method, a method for automatically making a structural dictionary has been proposed, for example, in Yuzo Kato, "Method for automatically making dictionary in stroke structural solution" (Technical report of The Institute of Electronics and Communication Engineers of Japan). In this conventional method, a chain code is used as the unit for describing a feature; hence it is difficult to recognize a character that differs somewhat from the standard pattern in local parts, and the resulting dictionary is complex. In addition, because the degree of similarity between characters and features of characters is calculated through heuristic functions, the result of the calculation includes both structural and measurement information. Thus, when the dictionary is to be updated with additional training data, it must be remade from the beginning. Further, because people can draw the strokes of a character in various writing orders, it is difficult to describe the features of one character by a single class in the dictionary.
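The sensitivity of a chain code description to local variation can be sketched as follows. The example assumes the common 8-direction (Freeman) chain code on a pixel grid, with direction 0 pointing east and directions numbered counter-clockwise with the y axis pointing up; the text does not specify which chain code the conventional method uses, so this convention, the function name `chain_code`, and the two toy strokes are assumptions made for illustration only.

```python
# Assumed convention: 0 = east, numbered counter-clockwise, y axis pointing up.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(points):
    """Encode a stroke, given as 8-connected grid points, as a list of direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points are not 8-connected neighbours: {step}")
        codes.append(DIRECTIONS[step])
    return codes


# A straight horizontal stroke and the same stroke with a one-pixel bump.
straight = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
wobbly = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0)]

straight_code = chain_code(straight)
wobbly_code = chain_code(wobbly)
print(straight_code)  # [0, 0, 0, 0]
print(wobbly_code)    # [0, 1, 7, 0]
```

A single displaced pixel changes half of the codes in this short stroke, so a dictionary entry written at this level of detail either fails to match the variant or needs additional entries to cover it.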