This invention relates to a method of and system for analyzing characters in connection with an optical character reader or the like adapted to extract characters from character image data and to convert them into character code data.
In character recognition, whereby characters are extracted from character image data, a so-called pattern matching method is frequently used, in which a candidate character is determined by calculating the distances between an input character pattern and dictionary patterns. According to one method of extracting a character, line image data are extracted from the character image data in an image memory and stored in a line image buffer memory, and if a vacant region is detected in the direction perpendicular to the direction of the line, this vacant region is taken to be the position of a boundary between two characters. With this method, however, if a character is not correctly extracted from an input character array, the extracted pattern does not correspond to any pattern registered in the dictionary and hence cannot be correctly analyzed. For example, some characters, such as Chinese characters (kanji) having a left-hand radical and a right-hand radical, are not continuous and can be separated into parts in the horizontal direction. Some other characters are similarly separable in the vertical direction. If the boundaries between neighboring characters are recognized by detecting a vacant region as described above, the left-hand and right-hand radicals of a single character may be extracted as two separate characters.
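The vacant-region method described above may be illustrated by the following minimal sketch, which assumes a horizontally written line stored as a binary image (1 for ink, 0 for background). The function name `split_at_vacant_columns`, the image representation, and the sample data are hypothetical and are given for illustration only; they are not taken from any particular character reader.

```python
from typing import List, Tuple

# Assumed representation: a line image is a list of rows,
# each row a list of 0 (background) / 1 (ink) pixels.
Image = List[List[int]]


def split_at_vacant_columns(line_image: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges of ink runs, end exclusive.

    Every column containing no ink is treated as a boundary between
    characters, as in the conventional method described in the text.
    """
    if not line_image:
        return []
    width = len(line_image[0])
    # A column is vacant when no row has ink in it.
    has_ink = [any(row[x] for row in line_image) for x in range(width)]

    segments: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                    # a run of ink columns begins
        elif not ink and start is not None:
            segments.append((start, x))  # a vacant column ends the run
            start = None
    if start is not None:
        segments.append((start, width))  # run reaches the right edge
    return segments


def parse(rows: List[str]) -> Image:
    """Convert rows of '1' / '.' characters into a binary image."""
    return [[1 if ch == "1" else 0 for ch in row] for row in rows]


# Hypothetical line holding two characters: a connected one (columns 0-3)
# and one made of two separate parts (columns 6-7 and 9-10).
line = parse([
    "1111..11.11.",
    "1..1..1..11.",
    "1111..11.1..",
])

segments = split_at_vacant_columns(line)
print(segments)
```

In this hypothetical line the second character consists of two parts separated by a single vacant column, so the sketch reports three segments for two characters. This is the over-segmentation that arises with characters composed of a left-hand and a right-hand radical.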