1. Field of the Invention
The invention relates to the on-line deciphering of handwritten Chinese characters based on character shapes and, more particularly, to an on-line handwritten Chinese character recognition apparatus which refers to a character compressed code-sequence code reference file of an input method and compares character shapes by using constituted radicals for recognition.
2. Description of the Related Art
Conventional character recognition methods generally employ a template matching method, where the outline of an unknown input character is compared one at a time with previously stored character pen ink templates. The recognition result is the compared template with the greatest similarity and the least difference. This technique requires the storage of a large number of character pen ink templates in order to achieve a better recognition effect. A majority of character recognition methods use classification methods or other comparison methods in combination with the template technique with the aim of reducing the time that is spent when comparing a large number of character pen ink templates. However, there is still a need to store a large number of character pen ink templates.
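By way of illustration only, the template matching described above may be sketched as follows. The sketch assumes that every outline has already been resampled to the same number of points and that the difference measure is the mean point-to-point distance; the function names `template_distance` and `match_template` and the sample data are hypothetical and do not come from any particular prior-art system.

```python
from math import hypot


def template_distance(a, b):
    """Mean point-to-point distance between two equal-length outlines."""
    if len(a) != len(b):
        raise ValueError("outlines must be resampled to the same length")
    total = sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def match_template(unknown, templates):
    """Compare the unknown outline with each stored template, one at a time,
    and return the (label, distance) pair with the least difference."""
    scored = ((label, template_distance(unknown, pts))
              for label, pts in templates.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical stored templates: one horizontal and one vertical stroke.
templates = {
    "horizontal": [(0.0, 0.5), (0.5, 0.5), (1.0, 0.5)],
    "vertical": [(0.5, 0.0), (0.5, 0.5), (0.5, 1.0)],
}
unknown = [(0.0, 0.45), (0.5, 0.5), (1.0, 0.55)]
best_label, best_distance = match_template(unknown, templates)
```

Because every stored template is visited for every input, both the storage and the matching time grow with the number of templates, which is the cost noted above.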
R.O.C. Patent Publication No. 311201, entitled "Handwritten Chinese Character Recognition System Based On Front and Rear Radicals", employs front radicals and rear radicals to classify Chinese characters into three major classes, i.e., front radical plus rear remainder part of character, rear radical plus front remainder part of character, and single form character. A Chinese character recognition system is then established by using this classification method. FIG. 13 illustrates a block diagram of a model board establishing portion of the system, comprising:
a Chinese character ink database 10 for storing 80 groups of Chinese character pen ink data, each group having 5401 characters written by different people;
a Chinese character ink classifier 11 for grouping Chinese characters based on the classification method to determine whether they are front radical characters, rear radical characters, or single form characters;
a radical separator 12 for separating front radical characters into a front radical portion and a rear remainder part of character, or for separating rear radical characters into a rear radical portion and a front remainder part of character;
a single form character model board generator 13 for extracting feature points of the single form characters and for storing sequentially these features in a single form character model board 16 according to the number of strokes of the single form characters;
a radical model board generator 14 for extracting features of the front radicals and the rear radicals, and for storing sequentially these features in a front radical model board 17 and in a rear radical model board 18 according to the number of strokes of the radicals; and
a remainder character model board generator 15 for extracting features of the front remainder parts of characters and the rear remainder parts of characters, and for storing sequentially these features in a front remainder character model board 19 and in a rear remainder character model board 20 according to the number of strokes of the remainder part of character.
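As a minimal sketch of how a model board ordered by number of strokes might be organized, the following groups feature entries by stroke count. The tuple layout, the function name `build_model_board`, and the sample labels and feature values are assumptions made for illustration; the cited publication does not specify a storage format.

```python
from collections import defaultdict


def build_model_board(samples):
    """Group (label, stroke_count, features) entries by stroke count and
    return them ordered by ascending stroke count."""
    board = defaultdict(list)
    for label, stroke_count, features in samples:
        board[stroke_count].append((label, features))
    return dict(sorted(board.items()))


# Hypothetical front radical samples: (label, number of strokes, features).
samples = [
    ("radical_A", 3, [0.1, 0.9]),
    ("radical_B", 2, [0.4, 0.2]),
    ("radical_C", 3, [0.7, 0.3]),
]
front_radical_board = build_model_board(samples)
```

Keying the board by stroke count lets a later stage restrict comparison to entries whose stroke count is compatible with the input.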
FIG. 14 illustrates a block diagram of the conventional recognition system. The recognition system comprises:
a pre-processor 21 for performing noise removal, smoothing, shift correction, rotary translation correction, size normalization, and extraction of the desired features on the original pen inks;
a filter 22 for selectively filtering the possible model boards using partial features of the input pen inks;
a front radical comparator 23 for separating a possible front radical portion from the input pen inks, for making a detailed comparison with the sifted front radical model board, for calculating the degree of similarity therewith, and for recording the ten front radicals with the highest degree of similarity;
a rear radical comparator 24 for separating a possible rear radical portion from the input pen inks, for making a detailed comparison with the sifted rear radical model board, for calculating the degree of similarity therewith, and for recording the ten rear radicals with the highest degree of similarity;
a rear remainder character comparator 25 for comparing the current input strokes with those entries in the rear remainder character model board whose corresponding front radicals are among the ten front radicals recorded by the front radical comparator 23, and for combining the degree of similarity obtained by the front radical comparator 23 with that obtained in the current stage to obtain the degree of similarity of the current input character;
a front remainder character comparator 26 for comparing the current input strokes with those entries in the front remainder character model board whose corresponding rear radicals are among the ten rear radicals recorded by the rear radical comparator 24, and for combining the degree of similarity obtained by the rear radical comparator 24 with that obtained in the current stage to obtain the degree of similarity of the current input character;
a single form character comparator 27 for calculating the degree of similarity between the sifted single form character model board and the input pen inks; and
a winner decider 28 for arranging the degrees of similarity after comparison and for retaining the top ten characters with the greatest degree of similarity as the recognition result.
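The flow from the radical comparators through the winner decider may be sketched as below. The cited publication does not state how the two degrees of similarity are combined, so the simple mean used here is an assumption, as are the names `combine_similarity` and `decide_winners` and all sample scores.

```python
import heapq


def combine_similarity(radical_sim, remainder_sim):
    """Assumption: combine the two degrees of similarity by a simple mean."""
    return (radical_sim + remainder_sim) / 2.0


def decide_winners(scored, keep=10):
    """Retain the `keep` characters with the greatest degree of similarity."""
    return heapq.nlargest(keep, scored, key=lambda item: item[1])


# Hypothetical output of the front radical comparator: radical -> similarity.
front_radical_hits = {"R1": 0.9, "R2": 0.6}

# Hypothetical rear remainder results: (character, front radical, similarity).
remainder_scores = [
    ("char_a", "R1", 0.8),
    ("char_b", "R1", 0.5),
    ("char_c", "R2", 0.9),
    ("char_d", "R3", 1.0),  # skipped: R3 was not recorded by the comparator
]

# Only characters whose front radical was recorded are scored.
scored = [
    (char, combine_similarity(front_radical_hits[radical], sim))
    for char, radical, sim in remainder_scores
    if radical in front_radical_hits
]
winners = decide_winners(scored, keep=2)
```

In this sketch a character whose front radical was not recorded is never compared, which mirrors how the remainder comparators limit their search to the recorded radicals.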
Some of the drawbacks of the invention disclosed in R.O.C. Patent Publication No. 311201, entitled "Handwritten Chinese Character Recognition System Based On Front and Rear Radicals", are as follows:
1. Many sets of Chinese character pen ink data (each set including 5401 characters that serve as recognition targets) are needed during the model board establishing stage and the character recognition stage.
2. A large model board (templates) must be pre-established, thereby requiring a large amount of time.
In view of the fact that the aforementioned template matching requires the storage of a large number of character pen ink templates, which wastes storage space and template matching time, the object of the present invention is to provide an on-line handwriting recognition apparatus based on character shapes that reduces the storage space of character ink templates and the matching time.
In order to overcome the aforementioned drawbacks, the present invention provides an on-line handwritten Chinese character recognition apparatus having a buffer region for temporary storage of data and an output portion, characterized by comprising:
a radical template feature memorizing portion including shape features of basic radicals (i.e., base radicals) or related radicals (i.e., derived radicals) defined by an input method based on dismantling by character shapes;
an input method reference portion founded on a conventional input method based on dismantling by character shapes, and including an input method system data file for character compressed code and sequence code look-up information;
an exception character description portion for recording features of exception characters to aid a post-processing portion in deciding a final recognition result from among candidate characters;
an input portion including a digitizing tablet and a pen of conventional on-line character handwriting equipment;
a pre-processing portion for normalization and line thinning processing of an input handwritten character, and for extracting features needed in character recognition for storage in the buffer region;
a character shape dismantling portion for dismantling the handwritten character based on the features extracted by the pre-processing portion and with reference to the radical template feature memorizing portion so as to find the radicals that can form the handwritten character;
a comparator portion for comparing the constituted radicals found by the character shape dismantling portion with contents of the input method reference portion to obtain candidate characters that have difference values below a threshold value; and
the post-processing portion for deciding the final recognition result from among the candidate characters based on other features of the handwritten character and with reference to contents of the exception character description portion, and for transferring the final recognition result to the output portion for output.
From the foregoing construction, the on-line handwritten Chinese character recognition apparatus of this invention dismantles handwritten characters into the constituted radicals via character shape dismantling means so that a character compressed code and sequence code look-up table of a conventional input method based on dismantling by character shapes can be used directly to obtain the recognition result. The number of templates required for matching in the on-line handwritten Chinese character recognition system can be reduced, and the time for matching can be lowered.
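The look-up performed by the comparator portion may be illustrated with the following minimal sketch. The text above does not define how the difference value is computed, so the edit distance between radical sequences used here is an assumption; the code table contents, the threshold, and the names `edit_distance` and `find_candidates` are likewise hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of radical codes."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def find_candidates(radicals, code_table, threshold):
    """Return (character, difference) pairs whose difference value is below
    the threshold, ordered by ascending difference."""
    hits = [(ch, edit_distance(radicals, code)) for ch, code in code_table.items()]
    return sorted((h for h in hits if h[1] < threshold), key=lambda h: h[1])


# Hypothetical input method table: character -> radical code sequence.
code_table = {
    "char_1": ("A", "B"),
    "char_2": ("A", "C"),
    "char_3": ("D", "E", "F"),
}

# Radicals found by the character shape dismantling portion (hypothetical).
candidates = find_candidates(("A", "B"), code_table, threshold=2)
```

Only radical templates and the input method's existing code table are consulted in this sketch; no per-character pen ink template is stored, which is the source of the savings described above.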