The present invention relates to a method and apparatus for recognizing and displaying written characters. More specifically, this invention is directed to a stroke-based method for written character identification, which decomposes a character into strokes and then uses the strokes as major features for hierarchical classification.
Optical character recognition systems are well known in the prior art. A variety of approaches for solving the particular problem of written character recognition have been proposed and experimented with; see, e.g., Ahmed, P. and Suen, C. Y., "Computer Recognition of Totally Unconstrained Written ZIP Codes", Int. J. Pattern Recognition and Artificial Intelligence 1, 1 (1987), 1-15; Ali, F. and Pavlidis, T., "Syntactic Recognition of Handwritten Numerals", IEEE Trans. Syst. Man Cyber. SMC-7 (1977), 537-541; Duerr, B., Haettich, W., Tropf, H. and Winkler, G., "A Combination of Statistical and Syntactical Pattern Recognition Applied to Classification of Unconstrained Handwritten Numerals", Pattern Recognition 12 (1980), 189-199; Huang, J. S. and Chuang, K., "Heuristic Approach to Handwritten Numeral Recognition", Pattern Recognition 19, 1 (1986), 15-19; and Lam, L. and Suen, C. Y., "Structural Classification and Relaxation Matching of Totally Unconstrained Handwritten ZIP Code Numbers", Pattern Recognition 21, 1 (1988), 19-31.
Depending on the algorithms used and the quality of the tested data, diverse recognition rates have been reported. Unfortunately, prior art methods which demonstrate high recognition rates often do so at the expense of substantial image processing overhead. Some methods require size normalization processing or time-consuming pixel-based thinning techniques. Other methods utilize relatively low resolution images, thereby necessitating smoothing techniques which require excessive computation time.
It is generally believed in the art that both structural and statistical approaches are necessary for constructing an integrated, reliable system to recognize totally unconstrained characters, see, e.g., Duerr, B. et al., supra. One such method, disclosed in U.S. Pat. No. 4,628,532 issued to Stone et al., is a structural syntactic pattern recognition technique comprising three major steps: (1) boundary tracing, wherein the periphery of a character or image of an object is traced to determine "move" vectors or chain codes which indicate directional changes between points on the periphery; (2) feature extraction, to determine pre-defined geometrical features on the boundary of any part of an image; and (3) classification. Although the Stone et al. patent claims high operational speed, the processing overhead of the method in fact precludes higher speeds from being reached without a loss in recognition rate. What is needed, then, is a method for written character recognition that utilizes a simple yet efficient feature extraction technique to lower processing overhead.
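By way of illustration only, the kind of output produced by the boundary tracing step described above ("move" vectors or chain codes indicating directional changes between points on the periphery) may be sketched with a generic Freeman 8-direction chain code. This sketch is not taken from the Stone et al. patent and does not represent its implementation. The direction numbering, the function names `chain_code` and `direction_changes`, and the sample boundary are assumptions introduced here for the example.

```python
# Generic Freeman 8-direction chain code (illustrative assumption, not the
# Stone et al. method). Index = code, value = (dx, dy) step.
# Assumed convention: 0 = east, codes increase counter-clockwise, y points up.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]
STEP_TO_CODE = {step: code for code, step in enumerate(DIRECTIONS)}


def chain_code(boundary):
    """Return the chain code of a closed boundary.

    `boundary` is an ordered list of (x, y) points in which each point is an
    8-neighbor of the next, and the last point is an 8-neighbor of the first.
    """
    codes = []
    n = len(boundary)
    for i in range(n):
        x0, y0 = boundary[i]
        x1, y1 = boundary[(i + 1) % n]  # wrap around to close the contour
        step = (x1 - x0, y1 - y0)
        if step not in STEP_TO_CODE:
            raise ValueError("consecutive boundary points must be 8-neighbors")
        codes.append(STEP_TO_CODE[step])
    return codes


def direction_changes(codes):
    """Return the change between successive codes, modulo 8 (0 = no turn)."""
    n = len(codes)
    return [(codes[(i + 1) % n] - codes[i]) % 8 for i in range(n)]


# Hypothetical sample: the periphery of a 2x2 square, traced counter-clockwise.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
codes = chain_code(square)        # [0, 0, 2, 2, 4, 4, 6, 6]
turns = direction_changes(codes)  # [0, 2, 0, 2, 0, 2, 0, 2]
```

In this sample, each nonzero entry of `turns` marks a corner of the square. A feature extraction step of the kind described above would then search such a sequence for pre-defined geometrical features.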