The present invention relates generally to character and pattern recognition machines and methods, and more particularly to feature extraction systems for use with optical readers for reading characters which have been hand printed without any constraints, such as surrounding box limits, red center lines, or similar artificial devices. One novel feature of this invention lies in the method of choosing the features and in the highly normalized method of measuring the individual feature parameters. The invention can be said to perform a crude simulation of a little-known psychological phenomenon occurring in primates called the "saccadic flick".
While there are different views on the definition of the features of patterns, many studies of character and pattern recognition have shown that the so-called quasi-topological features of a character or pattern, such as concavity, loop, and connectivity, are very important for recognition. To date, many different methods have been proposed for extracting such quasi-topological features. Until this invention, these methods all relied on analysis of the progressive slopes of the black pixels. Mori et al., U.S. Pat. No. 4,468,808, classifies those analyses into three types. The first is the pattern contour tracking system developed by Grenias with IBM, which Mori calls a serial system. The second type, which Mori prefers, compares sequential rows and columns; its earliest patented example is Holt's system called the "Watchbird". Another example of the sequential row and column type is Holt's Center Referenced Using Red Line. Mori's third type is a parallel analysis system, which Mori dismisses as either taking too long or costing too much. All systems involving the sequential analysis of the slope of black pixel groups suffer severely from smoothing and line thinning errors. Worse yet, they are very likely to produce substitution errors when the lines have voids or when unwanted lines touch.
The present invention, while using quasi-topological features, employs a novel method of measuring and scoring such features, resulting in a great improvement in the performance of the reading machine. A comprehensive survey of prior art systems is found in an article by C. Y. Suen et al. entitled "Automatic Recognition of Handprinted Characters--The State of the Art", Proceedings of the IEEE, Vol. 68, No. 4, April 1980, which is incorporated herein by reference. The technique of the present invention uses none of the methods mentioned by Suen et al. or Mori et al.
Briefly, my invention employs measurement of the enclosure characteristics of each white pixel independently of other white pixels. Since the measurements are made in two (or more) dimensions rather than in one dimension (such as slope), the results are insensitive to first-order aberrations such as accidental voids, touching lines, and small numbers of black pixels that carry only noise. In the preferred embodiment, no noise processing is performed at all, since all forms of noise processing are done at the expense of accuracy in recognition. As used herein, a pixel is defined as an image information cell constituted by the binary states "on" and "off", or "black" and "white", respectively.
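By way of illustration only, the idea of measuring each white pixel's enclosure independently might be sketched as follows. This is not the measurement procedure of the invention, which this section does not specify. The sketch assumes a binary image stored as a list of rows with 1 for black ("on") and 0 for white ("off"), and it assumes, purely for illustration, that enclosure is probed along four directions (up, down, left, right). The names `DIRECTIONS`, `enclosure_code`, `enclosure_map`, and `LETTER_O` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed probe directions (row step, column step): up, down, left, right.
# The choice of four directions is an illustrative assumption.
DIRECTIONS: List[Tuple[int, int]] = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def enclosure_code(image: List[List[int]], row: int, col: int) -> Tuple[bool, ...]:
    """For one white pixel, report per direction whether a black pixel
    is met before the edge of the image is reached."""
    height, width = len(image), len(image[0])
    hits = []
    for d_row, d_col in DIRECTIONS:
        r, c = row + d_row, col + d_col
        bounded = False
        while 0 <= r < height and 0 <= c < width:
            if image[r][c] == 1:  # black pixel bounds this direction
                bounded = True
                break
            r, c = r + d_row, c + d_col
        hits.append(bounded)
    return tuple(hits)


def enclosure_map(image: List[List[int]]) -> Dict[Tuple[int, int], Tuple[bool, ...]]:
    """Compute the enclosure code of every white pixel. Each pixel is
    measured on its own, without reference to other white pixels."""
    return {
        (r, c): enclosure_code(image, r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value == 0
    }


# Hypothetical 5x5 test pattern: a closed loop, like a small letter "O".
LETTER_O = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    for position, code in sorted(enclosure_map(LETTER_O).items()):
        print(position, code)
```

In this sketch the white pixel at the center of the loop is bounded in all four directions, while a white pixel in the corner of the image is bounded in none. Because each pixel's code depends only on what lies along its own probe lines, a single missing black pixel changes at most one direction of the affected codes; how the invention actually scores such measurements is not stated here.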
The saccadic flick phenomenon, which occurs in primates, serves to focus various small areas of the entire retinal field of view upon the "fovea". The "fovea centralis" is a small pit or depression at the back of the retina forming the point of sharpest vision. Recent research has shown that the fovea, in addition to providing the highest resolution in the retina, also performs significant information processing on the visual data. In particular, it seems to "recognize" a multiplicity of general patterns or small features which it has been trained to recognize at earlier periods in its existence.