The present invention relates to electronic data processing, and more specifically concerns machine recognition of alphanumeric characters and similar patterns.
On-line handprint recognition is the identification of alphanumeric characters as a user enters them into a computer in real time, usually with a pressure-sensitive touchpad or similar device. Character recognition of this type is becoming increasingly important in applications such as electronic pocket organizers and in-store customer-operated directories and catalogs.
On-line recognition typically captures a character in real time as it is entered, as a sequence of sampled points each having an X and a Y coordinate value. Few recognition hardware devices or software routines recognize characters directly from input samples or other direct data. Instead, they use various data-reduction techniques to reduce the raw data, typically thousands or tens of thousands of bytes, to a few hundred bytes encoding from one dozen to several dozen “features” of the character. For example, the presence or absence of vertical and horizontal lines in several different regions of the character area might comprise a set of features. The presence of closed loops and open regions (“lakes and bays”) in different character areas can constitute another feature set. More abstract quantities such as two-dimensional Fourier-transform or wavelet coefficients have also been employed as features. The features extracted from a character are then input into a recognition device or routine, which identifies the pattern as belonging to one of a number of predefined output classes such as letters of the Roman alphabet and West-Arabic numerals.
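By way of illustration only, the kind of data reduction described above can be sketched as follows. The sketch assumes a 3-by-3 grid of regions, a 2:1 slope threshold for calling a segment horizontal or vertical, and a single pen-down trace of (X, Y) samples; the function name `zone_direction_features` and all of these parameters are hypothetical choices made for the example and are not taken from any particular recognizer.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # one sampled (x, y) coordinate pair


def zone_direction_features(points: List[Point], grid: int = 3) -> List[int]:
    """Reduce a trace of sampled points to 2 * grid * grid binary features.

    The first grid*grid entries flag regions containing a mostly horizontal
    segment; the last grid*grid entries flag regions containing a mostly
    vertical segment. Regions are numbered row by row.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Guard against zero extent (e.g. a perfectly vertical line).
    width = max(max(xs) - min_x, 1e-9)
    height = max(max(ys) - min_y, 1e-9)

    def region(x: float, y: float) -> int:
        col = min(int((x - min_x) / width * grid), grid - 1)
        row = min(int((y - min_y) / height * grid), grid - 1)
        return row * grid + col

    horizontal = [0] * (grid * grid)
    vertical = [0] * (grid * grid)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = abs(x1 - x0), abs(y1 - y0)
        if dx == 0 and dy == 0:
            continue  # repeated sample, no direction information
        idx = region((x0 + x1) / 2, (y0 + y1) / 2)
        if dx >= 2 * dy:        # assumed threshold for "horizontal"
            horizontal[idx] = 1
        elif dy >= 2 * dx:      # assumed threshold for "vertical"
            vertical[idx] = 1
    return horizontal + vertical


# An "L"-like trace: down the left edge, then along the bottom edge.
l_shape = [(0, 0), (0, 5), (0, 10), (5, 10), (10, 10)]
features = zone_direction_features(l_shape)
```

Here five coordinate pairs are reduced to eighteen binary values; a real trace of several hundred samples would be reduced to the same eighteen values, which is the point of the data-reduction step.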
Conventional on-line recognition algorithms divide each input character into a number of strokes, each stroke being produced as the user's finger, stylus, or other instrument contacts the pad, draws a straight or curved line, and is then raised. The total character, or “ink”, may contain from one to four or five strokes. Conventional on-line recognizers typically use the number of strokes as one of the primary features for recognizing the characters: a lower-case handprinted “c” generally has a single stroke, an uppercase “A” has three strokes . . . .
Or does it? An “A” may be drawn from upper center to lower left, then from upper center to lower right, then a crossbar from mid-left to right. Or it may be drawn with two strokes, as an inverted “V” followed by the crossbar. Or it may have only a single stroke, if the crossbar is drawn after the inverted “V” without lifting the stylus. Or it may have four or five strokes, if the stylus skips at one or more points during the input process.
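The stroke segmentation described above, and the way the stroke count for a single letter can vary, can be sketched as follows. The sample format (X, Y, pen-down flag) and the function name `split_into_strokes` are assumptions made for illustration; actual digitizers report pen contact in various ways.

```python
from typing import List, Tuple

Sample = Tuple[float, float, bool]   # (x, y, pen_down), a hypothetical format
Stroke = List[Tuple[float, float]]


def split_into_strokes(samples: List[Sample]) -> List[Stroke]:
    """Group consecutive pen-down samples into strokes.

    A stroke begins when the pen touches the pad and ends when it is raised.
    """
    strokes: List[Stroke] = []
    current: Stroke = []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            # Pen raised: close the stroke in progress.
            strokes.append(current)
            current = []
    if current:
        # Input ended while the pen was still down.
        strokes.append(current)
    return strokes


# An "A" drawn as two diagonals and a crossbar, lifting the pen between each.
a_three_strokes = [
    (5, 0, True), (0, 10, True), (0, 10, False),
    (5, 0, True), (10, 10, True), (10, 10, False),
    (2, 5, True), (8, 5, True), (8, 5, False),
]

# The same letter drawn as an inverted "V" running straight into the
# crossbar, without lifting the pen.
a_one_stroke = [
    (0, 10, True), (5, 0, True), (10, 10, True),
    (8, 5, True), (2, 5, True), (2, 5, False),
]

count_three = len(split_into_strokes(a_three_strokes))
count_one = len(split_into_strokes(a_one_stroke))
```

The two inputs depict the same letter, yet a recognizer keyed on stroke count would see them as different patterns.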
Even when variation in the stroke count can be dealt with by storing multiple variations of the same character, stroke-based recognizers have unavoidable difficulties. Previous recognizers have employed extra storage and recognition circuits for each variation, or special architectures such as time-delayed neural nets to accommodate variable-size inputs.
One way to avoid differing numbers of strokes is to require the user to write characters in a certain form. For example, a conventional pocket organizer requires each character to be entered as a single stroke in a “simplified” alphabet. However, even if such an alphabet is easy to learn, it still requires some practice, and it cannot be employed in an unconstrained setting such as an information display in a store.
Accordingly, there is a need for better on-line, real-time recognition of handprint characters and other patterns. Stroke-based methods have not been able to produce a simple, reliable, or inexpensive solution.