Various handwritten pattern recognition systems are known in the art, with varying degrees of recognition success. These systems typically assume a particular structure for the characters (patterns) under investigation and exploit that structure to improve their recognition ability.
An example prior art system is shown in FIG. 1 to which reference is now made. It typically includes a digitizer 10, a segmenter 12, a feature extractor 14, a classifier 16 and a reference character database 18. The digitizer 10 converts an input pattern into a series of paired position (x,y) coordinates, and sometimes also pressure P values, of sample points along the stroke. The segmenter 12 divides the input pattern into separate characters (for example, if the input pattern were a handwritten "the", the segmenter 12 would group the separate strokes into the characters "t", "h" and "e"). The feature extractor 14 extracts the features of each character and transforms each character into a standard format, called a "compressed model". The classifier 16 then compares the standardized input character against the standardized reference characters stored in the reference database 18. The reference character which has the best match, by some criterion or criteria, is output as the recognized character. U.S. Pat. No. 4,284,975 to Odaka and U.S. Pat. No. 4,607,386 to Morita et al. describe representative systems.
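The feature extraction and classification stages of such a pipeline can be sketched as follows. This is only an illustration under stated assumptions: the names `compress`, `distance` and `classify`, the choice of four resampled points as the "compressed model", and the squared-distance criterion are all hypothetical and are not taken from the cited patents. Segmentation and pressure values are omitted; each input is assumed to be a single, already segmented character.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # one (x, y) sample from the digitizer


def compress(points: List[Point], n: int = 4) -> List[Point]:
    """Hypothetical 'compressed model': n samples scaled into a unit box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    # Use one scale for both axes so the aspect ratio is preserved.
    scale = max(max(xs) - x0, max(ys) - y0) or 1.0
    # Pick n samples evenly spaced by index along the stroke.
    idx = [round(i * (len(points) - 1) / (n - 1)) for i in range(n)]
    return [((points[i][0] - x0) / scale, (points[i][1] - y0) / scale)
            for i in idx]


def distance(a: List[Point], b: List[Point]) -> float:
    """Sum of squared point-to-point distances (an assumed match criterion)."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))


def classify(points: List[Point], references: Dict[str, List[Point]]) -> str:
    """Return the label of the reference model closest to the input."""
    model = compress(points)
    return min(references, key=lambda label: distance(model, references[label]))


# Toy reference database: a vertical bar and a horizontal bar.
references = {
    "|": compress([(0, 0), (0, 1), (0, 2), (0, 3)]),
    "-": compress([(0, 0), (1, 0), (2, 0), (3, 0)]),
}
result = classify([(5, 5), (5, 7), (5, 9), (5, 11)], references)
```

Because the input is translated and scaled into the same standard format as the references, a vertical stroke drawn at a different position and size still matches the vertical reference.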
U.S. Pat. No. 4,040,009 to Kadota et al. describes a system which assumes a certain structure for the patterns being recognized and utilizes this knowledge to resolve ambiguities among characters that are otherwise indistinguishable from the compressed model. The classifier 16 of the system of Kadota et al. has two recognition phases. The first phase divides the reference characters into "confusion groups" where the members of each confusion group are indistinguishable from each other. In the second phase, an a priori matrix of pair-specific features is created. Each pair-wise feature discriminates between a pair of reference characters based on the distance of each reference to the relevant feature. Other patents which describe this approach are U.S. Pat. Nos. 4,718,102 and 4,531,231, both to Crane et al.
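One way to picture the second phase is the sketch below. It is an interpretation rather than the method of Kadota et al.: the table layout `pairwise`, the feature name `bottom_curvature`, the numeric values, and the voting rule used to combine pair-wise decisions are all assumptions made for illustration. Each table entry names one feature and the value that feature is assumed to take for each of the two reference characters; the input is credited to whichever reference lies closer.

```python
from itertools import combinations
from typing import Dict, Iterable, Tuple

# (label_a, label_b) -> (feature name, value for a, value for b); hypothetical
PairTable = Dict[Tuple[str, str], Tuple[str, float, float]]


def resolve(candidates: Iterable[str],
            input_features: Dict[str, float],
            pairwise: PairTable) -> str:
    """Pick one label from a confusion group using pair-specific features."""
    labels = sorted(candidates)
    wins = {label: 0 for label in labels}
    for a, b in combinations(labels, 2):
        feature, value_a, value_b = pairwise[(a, b)]
        x = input_features[feature]
        # Credit the reference whose stored value is closer to the input.
        winner = a if abs(x - value_a) <= abs(x - value_b) else b
        wins[winner] += 1
    # Ties fall to the alphabetically first label (an arbitrary choice).
    return max(labels, key=lambda label: wins[label])


# Toy confusion group: "U" and "V" look alike in the compressed model.
pairwise = {("U", "V"): ("bottom_curvature", 0.8, 0.1)}
choice = resolve({"U", "V"}, {"bottom_curvature": 0.7}, pairwise)
```

In this toy case the input's measured value of 0.7 lies nearer the value stored for "U" than the one stored for "V", so "U" is selected.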
Unfortunately, the criteria for recognizing confusion groups and for defining pair-wise features are based on the writing style of the particular reference characters in the database. As a result, the prior art systems cannot recognize characters which have a significantly different writing style.
U.S. Pat. No. 5,125,039 to Hawkins describes a system which records the occurrence of features in an unknown object and compares the result with dictionary entries for the reference characters. The dictionary entries indicate that, for the reference character, each feature either occurs or does not occur (i.e. they are binary features). The feature list of the unknown object is XOR'd with the feature list of each reference character and the unknown object is assigned the identity of the reference character to which it has the best XOR match.
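The XOR comparison can be illustrated with a short sketch. It assumes that each feature list is packed into an integer bit mask and that the "best XOR match" means the fewest differing bits (the smallest Hamming distance); the source does not define the criterion, so this reading, along with the names `xor_mismatches` and `classify_binary` and the toy dictionary, is hypothetical.

```python
from typing import Dict


def xor_mismatches(a: int, b: int) -> int:
    """Count the feature bits on which two binary feature lists disagree."""
    return bin(a ^ b).count("1")


def classify_binary(unknown: int, dictionary: Dict[str, int]) -> str:
    """Return the reference whose feature bits differ least from the input."""
    return min(dictionary,
               key=lambda label: xor_mismatches(unknown, dictionary[label]))


# Toy dictionary: one bit per feature, 1 = feature occurs, 0 = it does not.
dictionary = {"A": 0b1011, "B": 0b0110}
identity = classify_binary(0b1010, dictionary)
```

Here the unknown object differs from "A" in one feature and from "B" in two, so it is assigned the identity "A".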