Handwritten text recognition is of interest because of its numerous commercial applications. Offline recognition systems include mail sorting, bank check reading, and forms reading. Online recognition systems include touch screen input with a stylus for all types of computing systems, particularly laptop, tablet, or handheld computing systems.
The main difficulties of handwritten or cursive text recognition are well known: the characters in a word are most often connected, and the variability of character shapes is high. There are two main strategies in the field of handwriting recognition: holistic recognition and analytical recognition. In holistic recognition, a string of characters, such as a word or a phrase, is recognized as a whole, without an individual character recognition stage in the recognition process. In analytical recognition, a string of characters is first segmented into characters and then recognized character by character to recognize the word or phrase.
The main advantage of holistic recognition is that it avoids the segmentation stage and accordingly avoids segmentation mistakes. Holistic recognition of a word, for example, begins with a representation of the word created by extracting features of the cursive writing, such as the strokes used to form portions of a character. The extracted features in the word representation are then compared against feature representations for words from a lexicon of all words in a reference vocabulary. The main disadvantage of a holistic approach is its inability to take detailed character shapes into account. This leads to significant degradation of recognition results for large lexicons.
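The holistic comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each word is reduced to a short string of coarse feature codes ("A" for an ascender, "D" for a descender, "L" for a closed loop, "-" for a plain mid-zone stroke) and that candidates are ranked by edit distance. The feature codes, the toy lexicon, and the function names are hypothetical and are not taken from any particular recognition system.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two feature strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def holistic_match(features: str, lexicon: Dict[str, str]) -> List[Tuple[str, int]]:
    """Rank every lexicon word by distance to the observed whole-word features.

    No character segmentation takes place: the word is compared as a whole.
    """
    scored = [(word, edit_distance(features, ref)) for word, ref in lexicon.items()]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical reference representations, one per lexicon word.
LEXICON = {
    "dog": "ALD",
    "bag": "ALD",  # same coarse features as "dog"
    "big": "A-D",
    "cat": "-LA",
}

if __name__ == "__main__":
    for word, distance in holistic_match("ALD", LEXICON):
        print(word, distance)
```

In this toy lexicon, "dog" and "bag" share the same coarse representation and cannot be told apart. Collisions of this kind become more frequent as the lexicon grows, which is consistent with the degradation noted above.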
The main advantage of analytical recognition is the availability of well-known and highly developed character recognition techniques. However, the recognition process includes a segmentation stage, and erroneous segmentation decisions lead to incorrect recognition of characters and thus of the word. The segmentation algorithm can generate many incorrect variants for characters, depending on the portion of the character image where the segmentation decision is made. Thus, the main disadvantage of this approach is that accurate recognition depends on correct segmentation, and correct segmentation is difficult because of the variation in cursive writing styles.
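The dependence of analytical recognition on segmentation can be illustrated with a minimal sketch. It assumes, purely for illustration, that a written word is a sequence of stroke labels, that segmentation is a list of cut positions, and that a lookup table stands in for the character recognizer. The stroke labels, the table, and the function name are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical stand-in for a character recognizer: a group of strokes
# maps to one character. An arc followed by a tall bar reads as "d",
# while the same two strokes taken separately read as "c" and "l".
STROKES_TO_CHAR: Dict[Tuple[str, ...], str] = {
    ("arc",): "c",
    ("bar",): "l",
    ("arc", "bar"): "d",
    ("ring",): "o",
}


def recognize(strokes: Sequence[str], cuts: Sequence[int]) -> str:
    """Segment the strokes at the given cut positions, then classify each segment."""
    bounds = [0, *cuts, len(strokes)]
    chars = []
    for start, end in zip(bounds, bounds[1:]):
        segment = tuple(strokes[start:end])
        # Segments the recognizer does not know are marked with "?".
        chars.append(STROKES_TO_CHAR.get(segment, "?"))
    return "".join(chars)


# Strokes of a word written as "do".
STROKES = ["arc", "bar", "ring"]

if __name__ == "__main__":
    print(recognize(STROKES, [2]))     # cut after the "d"
    print(recognize(STROKES, [1, 2]))  # extra cut inside the "d"
    print(recognize(STROKES, [1]))     # cut in the wrong place
```

The same strokes yield "do", "clo", or "c?" depending only on where the cuts fall, even though the character recognizer itself is unchanged.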
It is with respect to these considerations and others that the present invention has been made.