The present invention relates to diacritical processing for unconstrained, on-line handwriting recognition using a forward search.
Handwriting is a process with an inherent temporal structure; it consists of a pen being driven along a trajectory in time. This temporal information is lost when an input script is scanned from a page (i.e., “off-line” recognition), but it is available when the input script is captured using a digitizing tablet (i.e., “on-line” recognition). Recognition in the off-line case is difficult because it is necessary to deal with overlapping or touching characters, unintentional pen lifts and different stroke widths. Such events sometimes significantly alter the topological pattern of characters in the input script. These events, however, have little or no influence on the dynamic pattern of motion in the writing. As a result, on-line recognition systems commonly use time-based representation schemes.
Time-based representations, however, are subject to a source of misinformation as well. For instance, the letter “E” can be written using multiple pen trajectories, generating temporal variations that are not apparent in its static representation. While such variations in trajectory can be relatively large in isolated characters, it is believed that the number of variations is limited in naturally written run-on text.
Another difficulty with time-based representations is that of delayed diacriticals. A delayed diacritical is a piece of ink used to complete a character but which does not immediately follow the last portion of that character. For example, FIG. 1 illustrates an example image for the word “city” written in cursive style with diacriticals in the order in which the body, the i-dot and the t-cross of the word were written. As shown, the word “city” is usually written with three ink portions: the first ink portion is the main body of the word, the second ink portion is the dot for the “i” and the third ink portion is the cross of the “t”. Delayed diacriticals constitute a problem for time-ordered representations because all the evidence for a character may not be contiguous.
Delayed diacriticals introduce significant complexity to the design of an on-line handwriting recognizer, because processing the delayed diacriticals requires some reordering of the time domain information. As shown above in FIG. 1, it is common in cursive writing to write the body of a “t”, “j” or “i” (and sometimes an “x” or “f”) during the writing of the word, and then to return and dot or cross the letter once the word is complete. Apostrophes may also be written at the end of the word in a similar way. A difficulty arises because the recognizer has to look ahead when scoring one of these characters to find the mark occurring later in the input script that completes the character.
One previously proposed procedure for processing delayed diacriticals is to scan the input script some distance into the future to identify potential diacriticals, remove them in a preprocessing step, and associate them with the corresponding characters earlier in the word. The identification and removal of potential diacriticals can be done as a preliminary operation (as described in L. Schomaker, “Using stroke- or character-based self organizing maps in the recognition of on-line, connected cursive script”, Pattern Recognition, 26(3): 443-450, 1993), but this approach is error-prone because marks that are not diacriticals may be incorrectly identified and removed, and true diacriticals may be overlooked and/or skipped. A further problem with this approach is that the procedure may end up discarding ink which is valuable for character disambiguation (e.g., between a cursive style “i” and a cursive style “e” or between a cursive style “t” and a cursive style “l”).
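The preprocessing approach and its failure modes can be illustrated with a minimal sketch. The heuristic used here (a stroke is a potential diacritical if it is small and starts to the left of ink already written) is an assumption made for illustration only; it is not taken from the cited reference, and the `Stroke` type, the `max_size` threshold and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stroke:
    # Bounding box of one pen-down-to-pen-up piece of ink (hypothetical units).
    x_min: float
    x_max: float
    y_min: float
    y_max: float


def split_potential_diacriticals(
    strokes: List[Stroke], max_size: float = 5.0
) -> Tuple[List[Stroke], List[Stroke]]:
    """Separate time-ordered strokes into body ink and potential diacriticals.

    Assumed heuristic: a stroke is a potential diacritical when its bounding
    box is small in both dimensions and it begins to the left of the
    right-most body ink seen so far (the pen moved backward).
    """
    body: List[Stroke] = []
    marks: List[Stroke] = []
    right_edge = float("-inf")
    for s in strokes:  # strokes are assumed to be in writing order
        small = (s.x_max - s.x_min) <= max_size and (s.y_max - s.y_min) <= max_size
        backward = s.x_min < right_edge
        if small and backward:
            marks.append(s)
        else:
            body.append(s)
            right_edge = max(right_edge, s.x_max)
    return body, marks


# Hypothetical ink for the word "city": main body, then i-dot, then t-cross.
main_body = Stroke(0.0, 40.0, 0.0, 20.0)
i_dot = Stroke(12.0, 13.0, 22.0, 23.0)
t_cross = Stroke(18.0, 32.0, 15.0, 16.0)

body, marks = split_potential_diacriticals([main_body, i_dot, t_cross])
```

With these made-up values the i-dot is removed as a potential diacritical, but the t-cross is wider than the threshold and stays in the body ink. This is one way a fixed preprocessing rule can overlook a true diacritical; loosening the threshold would instead risk removing ink that is not a diacritical.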
A further previously proposed procedure for processing delayed diacriticals is to try to reinsert the diacriticals at a better position (as described in C. C. Tappert, “A Divide-And-Conquer Cursive Script Recognizer”, Research Report, IBM Watson Research Center, 1988; and P. Morasso, L. Barberis, S. Pagliano, and D. Vergano, “Recognition experiments of cursive dynamic handwriting with self-organizing networks”, Pattern Recognition, 26(3):451-460, 1993). One difficulty with this approach is that very often it is not obvious at what point of the word the diacritical should be linked, particularly because these diacriticals are usually carelessly positioned. The closest point in the word may not correspond to the intended location of the diacritical.
Time-based representation schemes view an unknown input ink as a sequence of feature vectors, Y = y_1, y_2, ..., y_T. Each of these vectors typically represents geometrical properties of a short contiguous ink fragment (i.e., a stroke) as shown in FIG. 2. Still a further previously proposed procedure for processing delayed diacriticals is to pair each of these vectors, y_t, with an indication of whether or not the corresponding section of the ink was later horizontally covered by a diacritical mark (as described in M. Schenkel, I. Guyon and D. Henderson, “On-line cursive script recognition using time-delay neural networks and hidden markov models”, IEEE Conf. on Acoustics, Speech and Signal Processing, Australia, 1994; and G. Seni, “Large Vocabulary Recognition of On-Line Handwritten Cursive Words”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(6), June, 1996). The problem with this approach is that it does not allow for a strict accounting of the ink; that is, this approach does not ensure that every piece of ink in the input is used exactly once.
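The coverage-flag idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the cited references: each fragment is reduced to its horizontal extent, each diacritical to an (x_min, x_max) pair, and the names `Fragment` and `covered_flags` and all coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fragment:
    # Horizontal extent of the ink fragment behind one feature vector y_t.
    x_min: float
    x_max: float


def covered_flags(
    fragments: List[Fragment], diacriticals: List[Tuple[float, float]]
) -> List[bool]:
    """For each fragment, report whether any diacritical overlaps it horizontally."""
    flags = []
    for frag in fragments:
        hit = any(
            d_min <= frag.x_max and frag.x_min <= d_max
            for d_min, d_max in diacriticals
        )
        flags.append(hit)
    return flags


# Four hypothetical fragments of the word body, in time order.
fragments = [
    Fragment(0.0, 10.0),
    Fragment(10.0, 20.0),
    Fragment(20.0, 30.0),
    Fragment(30.0, 40.0),
]
# Hypothetical horizontal extents of an i-dot and a t-cross written later.
diacriticals = [(12.0, 14.0), (18.0, 32.0)]

flags = covered_flags(fragments, diacriticals)  # [False, True, True, True]
```

In this example the single t-cross sets the flag on three different fragments, so the same piece of ink contributes evidence to several feature vectors. That is the accounting weakness noted above: nothing in the representation guarantees that each diacritical is used exactly once.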
Thus, a need exists for a method and apparatus that can try a number of different treatments of potential diacriticals, determine whether it is better to treat a given piece of ink as a diacritical or not by directly comparing the scores of the two outcomes, and keep a strict accounting of the ink.