Contemporary computing devices allow users to enter handwritten words (e.g., in cursive handwriting and/or printed handwritten characters) and symbols (e.g., a character in Far East languages). The words and symbols can be used as is, e.g., to function as readable notes and so forth, or can be converted to text for more conventional computer uses. To convert to text, for example, as a user writes strokes representing words or other symbols onto a touch-sensitive computer screen, digitizer, tablet PC and/or the like, a handwriting recognizer (e.g., trained with millions of samples, employing a dictionary, context and other rules) is able to convert the handwriting data into dictionary words or symbols. In this manner, users are able to enter textual data without necessarily needing a keyboard.
When dealing with typewritten input entered into a word processing program, it is relatively straightforward to implement a “find” or “search” feature as part of the program. With text, a user types in a search string and possibly enters some properties of the string (e.g., bold typeface), and the program searches for a string in a document that exactly matches the word and any specified properties. Such a search is straightforward because typewritten input entered into a word processing program is defined by a limited set of codes, e.g., ASCII numeric values that represent alphanumeric characters, and there is a limited set of properties a string can have. In general, the word processing program simply advances through the document, attempting to match the full set of entered codes of the search string with a string of codes in the document in order to find an exact match (allowing for any wildcards).
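The code-by-code matching described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular word processing program; the function name `find_exact` and the choice of `?` as a single-character wildcard are assumptions made for the example, and string properties such as typeface are not modeled.

```python
def find_exact(document: str, pattern: str, wildcard: str = "?") -> int:
    """Return the index of the first match of pattern in document, or -1.

    Hypothetical helper: compares numeric character codes position by
    position; a wildcard code in the pattern matches any single character.
    """
    doc_codes = [ord(c) for c in document]   # document as numeric codes
    pat_codes = [ord(c) for c in pattern]    # search string as numeric codes
    wild = ord(wildcard)

    # Advance through the document one starting position at a time.
    last_start = len(doc_codes) - len(pat_codes)
    for start in range(last_start + 1):
        if all(p == wild or p == doc_codes[start + i]
               for i, p in enumerate(pat_codes)):
            return start
    return -1
```

For example, `find_exact("the quick brown fox", "qu?ck")` locates the word "quick", whereas a search for "quack" in the same text finds nothing. The point of the sketch is that every comparison is a simple equality test on codes, which is exactly what handwritten ink does not permit.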
However, when entering handwritten ink, e.g., via an electronic ink processing program, it is virtually impossible for a user to write a word exactly the same way twice. Thus, searching is not possible via the simple “exact-string-match-or-not” operation. One attempted search method featurizes the electronic ink (e.g., handwritten data in the form of coordinates and other information) entered by a user, and searches through the document to find another piece of ink with similar features. This method is not very reliable because, for example, the same user can write two sets of ink, each intended to be the same word, whose features nevertheless differ significantly from the computer's perspective. A second method uses simple string comparison, using the translated text word that appears for any handwritten input. This second method is also relatively unreliable, because such a search depends on a recognizer making a correct translation for each translated word, despite the reality that recognizers are not one hundred percent accurate. Such inaccuracy is amplified when phrases of more than one word are searched, because known string comparison mechanisms typically translate a phrase into text that is then treated as a single search unit when compared against the text of the phrase being searched. The greater the number of words in the phrase, the greater the likelihood of a recognition error.
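The effect of phrase length on recognition errors can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that each word is recognized correctly with the same probability and independently of the other words; neither assumption comes from the discussion above, and the function name `phrase_match_probability` and the 0.95 accuracy figure are hypothetical. Under those assumptions, a phrase treated as a single search unit matches only if every one of its words was translated correctly.

```python
def phrase_match_probability(word_accuracy: float, num_words: int) -> float:
    """Probability that every word of a phrase is recognized correctly.

    Assumes (hypothetically) identical, independent per-word accuracy,
    so the whole-phrase probability is word_accuracy ** num_words.
    """
    if not 0.0 <= word_accuracy <= 1.0:
        raise ValueError("word_accuracy must be between 0 and 1")
    if num_words < 1:
        raise ValueError("num_words must be at least 1")
    return word_accuracy ** num_words


if __name__ == "__main__":
    # Illustrative per-word accuracy; not a measured recognizer figure.
    for n in (1, 2, 3, 5):
        print(n, round(phrase_match_probability(0.95, n), 4))
```

Because the per-word probabilities multiply, the chance of an exact whole-phrase match can only fall as words are added, which is consistent with the observation that longer phrases are more likely to contain a recognition error.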
Another problem with conventional ink searching is that significant resources are consumed in recognizing the words or phrases that are being compared. Thus, if a user wants to find an ink document with a recognized search term in it, a recognizer needs to be present on the system to recognize the document (at least until a match is found) in order to determine whether the search term matches at least one ink word (or phrase) in that document. Performing such recognition is often not desirable, such as when searching for a stored document among relatively many documents.