Handwritten information may be used as computer input once it is converted to digital form. Handwritten information may be collected by any of a number of mechanisms. Typically, the tip of a pen or stylus that is held by the user is placed in contact with a tablet device that includes sensing mechanisms for detecting the position of the pen tip. Movement of the pen tip along the tablet, such as occurs when the user prints or writes, generates a stream of pen tip position data. The data may be an array of "x" and "y" position coordinates, and may be referred to as "ink" or "ink data." Handwriting recognition systems process such ink data for the purpose of transforming the user's handwritten information into digital information that can be used with any conventional computer application, such as word processing.
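By way of illustration only, the stream of pen tip position data described above might be held in a structure such as the following. This is a minimal sketch: the names `InkStream`, `add_sample`, and `bounding_box` are hypothetical and do not appear in the description, and integer tablet coordinates are assumed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class InkStream:
    """Hypothetical container for ink data: ordered (x, y) pen tip samples."""
    points: List[Tuple[int, int]] = field(default_factory=list)

    def add_sample(self, x: int, y: int) -> None:
        # Append one pen tip position as reported by the tablet.
        self.points.append((x, y))

    def bounding_box(self) -> Tuple[int, int, int, int]:
        # Return (min_x, min_y, max_x, max_y); raises ValueError if no samples.
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return min(xs), min(ys), max(xs), max(ys)


ink = InkStream()
for x, y in [(10, 20), (12, 25), (15, 22)]:
    ink.add_sample(x, y)
```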
The primary design considerations for handwriting recognition systems are speed and accuracy. One of the factors affecting the speed with which handwriting recognition can progress relates to how computer processing time is allocated to the recognition task. For example, earlier recognition systems postponed the processing of ink data until all of the ink data was provided by the user. Such an approach failed to efficiently use processing time that was available as the computer was collecting ink data.
The speed and accuracy with which a handwriting recognition system transforms written information to corresponding digital information may suffer as a result of the order in which the ink data is processed. In some prior systems, for example, ink data was first processed to determine which alphanumeric characters were most likely formed from a series of pen movements made on the tablet. Such character identification is often referred to as segmentation processing. The likely characters were thereafter processed to determine whether or not each character was meaningful in the literal context of the other likely characters. This latter processing is referred to as context processing.
The problem with such sequential segmentation processing and context processing is that the selection of likely characters takes place without access to the context information. Put another way, segmentation processing takes place with limited information, without reference to context information that could be advantageously used to assist with the determination of the most likely characters. The accuracy of the context processing is correspondingly hindered because the context processing commences with characters that are determined by a segmentation process that is based on such limited information.
The present invention is directed to a system for increasing handwriting recognition speed and accuracy by the application of a number of innovations. For example, the present system (which will hereafter be occasionally referred to as "the recognizer") begins processing ink data as soon as the user begins to write on the tablet. Accordingly, the final recognition result is available very soon after all of the ink is collected.
As another aspect of the present invention, segmentation and context processing are integrated in order to significantly increase the accuracy of the overall recognition process. This integration is achieved by modeling the recognition task as a dynamic programming problem and constructing paths of multiple character hypotheses from the stream of ink data. The recognizer searches the multiple paths to find the optimal path, which is the string of characters most likely to match what the user wrote on the tablet. Each node of a path represents a discrete portion of the ink data. Each node has associated with it a probability or cost value that is the aggregate of the cost values assigned to the node by several modeling components. Some of the components provide cost values that relate to the probable character represented by the node. Other components provide cost values pertaining to the context of the node information. As a result, the estimate of the most probable word or sentence is made using context and segmentation information concurrently. This segmentation/context integration greatly enhances the accuracy of the recognition system.
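The integration of segmentation and context costs in a single path search can be sketched as follows. This is an illustrative toy only, not the recognizer's actual implementation: the function `best_path`, the table `SHAPE` of per-span character costs, the two-letter context model `toy_context`, and all cost values are assumptions made for the example. Each node is a (stroke position, character) pair, and its cost is the sum of a shape cost and a context cost.

```python
from math import inf

def best_path(n_strokes, shape_cost, context_cost, max_span=3):
    """Find the cheapest character string covering strokes [0, n_strokes).

    shape_cost: dict mapping (start, end) -> {char: cost} for a stroke span.
    context_cost: function (prev_char, char) -> cost of char following prev_char.
    """
    # best[(end, char)] = (cheapest cost to reach this node, back pointer)
    best = {(0, None): (0.0, None)}
    for end in range(1, n_strokes + 1):
        for start in range(max(0, end - max_span), end):
            for char, s_cost in shape_cost.get((start, end), {}).items():
                for (pos, prev), (cost, _) in list(best.items()):
                    if pos != start:
                        continue
                    # Aggregate the segmentation (shape) and context costs.
                    total = cost + s_cost + context_cost(prev, char)
                    if total < best.get((end, char), (inf, None))[0]:
                        best[(end, char)] = (total, (pos, prev))
    finals = [key for key in best if key[0] == n_strokes]
    if not finals:
        return inf, ""
    key = min(finals, key=lambda k: best[k][0])
    total = best[key][0]
    chars = []
    while key != (0, None):  # walk the back pointers to recover the string
        chars.append(key[1])
        key = best[key][1]
    return total, "".join(reversed(chars))

# Toy data (assumed): strokes 0 and 1 look like "c" and "l" separately,
# or like "d" when taken together; stroke 2 looks like "o".
SHAPE = {
    (0, 1): {"c": 1.0},
    (1, 2): {"l": 1.0},
    (0, 2): {"d": 2.5},
    (2, 3): {"o": 0.5},
}
COMMON_PAIRS = {(None, "d"), ("d", "o"), (None, "c")}

def toy_context(prev, char):
    # Assumed context model: likely letter pairs are free, others cost 2.0.
    return 0.0 if (prev, char) in COMMON_PAIRS else 2.0

with_context = best_path(3, SHAPE, toy_context)
shape_only = best_path(3, SHAPE, lambda prev, char: 0.0)
```

In this toy, shape costs alone favor the segmentation "clo", whereas the combined costs favor "do", which illustrates how context information can influence the choice of segmentation while the paths are being built.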
The dynamic programming path search (hereafter referred to as the DPP search) begins essentially as soon as the user begins to write. This search serves as a forward pass in the recognition process and is coupled with a second, reverse-pass search of the paths. The second search occurs once all of the ink data is collected. The DPP search generates the costs of the paths based upon the cost values provided by the modeling components mentioned above. These path costs are used as estimates for the second search, which is a stack-based bounded path search (hereafter referred to as the SBP search). The SBP search uses the estimates generated by the DPP search and refers to a system dictionary, adding to the DPP search path estimates a value reflecting the probability that the word or words found in the path appear in the dictionary. The SBP search arrives at the overall optimal (most likely) recognition result. The SBP search can also produce and rank a number of probable results. The results are made available to the application that calls upon the recognizer.
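The coupling of a forward pass with a bounded reverse search can be sketched as follows. This is a simplified illustration under stated assumptions, not the recognizer's actual SBP search: a priority queue stands in for the stack, the dictionary contribution is modeled as a fixed penalty (`miss_penalty`) for strings absent from the dictionary, and the names `forward_costs`, `reverse_search`, and `EDGES` are hypothetical. The forward costs serve as estimates of the remaining path cost during the reverse search.

```python
import heapq
from math import inf

def forward_costs(n, edges):
    """Forward pass: cheapest cost from position 0 to each position 0..n."""
    fwd = [inf] * (n + 1)
    fwd[0] = 0.0
    for start, end, _char, cost in sorted(edges, key=lambda e: e[0]):
        if fwd[start] + cost < fwd[end]:
            fwd[end] = fwd[start] + cost
    return fwd

def reverse_search(n, edges, dictionary, miss_penalty=2.0, top_k=2):
    """Reverse pass from position n back to 0, guided by forward estimates."""
    fwd = forward_costs(n, edges)
    into = {}  # edges grouped by the position at which they end
    for start, end, char, cost in edges:
        into.setdefault(end, []).append((start, char, cost))
    # Entries: (estimated full path cost, cost of suffix so far, position, suffix)
    heap = [(fwd[n], 0.0, n, "")]
    results = []
    while heap:
        est, suffix_cost, pos, suffix = heapq.heappop(heap)
        # Bound: stop once no remaining path can beat the k-th best result.
        if len(results) >= top_k and est >= results[top_k - 1][0]:
            break
        if pos == 0:
            # Complete path: add the (assumed) dictionary penalty.
            penalty = 0.0 if suffix in dictionary else miss_penalty
            results.append((suffix_cost + penalty, suffix))
            results.sort()
            continue
        for start, char, cost in into.get(pos, []):
            if fwd[start] < inf:
                new_cost = suffix_cost + cost
                heapq.heappush(
                    heap, (fwd[start] + new_cost, new_cost, start, char + suffix)
                )
    return results[:top_k]

# Toy lattice (assumed): (start, end, character, cost) per hypothesis.
EDGES = [(0, 1, "c", 1.0), (1, 2, "l", 1.0), (0, 2, "d", 2.5), (2, 3, "o", 0.5)]
ranked = reverse_search(3, EDGES, {"do"})
```

Because the dictionary penalty is never negative in this sketch, the forward estimate never exceeds the final cost of a path, so the search may safely stop once the cheapest pending estimate is no better than the results already ranked.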
Other inventive aspects of this invention will become clear upon review of the following description.