Computers are regularly used for a variety of purposes throughout the world. As computers have become commonplace, computer manufacturers have continuously sought to make them more accessible and user-friendly. One such effort has been the development of natural input methods, such as submitting data through handwriting. By writing with a stylus or another object onto a digitizer to produce “electronic ink,” a computer user can forgo the bulk and inconvenience associated with a keyboard. Handwriting input conveniently may be used, for example, by doctors making rounds, architects on a building site, couriers delivering packages, warehouse workers walking around a warehouse, and in any situation where the use of a keyboard would be awkward or inconvenient. Handwriting input is particularly useful when the use of a keyboard and mouse would be inconvenient or inappropriate, such as when the writer is moving, in a quiet meeting, or the like. Handwriting input also is the natural choice for creating some types of data, such as mathematical formulas, charts, drawings, and annotations.
While handwriting input is more convenient than keyboard input in many situations, text written in electronic ink typically cannot be directly manipulated by most software applications. Instead, text written in electronic ink must be analyzed to convert it into another form, such as ASCII characters. This analysis includes a handwriting recognition process, which recognizes characters based upon various relationships between individual electronic ink strokes making up a word of electronic ink. Handwriting recognition algorithms have improved dramatically in recent years, but their accuracy can be reduced when electronic ink is written at an angle. Likewise, when separate groups of ink strokes cannot be easily distinguished, such as when two words are written closely together, many recognition algorithms cannot accurately recognize electronic ink. Some recognition algorithms also may incorrectly recognize electronic ink as text when, in fact, the electronic ink is intended to be a drawing.
The accuracy of many recognition algorithms can be greatly improved by “parsing” (e.g., by analyzing the layout of and/or “classifying”) the electronic ink before using the handwriting recognition algorithm. A classification process typically determines whether an electronic ink stroke is part of a drawing (that is, a drawing ink stroke) or part of handwritten text (that is, a text ink stroke). Classification algorithms for identifying other stroke types also are possible. The layout analysis process typically groups electronic ink strokes into meaningful associations, such as words, lines and paragraphs.
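The two parsing steps described above, classification and layout analysis, can be sketched as follows. This is only an illustrative sketch and not the method of any particular parser: the `Stroke` type, the height threshold in `classify`, and the rule in `group_into_lines` (a stroke joins a line when its vertical center falls within that line's vertical extent) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), with y increasing downward


@dataclass
class Stroke:
    """Hypothetical container for one electronic ink stroke."""
    points: List[Point]

    @property
    def bbox(self) -> Tuple[float, float, float, float]:
        """Bounding box as (left, top, right, bottom)."""
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return min(xs), min(ys), max(xs), max(ys)


def classify(stroke: Stroke, max_text_height: float = 30.0) -> str:
    """Toy classifier (assumed heuristic): tall strokes are drawing strokes."""
    _, top, _, bottom = stroke.bbox
    return "text" if (bottom - top) <= max_text_height else "drawing"


def group_into_lines(strokes: List[Stroke]) -> List[List[Stroke]]:
    """Greedy layout analysis (assumed heuristic): scan strokes left to
    right and attach each one to the first line whose vertical extent
    contains the stroke's vertical center; otherwise start a new line."""
    lines: List[List[Stroke]] = []
    extents: List[Tuple[float, float]] = []  # (top, bottom) of each line
    for s in sorted(strokes, key=lambda s: s.bbox[0]):
        _, top, _, bottom = s.bbox
        center = (top + bottom) / 2
        for i, (line_top, line_bottom) in enumerate(extents):
            if line_top <= center <= line_bottom:
                lines[i].append(s)
                # Grow the line's extent to cover the new stroke.
                extents[i] = (min(line_top, top), max(line_bottom, bottom))
                break
        else:
            lines.append([s])
            extents.append((top, bottom))
    return lines


def parse(strokes: List[Stroke]) -> Tuple[List[List[Stroke]], List[Stroke]]:
    """Classify first, then group only the text strokes into lines."""
    text = [s for s in strokes if classify(s) == "text"]
    drawings = [s for s in strokes if classify(s) == "drawing"]
    return group_into_lines(text), drawings
```

A heuristic of this kind assumes roughly horizontal writing with clear vertical separation between lines, so it would be expected to break down on ink written at an angle or on closely spaced lines, which are the cases that make reliable line grouping difficult.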
Text lines are the most salient structures in freeform handwriting, and their reliable detection is the foundation for higher-level layout analysis and semantic parsing. Freeform ink notes are a mixture of complex structures such as blocks of text, drawings, charts and annotations, and the combination of different structures often makes it difficult to reliably identify discrete lines of text in freeform handwriting. For example, FIG. 1A illustrates an example of the potential complexity of freeform handwriting 101A.
FIG. 1B illustrates just one example of the difficulty in grouping electronic ink strokes of handwritten text into lines. The handwriting 101B includes bullets 103A-103D and four groups of handwritten text 105A-105D. As will be noted by a human observer, the bullet 103A corresponds to the group of text 105A, the bullet 103B corresponds to the group of text 105B, the bullet 103C corresponds to the group of text 105C, and the bullet 103D corresponds to the group of text 105D. A human observer would also recognize that each of the groups 105A-105D should be treated as a single line of text.
The handwriting 101 may be incorrectly organized by an existing handwriting parsing technique. For example, as seen in this figure, bullets 103A-103D have erroneously been organized into a single vertical line. Also, the group of text 105A has been organized into three separate lines 109A-109C. Similarly, the group of text 105D has been organized into three separate lines 111A-111C. This erroneous recognition of the handwriting organization may make it particularly difficult for a handwriting recognition engine to correctly recognize, for example, the bullets 103A-103D.
In addition to handwriting recognition, parsing functions can be used to select handwritten text for editing and other manipulation. Accordingly, the erroneous organization of the handwriting may cause the wrong handwriting to be selected and manipulated, in a grouping that is inconvenient or even detrimental for a user.