The present invention relates to a handwriting information processing system, and more particularly to a handwriting information processing system with a user interface for character segmentation.
With the rapid development of computer technologies, many information processing devices that accept users' handwriting input have appeared, such as personal digital assistants (PDAs) and hand portable computers (HPCs). Users can input handwritten data and symbols into computers by means of pen-like devices. Correspondingly, many handwritten character recognition devices have appeared, which can recognize a user's handwriting input.
IBM's ThinkScribe is a device integrating a handwriting digitizer with a traditional paper-based recording system. This device records a user's handwriting input as strokes with associated timing, and can reproduce the user's handwriting input according to the original timing information. When users write Chinese characters on ThinkScribe, they usually write characters continuously, with little or no space between characters. Sometimes users even overlap strokes of adjacent characters, or connect the last stroke of the preceding character with the first stroke of the following character. This makes character segmentation a problem that must be solved before recognition.
At present, there are no effective character segmentation methods, particularly for handwritten Chinese characters. Existing handwritten character recognition technologies can only recognize individual Chinese characters, or handwritten Chinese character strings written with large spaces between characters. The difficulties of automatically segmenting handwritten Chinese character strings lie in the following:
1) Many Chinese characters have separable components lined up from left to right. When writing quickly in a horizontal line from left to right, the distance between such components may be similar to the distance between two characters. In addition to this spatial confusion, the left and right parts of those characters are often themselves single characters, or may resemble other characters. The same holds for Chinese characters written in a vertical line, since many Chinese characters have separable components stacked from top to bottom.
2) When adjacent characters are written cursively, the end stroke of the first character and the beginning stroke of the second character may not be clearly separated from each other.
In addition, text areas may overlap picture areas, and handwriting lines are not always straight. In such cases, automatic techniques for detecting the text areas, finding the handwriting lines, and segmenting the individual characters in a string for recognition are not always reliable and accurate. A manual procedure is needed for such work.
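The spatial confusion described in difficulty 1) can be illustrated with a minimal sketch. It assumes a naive segmenter that splits a horizontal line wherever the gap between strokes exceeds a fixed threshold; the `Stroke` type, the `segment_by_gap` function, and the coordinates are all hypothetical and are not taken from any existing recognizer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stroke:
    """Hypothetical stroke, reduced to its horizontal extent."""
    x_min: float
    x_max: float


def segment_by_gap(strokes: List[Stroke], threshold: float) -> List[List[Stroke]]:
    """Start a new group whenever the horizontal gap exceeds `threshold`."""
    groups: List[List[Stroke]] = []
    right_edge = None  # rightmost x of the group currently being built
    for s in sorted(strokes, key=lambda s: s.x_min):
        if right_edge is None or s.x_min - right_edge > threshold:
            groups.append([s])
            right_edge = s.x_max
        else:
            groups[-1].append(s)
            right_edge = max(right_edge, s.x_max)
    return groups


# Assumed input: two characters, each made of a left and a right component.
# The gap inside each character (2 units) equals the gap between characters.
line = [Stroke(0, 4), Stroke(6, 10), Stroke(12, 16), Stroke(18, 22)]

over_split = segment_by_gap(line, threshold=1.0)   # every component is split off
under_split = segment_by_gap(line, threshold=3.0)  # everything is merged
print(len(over_split), len(under_split))
```

In this constructed case no single threshold yields the intended two characters, which is the kind of ambiguity that motivates accepting boundary definitions from the user.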
Therefore, the present invention provides a handwriting information processing system, which comprises a user interface for accepting the definition of text/picture areas, handwriting lines and character boundaries from a user.
With the interface, a user can define the text/picture areas. Using this information, the automatic line detection mechanism can find the handwriting lines. The user can also use this information to correct errors in the automatic layout analysis.
In addition, the user interface for character segmentation according to the present invention provides an effective and natural mode for defining handwriting lines. Using this information, the automatic character segmentation mechanism can find the character boundaries, and the information can also be used to correct errors in the automatic recognition process.
The user interface for character segmentation according to the present invention provides a method for effectively defining character boundaries. Using this information, the automatic recognition mechanism can recognize continuously written characters, and the information can also be used to correct errors in automatic recognition.
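As a rough illustration of how user-defined character boundaries might be consumed downstream, the sketch below groups the strokes of one horizontal line by boundary positions supplied by the user. It is only a sketch under stated assumptions: strokes are reduced to `(x_min, x_max)` tuples, boundaries are plain x coordinates, and the function name `group_by_boundaries` is hypothetical. The source does not specify how boundaries are represented.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Stroke = Tuple[float, float]  # assumed representation: (x_min, x_max)


def group_by_boundaries(strokes: Sequence[Stroke],
                        boundaries: Sequence[float]) -> List[List[Stroke]]:
    """Assign each stroke to the interval between user-defined boundaries.

    A stroke is placed according to the midpoint of its horizontal extent,
    so a stroke that slightly crosses a boundary still stays with the
    character that holds most of it.
    """
    cuts = sorted(boundaries)
    groups: List[List[Stroke]] = [[] for _ in range(len(cuts) + 1)]
    for stroke in strokes:
        center = (stroke[0] + stroke[1]) / 2
        groups[bisect_right(cuts, center)].append(stroke)
    # Drop intervals that received no strokes.
    return [g for g in groups if g]


# Assumed input: four components, with one boundary marked by the user at x = 11.
strokes = [(0, 4), (6, 10), (12, 16), (18, 22)]
characters = group_by_boundaries(strokes, boundaries=[11])
print(characters)
```

Each resulting group could then be passed to a single-character recognizer; placing strokes by midpoint is one possible design choice, not something the source prescribes.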