This application is based on Application No. H11-294149 filed in Japan, the content of which is hereby incorporated by reference.
1. Field of the Invention
The present invention relates to a character input apparatus capable of recognizing handwritten characters, a method for inputting such characters, and a computer-readable storage medium for storing programs that can recognize handwritten characters. In particular, the invention relates to a character input apparatus and a method that are suited to the recognition of characters that are drawn one after another in a single character input frame, as well as to a computer-readable storage medium storing a program that enables a computer to execute this method.
2. Description of the Prior Art
To input handwritten characters into a portable information terminal, a user handwrites characters using a stylus in a character input frame provided on a digitizer. For example, Japanese Laid-Open Patent Application No. 7-168914 discloses a technology that has a user input characters in a number of character input frames provided on a screen. These characters are transferred and recognized without the user having to make any special operations.
Another technology is "GRAFFITI" (a registered trademark of 3Com Corp.) for inputting special one-stroke characters.
However, when characters are written into a plurality of character input frames, the user has to move the hand holding the stylus to the next frame every time a character is inputted, which takes time and makes it difficult for the user to write characters neatly. This prevents the user from inputting handwritten characters efficiently.
A large space is required to provide a plurality of character input frames, which makes it difficult to realize this method on a compact apparatus such as a portable information terminal or a portable telephone.
While it would be ideal for compact portable information terminals to use a tablet as a single character input frame, it is difficult to judge when there is a break between inputted characters. In order to clarify the breaks, the conventional way has been to equip an apparatus with a conversion button that a user presses after drawing each character. This prevents the user from writing naturally.
When characters are written using the special one-stroke characters of "GRAFFITI", users have to learn a special writing system, which presents an obstacle for anyone wishing to use the apparatus.
In view of the above problems, the present invention aims to provide a character input apparatus and method for realizing accurate recognition of inputted handwritten characters that are drawn one after another in a single character input frame by a user, without requiring the user to learn a special writing system, and a computer-readable storage medium for storing a program that makes a computer execute this method.
The above object can be achieved by a character input apparatus comprising: a stroke dictionary in which sets of standard stroke information for a plurality of strokes are registered, each set of standard stroke information corresponding to a different stroke; a character dictionary in which stroke orders for a plurality of characters are registered, each stroke order corresponding to a different character; a coordinate output unit operable to output, when a user inputs handwritten characters, a coordinate string, that is sets of coordinates of points, for each handwritten stroke composing the handwritten characters; a stroke matching unit operable to obtain stroke information for each handwritten stroke from the outputted coordinate string of the handwritten stroke, compare the stroke information with each set of standard stroke information registered in the stroke dictionary, and output stroke candidates; and a character detecting unit operable to search the character dictionary using an order of a group of stroke candidates obtained in inputted order by the stroke matching unit as a key and detect a character whose stroke order matches the order of the stroke candidates. With this construction, the apparatus is capable of detecting correct characters for handwritten characters drawn one after another in a single character input frame.
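The search performed by the character detecting unit can be illustrated with a minimal sketch. The stroke identifiers, the dictionary contents, and the function name `detect_character` are hypothetical and chosen only for illustration; the text does not specify how the character dictionary is stored.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical character dictionary: each key is a stroke order (a tuple of
# stroke identifiers) and each value is the character drawn in that order.
CHARACTER_DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("horizontal", "vertical"): "T",
    ("vertical",): "I",
    ("diagonal_right", "diagonal_left"): "X",
}


def detect_character(stroke_candidates: Sequence[str]) -> Optional[str]:
    """Use the order of stroke candidates as a key into the character dictionary.

    Returns the character whose registered stroke order matches, or None.
    """
    return CHARACTER_DICTIONARY.get(tuple(stroke_candidates))


print(detect_character(["horizontal", "vertical"]))  # registered order
print(detect_character(["vertical", "horizontal"]))  # order not registered
```

Because the key is the ordered sequence of strokes, the same strokes drawn in a different order do not match in this sketch.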
Here, the standard stroke information includes a start position, an end position, each direction, and the number of directions for each stroke. The stroke matching unit comprises: a stroke information obtaining unit operable to obtain stroke information which includes a start position, an end position, each direction, and the number of directions for a stroke from the coordinate string of the stroke; a stroke candidate output unit operable to compare the obtained stroke information with each set of standard stroke information and output a plurality of stroke candidates; and an evaluation value adding unit operable to add an evaluation value to each stroke candidate according to a degree of matching between the obtained stroke information and the standard stroke information of the stroke candidate. The character detecting unit comprises: a character candidate detecting unit operable to ignore stroke candidates whose evaluation value shows a poor match and detect character candidates using the remaining stroke candidates; and a correct character detecting unit operable to detect a correct character out of the detected character candidates. With this construction, stroke candidates with a low recognition level are ignored in the process of detecting character candidates, which increases the efficiency of detecting character candidates.
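The following sketch shows one way the evaluation value could be attached to stroke candidates and poor matches ignored. The scoring rule (one point per matching feature), the grid positions, the threshold, and all names are assumptions made for illustration; the text does not define how the degree of matching is computed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StrokeInfo:
    start: Tuple[int, int]       # start position on a coarse grid (assumed)
    end: Tuple[int, int]         # end position on the same grid (assumed)
    directions: Tuple[str, ...]  # each direction; its length is the number of directions


# Hypothetical stroke dictionary of standard stroke information.
STROKE_DICTIONARY: Dict[str, StrokeInfo] = {
    "horizontal": StrokeInfo((0, 1), (2, 1), ("right",)),
    "vertical": StrokeInfo((1, 0), (1, 2), ("down",)),
    "corner": StrokeInfo((0, 0), (2, 2), ("right", "down")),
}


def evaluate(observed: StrokeInfo, standard: StrokeInfo) -> int:
    """Assumed evaluation value: one point per matching feature, from 0 to 4."""
    return (
        int(observed.start == standard.start)
        + int(observed.end == standard.end)
        + int(len(observed.directions) == len(standard.directions))
        + int(observed.directions == standard.directions)
    )


def score_candidates(observed: StrokeInfo) -> List[Tuple[str, int]]:
    """Attach an evaluation value to every stroke in the stroke dictionary."""
    return [(name, evaluate(observed, std)) for name, std in STROKE_DICTIONARY.items()]


def drop_poor_matches(scored: List[Tuple[str, int]], poor_match: int = 2) -> List[Tuple[str, int]]:
    """Ignore candidates whose evaluation value is at or below the assumed threshold."""
    return [(name, value) for name, value in scored if value > poor_match]


observed = StrokeInfo((0, 1), (2, 1), ("right",))
print(drop_poor_matches(score_candidates(observed)))
```

Only the candidates that survive the threshold would be passed on to the character candidate detecting unit.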
Here, the character input apparatus further comprises a coordinate string detecting unit operable to detect coordinate strings, out of the coordinate strings outputted by the coordinate output unit, that are inputted at least a predetermined time after an immediately preceding coordinate string, wherein the character candidate detecting unit detects character candidates by setting the stroke candidates corresponding to the coordinate strings detected by the coordinate string detecting unit as first strokes of characters. With this construction, a stroke of one handwritten character is prevented from being mistaken for a stroke that constitutes a different character.
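The timing check can be sketched as follows. The representation of each stroke as a (pen-down, pen-up) timestamp pair, the parameter `min_gap`, and the function name are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple


def first_stroke_indices(
    stroke_times: Sequence[Tuple[float, float]], min_gap: float
) -> List[int]:
    """Return indices of strokes that begin at least min_gap seconds after
    the preceding stroke ended.

    stroke_times holds an assumed (pen_down, pen_up) timestamp pair for each
    stroke in input order. Stroke 0 has no predecessor and is not reported.
    """
    indices = []
    for i in range(1, len(stroke_times)):
        gap = stroke_times[i][0] - stroke_times[i - 1][1]
        if gap >= min_gap:
            indices.append(i)
    return indices


times = [(0.0, 0.2), (0.3, 0.5), (1.2, 1.4), (1.5, 1.6)]
print(first_stroke_indices(times, min_gap=0.5))
```

In this sketch, the reported strokes would be treated as first strokes of characters when character candidates are detected.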
Here, the correct character detecting unit in the character input apparatus comprises a word dictionary in which character orders for a plurality of words are registered, each character order corresponding to a different word; and a word detecting unit operable to detect, when (a) a character candidate detected by the character candidate detecting unit or (b) a combination of character candidates is registered in the word dictionary, the corresponding word as inputted handwritten characters. With this construction, the apparatus is capable of recognizing inputted handwritten characters as correct words in the word dictionary.
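As a rough sketch of the word detecting unit, character candidates can be combined in input order and each combination looked up in the word dictionary. The dictionary contents and the function name `detect_words` are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Set

# Hypothetical word dictionary; the text does not list its contents.
WORD_DICTIONARY: Set[str] = {"cat", "car", "oat"}


def detect_words(candidates_per_position: Sequence[Sequence[str]]) -> List[str]:
    """Combine character candidates without changing their order and keep
    only the combinations registered in the word dictionary."""
    words = []
    for combination in product(*candidates_per_position):
        word = "".join(combination)
        if word in WORD_DICTIONARY:
            words.append(word)
    return words


# Each inner list holds the character candidates for one handwritten character.
print(detect_words([["c", "o"], ["a"], ["t", "r"]]))
```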
Here, the correct character detecting unit in the character input apparatus comprises: a probability dictionary for showing a numeric value of the probability of each pair of adjacent characters appearing in a character string; a high probability character detecting unit operable to generate character strings by combining character candidates detected by the character candidate detecting unit without changing an order of the character candidates, calculate a value for each character string by summing up the numeric values of the probability for pairs of adjacent characters that constitute a character string and dividing the sum by the number of characters that constitute the character string, and detect a character string with the highest value as inputted handwritten characters. With this construction, handwritten characters are recognized based on the probability of characters appearing before/after other characters. Therefore, the apparatus is also capable of recognizing characters that are not registered in the word dictionary.
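The value described above, namely the sum of the probability values for adjacent character pairs divided by the number of characters, can be sketched as follows. The numeric values in the probability dictionary are invented for illustration, and pairs missing from the dictionary are assumed to contribute zero.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical probability dictionary: a numeric value for each pair of
# adjacent characters. The values here are invented for illustration.
PROBABILITY_DICTIONARY: Dict[Tuple[str, str], float] = {
    ("t", "h"): 8.0,
    ("h", "e"): 9.0,
    ("l", "h"): 1.0,
    ("t", "b"): 1.0,
    ("b", "e"): 6.0,
}


def string_value(text: str) -> float:
    """Sum the values of all adjacent pairs and divide by the number of characters."""
    if not text:
        return 0.0
    total = sum(PROBABILITY_DICTIONARY.get(pair, 0.0) for pair in zip(text, text[1:]))
    return total / len(text)


def most_probable(strings: Iterable[str]) -> str:
    """Return the character string with the highest value."""
    return max(strings, key=string_value)


print(most_probable(["lhe", "the", "tbe"]))
```

With these invented values, "the" scores (8 + 9) / 3, which is higher than the scores of "lhe" and "tbe", so it would be selected even if it were absent from a word dictionary.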
Here, the mode of the character input apparatus can be set to learning mode for learning handwritten characters, wherein when the character input apparatus is in learning mode, the stroke matching unit adds an evaluation value to each stroke candidate according to a degree of matching between the obtained stroke information and the standard stroke information of the stroke candidate registered in the stroke dictionary, and wherein the character input apparatus further includes: a stroke dictionary additional register unit operable to additionally register, when an evaluation value for a stroke candidate is smaller than a predetermined evaluation value which shows a poor match, stroke information for the stroke candidate as standard stroke information in the stroke dictionary; and a character dictionary additional register unit operable to additionally register a stroke order that includes the above stroke candidate for a character to be learned in the character dictionary. With this construction, the apparatus is capable of accurately recognizing a user's unique handwritten characters.
Here, the stroke dictionary stores standard stroke information together with a flag that indicates whether each stroke has the possibility of being a first stroke of a character, and the stroke matching unit compares stroke information obtained from a first coordinate string outputted by the coordinate output unit only with stroke information having a flag that indicates a first stroke in the stroke dictionary. With this construction, the matching of pairs of stroke information can be performed with high speed and accuracy.
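The effect of the first-stroke flag can be sketched as a filter over the stroke dictionary. The entry layout, the flag values, and the function name are assumptions for illustration.

```python
from typing import Dict, List, NamedTuple


class StrokeEntry(NamedTuple):
    shape: str          # stands in for the standard stroke information
    can_be_first: bool  # flag: may this stroke be the first stroke of a character?


# Hypothetical stroke dictionary with first-stroke flags.
STROKE_DICTIONARY: Dict[str, StrokeEntry] = {
    "horizontal": StrokeEntry("line to the right", True),
    "vertical": StrokeEntry("line downward", True),
    "dot": StrokeEntry("short tap", False),
}


def entries_to_compare(is_first_coordinate_string: bool) -> List[str]:
    """Return the names of the dictionary entries that need to be compared.

    For the first coordinate string, only entries flagged as possible first
    strokes are compared; otherwise the whole dictionary is used.
    """
    if is_first_coordinate_string:
        return [name for name, entry in STROKE_DICTIONARY.items() if entry.can_be_first]
    return list(STROKE_DICTIONARY)


print(entries_to_compare(True))
print(entries_to_compare(False))
```

Restricting the comparison in this way reduces the number of dictionary entries examined for a first stroke.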
Here, the character input apparatus further comprises a coordinate string detecting unit operable to detect coordinate strings, out of the coordinate strings outputted by the coordinate output unit, that are inputted at least a predetermined time after an immediately preceding coordinate string, wherein the stroke dictionary stores standard stroke information together with a flag that indicates whether each stroke has the possibility of being a first stroke of a character, and the stroke matching unit compares stroke information obtained from a first coordinate string outputted by the coordinate output unit only with stroke information having a flag that indicates a first stroke in the stroke dictionary. With this construction, the matching of pairs of stroke information can be performed with high speed and accuracy.
Here, the stroke matching unit comprises an overlap detecting unit operable to detect from a coordinate string outputted by the coordinate output unit whether a stroke has a line segment that partially or completely overlaps a stroke preceding the stroke, and if the overlap detecting unit detects overlapping strokes, the character detecting unit treats the latter of the overlapping strokes as belonging to a different character from the former stroke. With this construction, the apparatus is capable of choosing stroke candidates for each of the handwritten characters drawn one after another in a single character input frame without any mistakes, which prevents the apparatus from misrecognizing characters and realizes high-speed character recognition.
Here, the character input apparatus further comprises an off-stroke information detecting unit operable to detect off-stroke information from the end point of one coordinate string and the start point of a following coordinate string outputted by the coordinate output unit, wherein the character dictionary further stores, when a character is composed of a plurality of strokes, off-stroke information that shows the relationship between the end point of one stroke and the start point of a following stroke, and wherein when the character detecting unit detects characters from stroke candidates outputted by the stroke matching unit, if off-stroke information detected by the off-stroke information detecting unit differs from the corresponding off-stroke information of a character registered in the character dictionary by at least a predetermined amount, the character detecting unit does not detect the character. With this construction, the apparatus is capable of distinguishing characters with similar stroke information and accurately recognizing characters using off-stroke information between strokes.
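A minimal sketch of the off-stroke check follows. It assumes that off-stroke information is represented as the displacement from the end point of one stroke to the start point of the next, and that the difference is measured as a Euclidean distance; the text does not fix either choice, and all names and values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def off_stroke(previous_stroke: Sequence[Point], next_stroke: Sequence[Point]) -> Point:
    """Displacement from the end point of one stroke to the start point of the next."""
    end_x, end_y = previous_stroke[-1]
    start_x, start_y = next_stroke[0]
    return (start_x - end_x, start_y - end_y)


def matches_registered(observed: Point, registered: Point, max_difference: float) -> bool:
    """Return False when the observed off-stroke differs from the registered
    off-stroke by at least max_difference, so the character is not detected."""
    difference = math.hypot(observed[0] - registered[0], observed[1] - registered[1])
    return difference < max_difference


# A horizontal bar followed by a vertical bar that starts at the bar's midpoint.
horizontal_bar = [(0.0, 0.0), (10.0, 0.0)]
vertical_bar = [(5.0, 0.0), (5.0, 10.0)]
observed = off_stroke(horizontal_bar, vertical_bar)

print(observed)
print(matches_registered(observed, registered=(-5.0, 0.0), max_difference=3.0))
print(matches_registered(observed, registered=(-10.0, -5.0), max_difference=3.0))
```

Two characters with the same stroke shapes but different pen movements between strokes would yield different off-stroke values, which is what allows them to be told apart in this sketch.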
Here, the character detecting unit in the character input apparatus comprises: a character candidate detecting unit operable to detect a plurality of character candidates; and a shortest character string detecting unit operable to generate character strings by combining character candidates detected by the character candidate detecting unit without changing an order of the character candidates and detect a character string which has the smallest number of characters as inputted handwritten characters. With this construction, the apparatus is capable of correctly recognizing handwritten characters without being equipped with a special word dictionary and the like.
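The selection of the shortest character string can be sketched as follows. The stroke identifiers, the dictionary contents, and the recursive enumeration of candidate strings are assumptions for illustration; the text does not prescribe how the character strings are generated.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical character dictionary keyed by stroke order.
CHARACTER_DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("horizontal", "vertical"): "T",
    ("horizontal",): "-",
    ("vertical",): "I",
}


def candidate_strings(strokes: Sequence[str]) -> List[List[str]]:
    """Enumerate every way of splitting the strokes, in input order, into
    characters registered in the character dictionary."""
    if not strokes:
        return [[]]
    results = []
    for n in range(1, len(strokes) + 1):
        character = CHARACTER_DICTIONARY.get(tuple(strokes[:n]))
        if character is not None:
            for rest in candidate_strings(strokes[n:]):
                results.append([character] + rest)
    return results


def shortest_string(strokes: Sequence[str]) -> Optional[str]:
    """Return the candidate string with the smallest number of characters,
    or None when the strokes cannot be split into registered characters."""
    strings = candidate_strings(strokes)
    if not strings:
        return None
    return "".join(min(strings, key=len))


print(shortest_string(["horizontal", "vertical", "vertical"]))
```

Here the strokes can be read either as "-II" or as "TI"; preferring the string with fewer characters selects "TI" without consulting any word dictionary.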
The object of the present invention can also be achieved by a character input method comprising: a coordinate output step for outputting, when a user inputs handwritten characters, a coordinate string, that is sets of coordinates of points, for each handwritten stroke composing the handwritten characters; a stroke matching step for obtaining stroke information for each handwritten stroke from the outputted coordinate string of the handwritten stroke, comparing the stroke information with each set of standard stroke information registered in a stroke dictionary, and outputting stroke candidates; and a character detecting step for searching the stroke orders of characters registered in a character dictionary using an order of a group of stroke candidates obtained in inputted order in the stroke matching step as a key and detecting a character whose stroke order matches the order of the stroke candidates. With this method, correct characters can be detected for handwritten characters drawn one after another in a single character input frame.
The object of the present invention can further be achieved by a computer-readable storage medium which stores a program for making a computer perform the following steps, the storage medium storing (a) a stroke dictionary in which sets of standard stroke information for a plurality of strokes are registered, each set of standard stroke information corresponding to a different stroke, and (b) a character dictionary in which stroke orders for a plurality of characters are registered, each stroke order corresponding to a different character, the program comprising: a coordinate output step for outputting, when a user inputs handwritten characters, a coordinate string, that is sets of coordinates of points, for each handwritten stroke composing the handwritten characters; a stroke matching step for obtaining stroke information for each handwritten stroke from the outputted coordinate string of the handwritten stroke, comparing the stroke information with each set of standard stroke information registered in the stroke dictionary, and outputting stroke candidates; and a character detecting step for searching the character dictionary using an order of a group of stroke candidates obtained in inputted order in the stroke matching step as a key, and detecting a character whose stroke order matches the order of the stroke candidates. By applying the storage medium that stores such a program to a character input apparatus that lacks functions to recognize handwritten characters drawn one after another in a single input frame, the character input apparatus will be able to detect the correct characters for the handwritten characters.