1. Field of the Invention
The present invention relates to a user interface for entry of ideographic characters into a computer.
2. The Prior Art
Computer entry of ideographic characters presents a difficult user interface problem. For example, in the basic Chinese "alphabet", there are 5,401 characters, each character corresponding to a different meaning or word. A keyboard having 5,401 keys is impossible to either build or use as a practical matter.
The prior art contains two different ways of approaching this problem. The first employs a keyboard where more than one key is required to enter a character. The second employs a tablet and a software handwriting recognition system that recognizes entire characters.
There are various keyboard entry methods for ideographic characters. For example, there is a keyboard, called the Ziran Input System, produced by Multi-Corp of Calgary, Canada, where each key corresponds to a particular stroke type (horizontal, vertical, L-shaped, etc.). In order to enter a character, a user must decide which stroke type is closest to the one he or she desires, then press the corresponding key. There are other keyboard entry systems, where each key corresponds to a particular sound or subset of the character (known as a radical). See, e.g., U.S. Pat. No. 5,131,766 to Wong; U.S. Pat. No. 4,669,901 to Feng; or U.S. Pat. No. 4,684,926 to Yong-Min.
All of these keyboard entry systems are non-intuitive. Each requires an extensive practice period for proficiency. For example, for the Ziran Input System, a novice user typically uses a finger to write out the entire character on the table before pressing any keys. The difficulty arises from the fact that it is non-obvious which stroke type is actually closest to the desired character. This determination requires some thought from the novice user.
For Asian ideographic character entry, another source of difficulty is the number of strokes in a character. Some characters comprise up to 30 strokes.
There is a mitigating factor that improves the usability of the Ziran Input System. The system provides multiple hypotheses after every keypress. The hypotheses are a list of up to about 10 characters that are consistent with the keys pressed so far. Typically, the user only needs to enter the first four to six strokes of a character before the system narrows the list of candidates down to one or two possible characters. A user can select one of the candidates from the list at any time. This incremental approach speeds up the entry rate by a factor of about 2 to 4.
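The incremental narrowing described above can be illustrated with a minimal sketch. It is not the Ziran Input System's actual implementation: the single-letter stroke codes, the six-character table, and the function name `candidates` are all assumptions made for illustration.

```python
# Hypothetical stroke-type codes (not taken from any real product):
#   "h" = horizontal stroke, "v" = vertical stroke
# Toy dictionary mapping a character to its stroke-type sequence,
# written in the conventional stroke order. Illustrative only.
STROKES = {
    "一": "h",
    "二": "hh",
    "三": "hhh",
    "十": "hv",
    "干": "hhv",
    "王": "hhvh",
}


def candidates(prefix, table=STROKES, limit=10):
    """Return up to `limit` characters whose stroke sequence starts with `prefix`."""
    matches = [ch for ch, seq in table.items() if seq.startswith(prefix)]
    return matches[:limit]


if __name__ == "__main__":
    # Each additional key press extends the prefix and shrinks the list.
    for typed in ("h", "hh", "hhv", "hhvh"):
        print(typed, candidates(typed))
```

In this toy table, one key press leaves six candidates, three key presses leave two, and the fourth leaves a single character. The user could pick the intended character from the list at any of these points instead of typing the remaining strokes.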
The other approach for entering ideographic characters is to use a pen to naturally write an entire character on a graphics tablet. For example, see U.S. Pat. No. 4,561,105 to Crane, et al.; U.S. Pat. No. 4,365,235 to Greanias, et al., or U.S. Pat. No. 5,462,711 to Kitamura. A software recognition system attempts to find the character that most closely matches the strokes entered on the tablet. Some ideographic character recognizers only recognize neatly printed characters. However, neatly printing the entire character can take a long time. To speed up the character entry process, people will naturally connect strokes to form cursive writing. However, cursive writing can be very idiosyncratic and/or very sloppy, which results in poor accuracy when the software tries to recognize what was written.
In addition, prior art examples of ideographic character recognizers require a tablet with either a display immediately underneath the tablet, or a tablet that senses the proximity of the pen hovering above the tablet. These technologies have been hitherto necessary to allow the effective combination of writing strokes and selecting commands and/or character hypotheses. However, both of these technologies are expensive, and add cost to any handwriting recognition product.
U.S. Pat. No. 4,829,583 to Monroe, et al., describes a system that uses an input tablet to accept strokes of an ideographic character. The user starts by writing the first and last stroke of the character. The system then identifies the character, based on the beginning and ending points of these strokes, quantized to a 9×9 grid. If the character cannot be uniquely identified based on these two strokes, a list of candidates will be displayed to the user for selection. In an alternative embodiment disclosed by Monroe et al., the user can enter specific additional strokes to disambiguate the character identification. These additional strokes can be the second and the penultimate stroke, the last stroke of a radical, or the stroke immediately after the radical.
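A rough sketch of this kind of endpoint lookup follows. It is an assumption-laden illustration, not the method disclosed by Monroe et al.: the tablet size, the key layout, the single table entry, and the names `cell`, `stroke_key`, and `lookup` are all hypothetical.

```python
GRID = 9  # endpoints are quantized to a 9x9 grid


def cell(x, y, width, height):
    """Map a tablet coordinate to a (row, col) grid cell."""
    col = min(int(x / width * GRID), GRID - 1)
    row = min(int(y / height * GRID), GRID - 1)
    return (row, col)


def stroke_key(stroke, width, height):
    """Reduce a stroke (a list of (x, y) points) to its start and end cells."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    return (cell(x0, y0, width, height), cell(x1, y1, width, height))


def lookup(first, last, table, width, height):
    """Exact-match lookup on the first and last strokes; no error tolerance."""
    key = (stroke_key(first, width, height), stroke_key(last, width, height))
    return table.get(key, [])


# Hypothetical one-entry table for a 90x90 tablet (each cell is 10 units wide).
TABLE = {(((4, 0), (4, 8)), ((0, 4), (8, 4))): ["十"]}

if __name__ == "__main__":
    horizontal = [(5, 45), (85, 45)]  # first stroke, left to right
    vertical = [(45, 5), (45, 85)]    # last stroke, top to bottom
    print(lookup(horizontal, vertical, TABLE, 90, 90))
    # Starting the first stroke one cell lower produces a different key.
    shifted = [(5, 55), (85, 45)]
    print(lookup(shifted, vertical, TABLE, 90, 90))
```

Because the key is built from exact cell indices, the second call finds nothing: a start point that lands in a neighbouring cell simply maps to a different table entry.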
There are serious limitations in the ease of use of the system disclosed in Monroe, et al. These limitations stem from the primitive recognition algorithm used. Since the recognition algorithm is a lookup table, with no error correction disclosed, the user must start and end each required stroke in the exact 9×9 grid element required by the system. If the user starts or ends the stroke in an adjacent square, the system will fail to recognize the character. The system disclosed in Monroe, et al. supplies a grid overlay on the input tablet to assist in the drawing of the strokes. However, users often cannot remember the exact starting or ending square for all strokes of all 5,401 characters. Furthermore, if the input tablet is very small, as in a touchpad, it is difficult to exactly hit the correct grid square while writing rapidly. Being forced to start and end strokes in the exact grid square is an error-prone, uncomfortable process. If starting or ending a stroke in the correct square is a 95% successful process, then after two strokes, which have four endpoints in total, the probability of recognition is (0.95)^4 ≈ 81%. If six strokes are required for recognition, then the probability of recognition with the Monroe et al. system is (0.95)^12 ≈ 54%. If the character is very complicated, and requires 18 strokes to disambiguate it from other characters, the probability of recognition is (0.95)^36 ≈ 16%.
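The arithmetic in the preceding paragraph can be checked with a short calculation. The sketch assumes, as the paragraph implicitly does, that every stroke contributes two endpoints and that each endpoint is placed correctly with the same probability, independently of the others. The function name `recognition_probability` is hypothetical.

```python
def recognition_probability(strokes, p_endpoint=0.95):
    """Probability that all 2 * strokes endpoints land in the correct grid cell,
    assuming each endpoint succeeds independently with probability p_endpoint."""
    return p_endpoint ** (2 * strokes)


if __name__ == "__main__":
    for n in (2, 6, 18):
        pct = round(100 * recognition_probability(n))
        print(f"{n} strokes: about {pct}%")
```

For 2, 6, and 18 strokes this gives roughly 81%, 54%, and 16%, matching the figures stated in the text.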
Clearly, the system disclosed in Monroe, et al. must limit itself to recognizing characters with only a very few strokes. It is well known to those skilled in the art that the first and last strokes are the most informative for Chinese characters. Monroe et al. take advantage of this fact, and thus the user must start by writing the first and last strokes of the character, possibly followed by key strokes near the end of the radical.
However, being forced to write the first, then the last, then possibly some arbitrary stroke in the middle of the character is very non-intuitive and error prone. If the user starts writing strokes in the well-known correct stroke order, it can typically take 3 to 18 strokes before the character is disambiguated, due to many characters sharing the same initial radical, which can have as many as 17 strokes. As discussed above, the Monroe, et al. system will have unacceptably high error rates if the user writes in the well-known correct stroke order. In addition, forcing the user to enter strokes in an order other than the order in which they would naturally be written makes for a clumsy and non-optimal user interface.
It is an object of the present invention to create an incremental entry method for ideographic characters that allows users to write strokes in the well-known correct order, while still maintaining a high accuracy rate.
It is another object of the present invention to provide an ideographic character input method that is as intuitive as printing a character on a tablet.
It is a further object of the present invention to provide an ideographic character input method which is faster and more accurate than standard ideographic character recognition.
Yet another object of the present invention is to provide a method for using low-cost tablets without proximity detection or a display in order to enter ideographic characters.
It is another object of the present invention to provide an incremental ideographic character input method which overcomes some of the shortcomings of the prior art.