The present invention relates generally to computer systems capable of recognizing handwriting. More particularly, the invention pertains to systems for converting text displayed on computer screens from one text domain to another, such as from Hiragana to Kanji.
Computerized personal organizers are becoming increasingly popular. They perform such functions as keeping calendars, address books, and to-do lists. While these functions can be provided by conventional computer systems, they are more conveniently provided by personal organizers, which are relatively inexpensive, small, and lightweight (i.e., portable). Personal organizers are available from such companies as Sharp and Casio of Japan.
A relatively new form of computer, the pen-based computer system, holds forth the promise of a marriage of the power of a general purpose computer with the functionality and small size of a personal organizer. A pen-based computer system is typically a small, hand-held computer where the primary method for inputting data includes a "pen" or stylus. A pen-based computer system is commonly housed in a generally rectangular enclosure, and has a dual-function display assembly providing a viewing screen along one of the planar sides of the enclosure. The dual-function display assembly serves as both an input device and an output device. When operating as an input device, the display assembly senses the position of the tip of a stylus on the viewing screen and provides this positional information to the computer's central processing unit (CPU). Some display assemblies can also sense the pressure of the stylus on the screen to provide further information to the CPU. When operating as an output device, the display assembly presents computer-generated images on the screen.
The dual-function displays of pen-based computer systems permit users to operate the computers as computerized notepads. For example, graphical images can be input into the pen-based computer by merely moving the stylus across the surface of the screen. As the CPU senses the position and movement of the stylus, it generates a corresponding image on the screen to create the illusion that the stylus is drawing the image directly upon the screen, i.e., that the stylus is "inking" an image on the screen. With suitable recognition software, the "ink" can be identified as text and numeric information. Some systems can recognize handwriting (ink) as characters, words, geometric shapes, etc. The recognized characters and words may also be identified as belonging to a particular language (e.g., Russian or English) or to a particular character set (e.g., Greek or Roman letters).
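By way of illustration only, the "inking" behavior described above can be sketched as follows. The sample format `(x, y, pen_down)` and the function names `collect_strokes` and `ink_segments` are assumptions made for this sketch; they are not taken from any particular pen-based system. The sketch groups successive stylus positions into strokes and derives the line segments a display would draw to echo each stroke.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Sample = Tuple[int, int, bool]  # hypothetical format: (x, y, pen_down)


def collect_strokes(samples: List[Sample]) -> List[List[Point]]:
    """Group stylus samples into strokes; a stroke ends when the pen lifts."""
    strokes: List[List[Point]] = []
    current: List[Point] = []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            # Pen lifted: close the stroke in progress.
            strokes.append(current)
            current = []
    if current:
        # Input ended while the pen was still down.
        strokes.append(current)
    return strokes


def ink_segments(stroke: List[Point]) -> List[Tuple[Point, Point]]:
    """Return the consecutive point pairs to draw as 'ink' for one stroke."""
    return list(zip(stroke, stroke[1:]))
```

The resulting strokes are the raw "ink" that recognition software would subsequently attempt to identify as characters, words, or shapes.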
Text written in Japanese can contain alphabetic characters or numerals, but it consists principally of Kanji (Chinese characters) and syllabic characters (Hiragana and Katakana). There are some 2,000 to 3,000 common Kanji, and an additional 3,000 to 4,000 less common Kanji. In general, Kanji are structurally very complex with each single character containing on average from five to ten strokes. Some of the more complicated Kanji can contain 20 or more strokes.
Hiragana and Katakana are phonetic characters commonly used in Japan. Kanji are not phonetically based. Hiragana is reserved for traditional Japanese words that do not derive from other languages. Katakana, on the other hand, is reserved for words that originally found their way into Japanese from foreign languages. In comparison to Kanji, Hiragana and Katakana are generally less complex. Further, there are far fewer of them--only 49 for each of Hiragana and Katakana. Thus, single keyboards containing all of the Hiragana or Katakana are available.
As recently as ten years ago there was no way to input Kanji into a computer. Instead, Katakana characters were assigned to the keys of the ASCII keyboard and Japanese text was input, and displayed, phonetically. Another approach employed a special device consisting of a tablet on which 3,000 Kanji characters were printed. Further research led to systems employing Kana input (phonetic) with subsequent Kana-Kanji conversion.
Two dominant modes of this process are now employed. In the first, an ASCII keyboard is used to input Roman letters which are first converted to phonetic Japanese text (Hiragana or Katakana) and then converted to Kanji using a Kana-Kanji conversion. This approach has the advantage of requiring keyboards having only the 26 letters of the Roman alphabet. In the second dominant mode, Japanese text is input phonetically as Katakana or Hiragana through a keyboard or handwriting and then changed into Kanji using a Kana-Kanji conversion. This approach has the advantage of allowing persons not familiar with Romanization rules to input Japanese text.
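The two modes described above can be illustrated with a minimal sketch. The tables below are hypothetical and deliberately tiny, and the function names `romaji_to_hiragana` and `kana_to_kanji` are assumptions made for this example; an actual system would rely on a complete Romanization table and a large Kana-Kanji dictionary, typically with context-sensitive ranking of candidates.

```python
# Hypothetical, deliberately tiny tables for illustration only.
ROMAJI_TO_HIRAGANA = {
    "ka": "か", "n": "ん", "ji": "じ", "ni": "に", "ho": "ほ", "go": "ご",
}

# One Kana reading may correspond to several Kanji spellings (homophones).
KANA_TO_KANJI = {
    "かんじ": ["漢字", "感じ", "幹事"],
    "にほんご": ["日本語"],
}


def romaji_to_hiragana(text: str) -> str:
    """First mode, step one: Roman letters to Hiragana by longest match."""
    max_len = max(len(key) for key in ROMAJI_TO_HIRAGANA)
    out = []
    i = 0
    while i < len(text):
        for size in range(max_len, 0, -1):
            chunk = text[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no Kana for input at position {i}: {text[i:]!r}")
    return "".join(out)


def kana_to_kanji(kana: str) -> list:
    """Kana-Kanji conversion: return candidates, or the Kana itself if none."""
    return KANA_TO_KANJI.get(kana, [kana])


# First mode: Roman letters -> Kana -> Kanji candidates.
first_mode = kana_to_kanji(romaji_to_hiragana("kanji"))

# Second mode: Kana entered directly (keyboard or handwriting) -> Kanji.
second_mode = kana_to_kanji("にほんご")
```

Because a single Kana reading may map to several Kanji spellings, the conversion step in this sketch returns a list of candidates rather than a single result, leaving the final choice to the user.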
Most input methods that use Japanese handwriting recognition allow the user to write the desired Kanji directly. Nevertheless, Hiragana or Katakana are often used in the following cases: (1) the user is not sure, or has forgotten, how to write the Kanji; (2) the handwritten Kanji is not recognized by the system (for example, because the order of the strokes used to write the Kanji is incorrect, or some other mistake is made); and (3) the time required to write complicated Kanji characters is too great. However, memos written entirely in Hiragana are difficult to read, thus generally requiring that the user rewrite at least some of the text using Kanji when more time is available. What is required is a method for quickly and easily converting text written in Hiragana or Katakana into Kanji.