1. Field of the Invention
This invention relates to computer input systems in general, and more specifically to an apparatus and handwriting alphabet for use in a handwritten input and recognition system used in personal computing systems such as “palm-top” computers.
2. Description of Related Art
As computers have become increasingly popular for various applications, portable computers have been developed for a wide variety of uses. While many such portable computers use a traditional keyboard for input, for smaller computers, particularly including hand-held computers, the use of “pens” as an interface has been introduced as a way of making a small computer easier to use. With a pen interface, a user can place a pen or stylus directly on a touch-sensitive screen of the computer to control the software running on the computer and to input information. For many people, controlling a computer and entering text with a pen is more natural than using a keyboard.
An example of a prior art pen-based hand-held computer is shown in FIG. 1A. The illustrated hand-held computer 1 is typically about 4 inches by 6.5 inches, with the majority of one surface comprising a touch-sensitive display screen 2. The display screen 2 is typically a liquid crystal display (LCD) having a resolution of 256×320 pixels (although larger or smaller pixel arrays could be used). Various technologies can be used to sense the location of a pen or stylus 3 touched against the surface of the LCD screen 2 to indicate to the computer's operating system the X-Y coordinates of the touch. Various hardware buttons 4 may be provided to control different functions, and/or to turn power on or off to the unit. In addition, a variety of software buttons or icons 5 may be provided, in known fashion, to indicate such functions as, for example, word processing or a delete function. Computer-generated information is typically shown on the display screen 2 as ASCII characters 6. One such hand-held computer is available as the “Zoomer” from Casio Corporation.
A common characteristic of such pen-based computers is the use of electronic “ink”. “Ink” comprises a series or trail of pixels changed (e.g., darkened or lightened) as a pen 3 is moved across the display screen 2 by a user, thus mimicking the application of real ink to paper.
Some prior art system designers suggest the use of unrecognized handwritten ink input. Although this approach works well for recording notes for personal use, it is not always suitable for data entry into a file which needs to be searched at a later date. In addition, ink requires considerably more storage space than ASCII characters. Accordingly, practical pen-based computers need a method of inputting text which usually includes some form of recognition system.
Various methods of recognizing handwriting are well known. One prior art approach is to provide a series of boxes in the input area (which is usually the display area) for entering character information. These systems use boxes for entry of text in an attempt to improve accuracy of recognition and to separate one character from the next. In these systems, an array of boxes is displayed and the user writes one character in each box. Although the boxes aid in improving the accuracy of recognition, most people find it awkward to write in boxes. Additionally, due to the number of boxes necessary to capture even a short sentence, these systems are not very practical in a palm-top computer having a reduced data input area.
Another character recognition system is described in U.S. Pat. No. 5,125,039, entitled “Object Recognition System”, by the inventor of the present invention. In such a system, the user writes text without boxes in a free form manner. After a user inputs several ink characters, the computer applies special algorithms to separate the ink strokes into characters and then recognize each ink character as an ASCII character. It then replaces the ink representation of the characters drawn by the user with the standardized ASCII representation of those characters. Although these systems require less input area than boxed input systems, they are still difficult to implement on a palmtop computer having a small display. In addition, the computer has the additional burden of figuring out where one character ends and the next begins. This leads to recognition errors.
One additional major difficulty presented by prior art handwriting recognition systems is the delay time between text input and text recognition. The prior art systems typically require between 2 and 5 seconds after the user writes the ink character on the input tablet to recognize and display the ASCII character on a display device. In typical use, the prior art systems require the user to write a few words and then wait several seconds for the computer to start the recognition process. Alternatively, some systems (e.g., the “Newton” from Apple Computer) perform recognition without the user stopping and waiting. But in these systems, the words are still recognized several seconds after they are written. In all cases, the user cannot immediately realize when a recognition mistake has occurred. This type of handwritten text recognition system makes error correction difficult because the user must constantly look at the display for words which the user input several seconds before in order to be sure that text was correctly entered and correctly interpreted. Moreover, once a user detects an error, error correction is difficult because the user has to first select the word or characters which need to be corrected.
In summary, three of the major problems with current handwriting recognition systems are the delay from writing to recognition, the limited writing area of palmtop computers, and the difficulty of accurately recognizing separate characters in non-boxed entry systems.
Therefore, an improved pen data entry solution is needed which can accurately and efficiently recognize text on a small display. It has become evident that one crucial characteristic of such an improved solution is the ability to instantaneously (i.e., with little or no perceptible delay) recognize and display input text, similar to the response of currently available personal computers using keyboard input devices. Palm-top computers having the ability to instantly recognize and display text offer the user the opportunity to quickly recognize and correct mistakes. Instant recognition also permits the use of smaller input areas because the input area can be reused for writing subsequent characters.
One of the major impediments facing “instant” handwritten text recognition systems is presented by the multiple stroke (multi-stroke) characteristic of many English text characters. That is, many characters comprise more than one pen stroke. In this context, a single pen stroke is defined as a continuous movement of a pen while maintaining contact with a writing tablet. For example, the letters “T”, “H,” and “E” typically comprise multiple pen strokes, while the letters “S” and “O” typically comprise a single pen stroke. Prior art recognition systems have had difficulty in achieving essentially “instantaneous” recognition due to the fact that characters may comprise more than one pen stroke.
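The definition of a pen stroke given above can be illustrated with a minimal sketch. The sample format `(x, y, pen_down)`, the function name `split_into_strokes`, and the example coordinates are assumptions made for illustration only; they are not taken from any system described here.

```python
from typing import List, Tuple

# Hypothetical sample format (an assumption): x, y, and whether the pen touches the tablet.
PenSample = Tuple[int, int, bool]
Stroke = List[Tuple[int, int]]


def split_into_strokes(samples: List[PenSample]) -> List[Stroke]:
    """Group consecutive pen-down samples into strokes.

    A stroke begins when the pen touches the tablet and ends when it lifts.
    """
    strokes: List[Stroke] = []
    current: Stroke = []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            # Pen lifted: the continuous movement is over, so close the stroke.
            strokes.append(current)
            current = []
    if current:
        # Input ended while the pen was still down; keep the partial stroke.
        strokes.append(current)
    return strokes


# A "T" written as a horizontal bar, a pen lift, then a vertical stem.
t_samples = [(0, 0, True), (4, 0, True), (4, 0, False),
             (2, 0, True), (2, 5, True), (2, 5, False)]
# An "S"-like shape written without lifting the pen.
s_samples = [(4, 0, True), (0, 1, True), (4, 3, True), (0, 4, True), (0, 4, False)]

print(len(split_into_strokes(t_samples)))  # 2
print(len(split_into_strokes(s_samples)))  # 1
```

Under these assumptions the "T" yields two strokes and the "S" yields one, which is the distinction between multi-stroke and single-stroke characters drawn in the text.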
For example, due to the possibility that any given input character might be a multi-stroke character, it has been difficult to determine when a user has completed writing a one stroke (unistroke) character, or when the user is continuing to write a multi-stroke character. For example, a vertical line might represent the letter “I” or it could represent the first stroke in the multi-stroke letters “T”, “H” or “E”. In the past, recognition systems have solved this ambiguity by waiting until the user stopped writing, or by having a fixed delay period after which characters were recognized, or by detecting the start of a next stroke sufficiently far from prior strokes as to indicate a new character. Each of these approaches is deficient due to the recognition time delays introduced.
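The cost of the fixed delay period can be made concrete with a small sketch. It assumes each stroke is reduced to a hypothetical pair of pen-down and pen-up times in milliseconds, and that a character is declared complete only when no new stroke starts within `timeout_ms` of the last pen lift. The names and the timing values are illustrative and do not describe any particular prior art product.

```python
from typing import List, Tuple

# Hypothetical stroke timing (an assumption): (pen_down_ms, pen_up_ms).
StrokeTiming = Tuple[int, int]


def timeout_recognition_times(strokes: List[StrokeTiming],
                              timeout_ms: int) -> List[Tuple[int, int]]:
    """Return (stroke_count, recognition_time_ms) per character.

    Strokes are grouped into one character until a gap of at least
    `timeout_ms` follows a pen lift; recognition can only fire at
    last_pen_up + timeout_ms, so every character is delayed by the timeout.
    """
    results: List[Tuple[int, int]] = []
    count = 0
    last_up = None
    for down, up in strokes:
        if last_up is not None and down - last_up >= timeout_ms:
            # The gap exceeded the timeout, so the previous character was finished.
            results.append((count, last_up + timeout_ms))
            count = 0
        count += 1
        last_up = up
    if count:
        # The final character is also recognized only after the timeout expires.
        results.append((count, last_up + timeout_ms))
    return results


# Two quick strokes (e.g. a "T"), a pause, then a single stroke (e.g. an "I").
example = [(0, 300), (500, 800), (2500, 2900)]
print(timeout_recognition_times(example, timeout_ms=1000))
# [(2, 1800), (1, 3900)]
```

In this sketch each character becomes available a full timeout after its last pen lift, even the single-stroke one, which is the delay the passage identifies as the deficiency of this approach.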
Recently, two approaches have been attempted for immediate recognition of handwritten text. Neither of these two approaches has proven wholly satisfactory. The first approach is offered by Sharp Electronics of Japan in their PVF1 handheld computer system, which provides “immediate” recognition of both English and Japanese characters. The Sharp system uses a modified boxed input method. It displays several adjacent boxes on a screen for text input. Every text character is written into one of the boxes. Recognition timing delays are reduced because the system knows to begin recognizing a character previously written into a first entry box as soon as the user begins writing into another entry box. The recognized character is subsequently displayed upon the screen (not in the box) as soon as the recognition process completes. Expert users can quickly enter multiple characters by alternating between two adjacent boxes. This is different from previous boxed input systems where the user wrote characters in a left to right fashion in a multitude of boxes. The Sharp approach achieves faster recognition response on a reduced display area than previous systems. However, it suffers from several disadvantages.
Although the Sharp system uses fewer boxes (as few as two will suffice), the boxes still occupy a significant amount of screen area. In addition, as with all boxed input systems, the user has to be careful to always write within the box. If one stroke of a multi-stroke character falls outside the box, the character will be recognized incorrectly. This requires the user to carefully look at the screen at all times while writing. Another, and more serious, drawback is that the recognition of characters is not completely “instant”. In this system, recognition of one character does not commence until the user starts writing a subsequent character. Although this system represents an improvement over the prior art systems where recognition delays were longer, recognition is still delayed. So, when the user writes just one character, or when the user writes the last character in a sequence, that character is not recognized until after a pre-determined time-out delay. This delay after writing a single character makes it frustrating and therefore impractical to make quick editing changes such as writing a “backspace” character, or to insert a single character.
A second approach to immediate recognition of handwritten text was recently described by Xerox Corporation of Palo Alto, Calif. Xerox teaches a method whereby every character that a user wishes to write is represented by a single stroke glyph. Because every character is represented using a single stroke, recognition commences as soon as the user lifts the pen from the writing tablet. The system provides improved recognition speeds over the Sharp approach and avoids the problems associated with the writing boxes used in the Sharp system.
However, the Xerox method suffers from two major disadvantages. First, the Xerox approach is difficult to learn because it requires the user to memorize an entirely new alphabet for entering text. The alphabet is specially designed to maximize the recognition abilities of the computer, not to maximize ease of learning. The Xerox disclosure recognizes this difficulty yet submits that the inefficiency of learning the alphabet is compensated by the improved recognition speeds once the user becomes an expert.
Second, the Xerox approach is difficult to implement with a full set of characters. The user must learn a single stroke representation for every possible character. Although this task may be feasible when representing only the 26 letters of the English alphabet in one case (upper or lower), there are many more characters requiring representation and recognition. For example, both upper and lower case English characters must be recognized. European languages have multiple accented characters as well as many other unique characters. In addition, a myriad of punctuation marks and mathematical symbols require representation. Assigning each of these characters to a unique single stroke glyph requires inventing many strange and novel glyphs that are non-intuitive and therefore difficult for the average user to learn. Compounding this difficulty is the problem of similar-looking accented characters (for example, A, Á, À, Ä, and Â). Assigning unique glyphs for these characters would make the extended alphabet especially non-intuitive and difficult to learn.
The limitations of a unistroke alphabet as taught by Xerox are magnified when trying to create an immediate recognition system for Asian languages. For example, it is nearly impossible to define single stroke alphabets for Asian symbols, such as Japanese katakana or hiragana, Chinese kanji, or Korean hangul, due to the large number of symbols that need to be represented.
Accordingly, there is a need for an improved handwritten text recognition system capable of instantaneously and accurately recognizing handwritten text entries. There is also a need for an improved handwritten text entry and recognition system which is user-friendly, easy to learn, and easy to implement.
The present invention provides such a handwritten text recognition system.
The present invention uses a pen or stylus as an input device to a pen-based computer handwriting recognition system capable of interpreting a special pre-defined set of character strokes or glyphs. The invention teaches a system which provides true immediate character recognition, yet allows characters to be written with any number of strokes, thus making it natural to use and easy to learn. The present invention defines three different categories of pen strokes: (1) pre-character modifier strokes, (2) character or symbol strokes, and (3) post-character modifier strokes.
Pre-character modifier strokes precede character strokes and inform the present recognition system that subsequently entered character strokes are to be modified by the pre-character modifier stroke in a defined manner. They function primarily to control the interpretation of a subsequently entered character stroke. For example, a pre-modifier control stroke may indicate that the next character stroke is to be interpreted as a punctuation character. Pre-character modifier strokes may or may not cause an immediate visible display change. In the preferred embodiment of the invention, pre-character modifier strokes do result in a display change (by either changing a status indicator or by displaying a temporary character), so the user knows the pre-character modifier stroke was successfully entered.
Character strokes always cause a letter or other symbol to be displayed the moment the stroke is input on the writing tablet, interpreted in accordance with any pre-character modifier strokes previously entered. Any status indicators or temporary characters displayed due to earlier pre-character modifier strokes are removed upon recognizing a character stroke.
Post-character modifier strokes cause the recognition system to modify, in a defined manner, a character or symbol which was previously entered and displayed. For example, a post-character modifier may be used to add a diacritical mark to a character.
An important advantage of the present invention is its ability to recognize characters consisting of multiple pen strokes yet still provide instantaneous recognition and display of the recognized character. By combining mutually exclusive pre-character modifier strokes, character strokes, and post-character modifier strokes, a myriad of alpha, numeric, punctuation, and accented characters may be entered with natural and easy to learn styles.
The use of the three different types of strokes guarantees that the system always knows whether the user is starting a new character, has completed a character, or is modifying a previously recognized character. This enables the system to provide the immediate response that is desired.
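The interaction of the three stroke categories can be sketched as a small state machine. This is only an illustration under stated assumptions: strokes are assumed to arrive already classified by name, and the stroke names (`PUNCT_SHIFT`, `ACUTE`, `stroke_e`, `stroke_i`), the mapping tables, and the `StrokeInterpreter` class are hypothetical placeholders rather than the glyph set or implementation of the invention.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical tables (assumptions). The three categories are mutually exclusive.
PRE_MODIFIERS = {"PUNCT_SHIFT"}
POST_MODIFIERS: Dict[str, Dict[str, str]] = {"ACUTE": {"e": "é", "a": "á"}}
CHARACTER_STROKES: Dict[Optional[str], Dict[str, str]] = {
    None: {"stroke_e": "e", "stroke_i": "i"},   # no pre-modifier pending
    "PUNCT_SHIFT": {"stroke_i": "!"},            # same stroke, punctuation meaning
}


@dataclass
class StrokeInterpreter:
    text: List[str] = field(default_factory=list)
    pending_modifier: Optional[str] = None  # would drive a status indicator

    def feed(self, stroke: str) -> str:
        """Process one stroke at pen lift and return the text to display."""
        if stroke in PRE_MODIFIERS:
            # Pre-character modifier: changes how the next character stroke is read.
            self.pending_modifier = stroke
        elif stroke in POST_MODIFIERS:
            # Post-character modifier: rewrites the most recently displayed character.
            if self.text:
                previous = self.text[-1]
                self.text[-1] = POST_MODIFIERS[stroke].get(previous, previous)
        else:
            # Character stroke: displayed immediately, interpreted under any
            # pending pre-modifier, which is then cleared. Unknown strokes are ignored.
            char = CHARACTER_STROKES[self.pending_modifier].get(stroke)
            if char is not None:
                self.text.append(char)
            self.pending_modifier = None
        return "".join(self.text)


interp = StrokeInterpreter()
outputs = [interp.feed(s) for s in ["stroke_e", "ACUTE", "PUNCT_SHIFT", "stroke_i"]]
print(outputs)  # ['e', 'é', 'é', 'é!']
```

Because every stroke falls into exactly one category, the sketch never has to wait to see whether another stroke follows: each call to `feed` updates the display at once, and a later post-character modifier simply revises the character already shown.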
It will be shown that the present invention is flexible and can be used to enter not only English and other Roman character-based languages, but other written alphabets, such as Japanese hiragana and katakana.
The details of the preferred embodiment of the present invention are set forth in the accompanying drawings and the description below. Once the details of the invention are known, numerous additional innovations and changes will become obvious to one skilled in the art.