1. Technical Field
This invention relates generally to Chinese input technology. More particularly, the invention relates to a system and method for disambiguating phonetic entry.
2. Description of the Prior Art
For many years, keyboard size has been a major limiting factor in efforts to design and manufacture small portable computers: if standard typewriter-size keys are used, a portable computer must be at least as large as the keyboard. Although a variety of miniaturized keyboards have been used on portable computers, they have been found too small to be easily or quickly manipulated by a regular user.
Incorporating a full-size keyboard in a portable computer also hinders true portable use of the computer. Most portable computers cannot be operated without placing the computer on a substantially flat work surface to allow the user to type with both hands. The user cannot easily use a portable computer while standing or moving. In the latest generation of small portable computers, called Personal Digital Assistants (PDAs) or palm-sized computers, manufacturers have attempted to address this problem by incorporating handwriting recognition software in the device. Users may directly enter text by writing on a touch-sensitive panel or screen. This handwritten text is then converted by the recognition software into digital data. Unfortunately, in addition to the fact that printing or writing with a pen is in general slower than typing, the accuracy and speed of handwriting recognition software have to date been less than satisfactory. In the case of the Chinese language, with its large number of complex characters, the problem becomes especially difficult. To make matters worse, today's handheld computing devices that require text input are becoming smaller still. Recent advances in two-way paging, cellular telephones, and other portable wireless technologies have led to a demand for small and portable two-way messaging systems, and especially for systems that can both send and receive electronic mail (“e-mail”).
The Pinyin input method is one of the most commonly used Chinese character input methods. It is based on Pinyin, the official system of sounds forming syllables for the Chinese language, which was introduced in 1958 by the People's Republic of China. It is supplementary to the 5,000-year-old traditional Chinese writing system. Pinyin is used in many different ways. For example, it is used as a pronunciation tool for language learners, in index systems, and for inputting Chinese characters into a computer. The Pinyin system adopts the standard Latin alphabet and follows the traditional Chinese analysis of the syllable into initials, finals (ending sounds), and tones.
Mandarin Chinese has consonant sounds that are found in most languages. For example, b, p, m, f, d, t, n, l, g, k, and h are quite close to their English counterparts. Other initial sounds, such as the retroflex sounds zh, ch, sh and r, the palatal sounds j, q and x, and the dental sounds z, c and s, differ from English or Latin pronunciation. Table 1 lists all initial sounds according to the Pinyin system.
TABLE 1
Initial Sounds

Initial Sound   Pronunciation sample        Note

Group I: Same pronunciation as in English
M               Man
N               No
L               Letter
F               From
S               Sun
W               Woman
Y               Yes

Group II: Slightly Different from English Pronunciation
P               Pun                         use a strong puff of breath
K               Cola                        use a strong puff of breath
T               Tongue                      use a strong puff of breath
B               Bum                         no puff of breath
D               Dung                        no puff of breath
G               Good                        no puff of breath
H               Hot                         slightly more aspirated than in English

Group III: Different from English Pronunciation
ZH              Jeweler
CH              As in ZH but with a strong puff of breath
SH              Shoe
R               Run
C               Like “ts” in “it's high”, but with a strong puff of breath
J               Jeff
Q               Close to “ch” in “Cheese”
X               Close to “sh” in “sheep”
The finals connect with the initial sounds to create a Pinyin syllable, which corresponds to a Chinese character (zi: 字). A Chinese phrase (ci: 词) usually consists of two or more Chinese characters. Table 2 lists all the final sounds according to the Pinyin system, and Table 3 gives some examples illustrating the combination of initials and finals.
TABLE 2
Final (ending) Sounds

Final Sound     Pronunciation sample
a               As in father
an              Like the sounds of “Anne”
ang             Like the sound “an” with addition of “g”
ai              As in “high”
ao              As in “how”
ar              As in “bar”
o               Like “aw”
ou              Like the “ow” in “low”
ong             Like the “ung” in “jungle” with a slight “oo” sound
e               Sounds like “uh”
en              Like the “un” in “under”
eng             Like the “ung” in “lung”
ei              Like the “ei” in “eight”
er              Like the “er” in “herd”
i               Like the “i” in machine
in              As in “bin”
ing             Like “sing”
u               Like the “oo” in “loop”
un              As in “fun”
TABLE 3
Putting Initials and Final (ending) Together

Pinyin          Pronunciation sample
Ni              Like “knee”
Hao             Like “how” with a little more aspiration
Dong            Like “doong”
Qi              Like “Chee”
Gong            Like “Gung”
Tai             Like “Tie”
Ji              Like “Gee”
Quan            Like “Chwan”
Each Pinyin pronunciation has one of the five tones (four pitched tones and a “toneless” tone) of Mandarin Chinese. A tone is important to the meaning of the word. The reason for having these tones is probably that the Chinese language has very few possible syllables (approximately 400), while English has about 12,000. For this reason, there may be more homophonic words, i.e. words with the same sound expressing different meanings, in Chinese than in most other languages. Tones apparently help to multiply the relatively small number of syllables and thereby alleviate, but do not completely solve, the problem. English has no parallel concept of tones. In English, an incorrect inflection of a sentence can render the sentence difficult to understand. But in Chinese, an incorrect intonation of a single word can completely change its meaning. For example, the syllable “da” may represent several characters, such as 搭 in the first tone (da1) meaning “to hang over something”, 答 in the second tone (da2) meaning “to answer”, 打 in the third tone (da3) meaning “to hit”, and 大 in the fourth tone (da4) meaning “big”. The number after each syllable indicates the tone. The tones are also indicated by marks, such as dā, dá, dǎ, and dà. Table 4 shows a description of the five tones for the syllable “da”.
TABLE 4
Five Tones

Tone            Mark    Description
1st             dā      High and level
2nd             dá      Starts medium in tone, then rises to the top
3rd             dǎ      Starts low, dips to the bottom, then rises toward the top
4th             dà      Starts at the top, then falls sharp and strong to the bottom
Neutral         da      Flat, with no emphasis
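The relationship between the numbered notation (da1 to da4) and the tone marks of Table 4 can be illustrated with a short sketch. The function name `mark_tone` and the treatment of the digit 5 as the neutral tone are assumptions made for illustration only; the sketch handles lowercase syllables and uses the conventional mark-placement rule (on "a" or "e" if present, otherwise on the "o" of "ou", otherwise on the last vowel).

```python
# Illustrative sketch only: convert numbered Pinyin such as "da3" to "dǎ".
# The names below are hypothetical and not part of any described system.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def mark_tone(syllable):
    """Return the tone-marked form of a lowercase numbered syllable."""
    if syllable and syllable[-1].isdigit():
        base, tone = syllable[:-1], int(syllable[-1])
    else:
        base, tone = syllable, 5  # no digit: treat as neutral tone
    if tone not in (1, 2, 3, 4):
        return base  # the neutral tone carries no mark

    # Choose the vowel that receives the mark.
    if "a" in base:
        idx = base.index("a")
    elif "e" in base:
        idx = base.index("e")
    elif "ou" in base:
        idx = base.index("ou")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not vowels:
            return base  # nothing to mark
        idx = vowels[-1]

    marked = TONE_MARKS[base[idx]][tone - 1]
    return base[:idx] + marked + base[idx + 1:]


if __name__ == "__main__":
    print([mark_tone(s) for s in ("da1", "da2", "da3", "da4", "da")])
```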
To enter a Chinese character using the Pinyin system, the user selects the English letters corresponding to the character's Pinyin spelling. For example, on a standard QWERTY keyboard, a user who wants a Chinese character with the Pinyin spelling “ni” presses the “N” key and then the “I” key. After the “N” key and the “I” key are pressed, a list of Chinese characters associated with the Pinyin spelling “NI” is displayed. The user then selects the intended character from the list. This method is hereinafter referred to as the basic Pinyin input method.
In a reduced keyboard system, such as the one shown in FIG. 1, each key is associated with more than one letter of the Latin alphabet used to spell the Pinyin syllables shown in Tables 1 and 2. Thus a disambiguating method is needed for determining the correct Pinyin spellings that correspond to the input keystroke sequence.
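By way of illustration only, the basic Pinyin input method can be sketched as a lookup from a typed spelling to an ordered candidate list, followed by a selection. The lexicon `CANDIDATES`, its contents and ordering, and the function names are hypothetical; a practical system would use a far larger table.

```python
from typing import Dict, List

# Hypothetical toy lexicon: Pinyin spelling -> candidate characters.
# The ordering is an assumption made for this example.
CANDIDATES: Dict[str, List[str]] = {
    "ni": ["你", "尼", "泥"],
    "hao": ["好", "号", "毫"],
}


def basic_pinyin_lookup(typed: str) -> List[str]:
    """Return the candidate characters for an unambiguous Pinyin spelling."""
    return CANDIDATES.get(typed.lower(), [])


def select(typed: str, choice: int = 0) -> str:
    """Return the candidate the user picked from the displayed list."""
    candidates = basic_pinyin_lookup(typed)
    if not candidates:
        raise KeyError("no characters for spelling %r" % typed)
    return candidates[choice]


if __name__ == "__main__":
    # The user presses "N" then "I", sees the list, and picks an entry.
    print(basic_pinyin_lookup("NI"))
    print(select("NI", 0))
```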
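The ambiguity that arises on a reduced keyboard can be sketched as follows. FIG. 1 is not reproduced here, so the sketch assumes the conventional telephone keypad letter assignment, which is consistent with the keys named later in this description (N on key 6, I on key 4, Y on key 9). The syllable list and function names are hypothetical.

```python
# Assumed letter assignment (standard telephone keypad layout).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}


def key_sequence(spelling):
    """Return the key presses that would be used to type a spelling."""
    return "".join(LETTER_TO_KEY[ch] for ch in spelling.lower())


def build_index(syllables):
    """Group valid Pinyin spellings by the key sequence that produces them."""
    index = {}
    for syllable in syllables:
        index.setdefault(key_sequence(syllable), []).append(syllable)
    return index


# Hypothetical sample of valid syllables; a real table has roughly 400.
SAMPLE_SYLLABLES = ["ni", "mi", "hao", "dong", "gong", "qi", "ji", "tai"]
INDEX = build_index(SAMPLE_SYLLABLES)


def candidate_spellings(keys):
    """Return every known Pinyin spelling matching an ambiguous key sequence."""
    return INDEX.get(keys, [])


if __name__ == "__main__":
    # Keys 6 and 4 match both "ni" and "mi", so disambiguation is required.
    print(candidate_spellings("64"))
```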
A number of suggested approaches for determining the correct character sequence that corresponds to an ambiguous keystroke sequence are summarized in the article “Probabilistic Character Disambiguation for Reduced Keyboards Using Small Text Samples” by John L. Arnott and Muhammad Y. Javad (hereinafter “Arnott”), which was published in the Journal of the International Society for Augmentative and Alternative Communication. Arnott notes that the majority of disambiguation approaches employ known statistics of character sequences in the relevant language to resolve character ambiguity in a given context. That is, existing disambiguating systems statistically analyze ambiguous keystroke groupings as they are being entered by a user to determine the appropriate interpretation of the keystrokes. Arnott also notes that several disambiguating systems have attempted to use word-level disambiguation to decode text from a reduced keyboard. Word-level disambiguation processes complete words by comparing the entire sequence of received keystrokes with possible matches in a dictionary after the receipt of an unambiguous character signifying the end of the word. Arnott points out several disadvantages of word-level disambiguation. For example, word-level disambiguation often fails to decode a word correctly because of its limitations in identifying unusual words and its inability to decode words that are not contained in the dictionary. Because of these decoding limitations, word-level disambiguation does not give error-free decoding of unconstrained English text with an efficiency of one keystroke per character. Arnott thus concentrates on character-level disambiguation rather than word-level disambiguation, and indicates that character-level disambiguation appears to be the most promising disambiguation technique.
Still another suggested approach is disclosed in a textbook entitled Principles of Computer Speech, which was authored by I. H. Witten and published by Academic Press in 1982 (hereinafter “Witten”). Witten discusses a system for reducing ambiguity from text entered using a telephone touch pad. Witten recognizes that for approximately 92% of the words in a 24,500-word English dictionary, no ambiguity arises when comparing the keystroke sequence with the dictionary. When ambiguities do arise, however, Witten notes that they must be resolved interactively, with the system presenting the ambiguity to the user and asking the user to make a selection from the list of ambiguous entries. The user must therefore respond to the system's prediction at the end of each word. Such a response reduces the efficiency of the system and increases the number of keystrokes required to enter a given segment of text.
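The kind of measurement Witten reports, namely the share of dictionary words whose keystroke sequence matches no other word, can be sketched as below. The six-word list is a hypothetical toy example and does not reproduce Witten's dictionary or the 92% figure; the keypad layout is assumed to be the conventional telephone layout.

```python
from collections import Counter

# Assumed letter assignment (standard telephone keypad layout).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}


def key_sequence(word):
    """Return the key presses that would be used to type a word."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def unambiguous_fraction(words):
    """Return the fraction of words whose key sequence is unique in the list."""
    counts = Counter(key_sequence(word) for word in words)
    unique = sum(1 for word in words if counts[key_sequence(word)] == 1)
    return unique / len(words)


if __name__ == "__main__":
    # Toy word list: four of these words share the key sequence 4663.
    toy_dictionary = ["home", "good", "gone", "hood", "cat", "dog"]
    print(unambiguous_fraction(toy_dictionary))
```

In this toy list only "cat" and "dog" are unambiguous; the remaining four words would have to be resolved by presenting a selection list to the user, which is the interactive step Witten describes.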
Disambiguating an ambiguous keystroke sequence continues to be a challenging problem. As noted in the publications discussed above, existing solutions that minimize the number of keystrokes required to enter a segment of text have failed to achieve the necessary efficiencies to be acceptable for use in a portable computer. It would therefore be desirable to develop a disambiguating system that resolves the ambiguity of entered keystrokes while minimizing the total number of keystrokes required, within the context of a simple and easy to understand user interface. Such a system would thereby maximize the efficiency of text entry.
An effective reduced keyboard input system for the Chinese language must satisfy all of the following criteria. First, the input method must be easy for a native speaker to understand and learn to use. Second, the system must tend to minimize the number of keystrokes required to enter text in order to enhance the efficiency of the reduced keyboard system. Third, the system must reduce the cognitive load on the user by reducing the amount of attention and decision-making required during the input process. Fourth, the approach should minimize the amount of memory and processing resources needed to implement a practical system.
The basic Pinyin method can be applied to a reduced keyboard input system when combined with an unambiguous method of entering Latin letters, such as the multi-tap method. All unambiguous methods, however, require many keystrokes, which is especially burdensome when combined with the basic Pinyin method. Thus it is preferable to combine the basic Pinyin method with a disambiguating system. One existing approach disambiguates only one Pinyin syllable at a time by requiring the user to select a delimiter key, such as key 1 or key 0, between the Pinyin spellings that correspond to the multiple Chinese characters of a commonly known Chinese phrase (词, i.e. a word with more than one character). The selection of the delimiter key instructs the processor to search for Pinyin syllables that match the input sequence and for Chinese characters associated with the first Pinyin syllable, which may be selected by default. In the example of FIG. 1, the user is trying to input the Chinese characters associated with the Pinyin spellings NI and Y. To do this, the user would first select the ‘6’ key 16, then the ‘4’ key 14. In order to instruct the processor to perform a search for a syllable matching the keys entered, the user then selects the delimiter key 10 and finally the ‘9’ key 19. Because this process requires a delimiter key depression between the characters of commonly linked multi-character Chinese words, time is wasted.
Another significant challenge facing an application of word-level disambiguation is how to implement it successfully on the types of hardware platforms on which its use is most advantageous, such as two-way pagers, cellular telephones, and other hand-held wireless communications devices. These systems are battery powered, and consequently are designed to be as frugal as possible in hardware design and resource utilization. Applications designed to run on such systems must minimize both processor bandwidth utilization and memory requirements. These two factors tend in general to be inversely related. Since word-level disambiguation systems require a large database of words to function, and must respond quickly to input keystrokes to provide a satisfactory user interface, it would be a great advantage to be able to compress the required database without significantly impacting the processing time required to utilize it. In the case of the Chinese language, additional information must be included in the database to support the conversion of sequences of Pinyin syllables to the Chinese phrases intended by the user.
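The keystroke costs discussed above can be compared with a short sketch. It assumes the conventional telephone keypad letter assignment and uses key 0 as the delimiter, one of the two delimiter keys mentioned above. The function names are hypothetical, and the multi-tap count ignores any pause or extra key needed between two letters that share a key.

```python
# Assumed letter assignment (standard telephone keypad layout).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}


def multitap_presses(spelling):
    """Count key presses for multi-tap entry: the n-th letter on a key
    costs n presses of that key."""
    total = 0
    for letter in spelling.lower():
        key = LETTER_TO_KEY[letter]
        total += KEYPAD[key].index(letter) + 1
    return total


def delimited_sequence(spellings, delimiter="0"):
    """Return the key presses for the delimiter-based approach: one press
    per letter, with a delimiter key between consecutive spellings."""
    return delimiter.join(
        "".join(LETTER_TO_KEY[ch] for ch in spelling.lower())
        for spelling in spellings
    )


if __name__ == "__main__":
    # Multi-tap entry of "ni": N is the second letter on key 6 and
    # I is the third letter on key 4.
    print(multitap_presses("ni"))
    # Delimiter-based entry of NI followed by Y: keys 6, 4, delimiter, 9.
    print(delimited_sequence(["ni", "y"]))
```

Under these assumptions the delimiter-based entry of NI and Y takes four key presses, one of which is the delimiter itself; that extra press between linked syllables is the overhead criticized above.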
Another challenge facing any application of word-level disambiguation is how to provide sufficient feedback to the user about the keystrokes being input. With an ordinary typewriter or word processor, each keystroke represents a unique character which can be displayed to the user as soon as it is entered. However, with word-level disambiguation this is often not possible because each keystroke represents multiple letters in a Pinyin spelling and any sequence of keystrokes may match multiple spellings or partial spellings. It would therefore be desirable to develop a disambiguating system that minimizes the ambiguity of entered keystrokes and also maximizes the efficiency with which the user can resolve any ambiguity which does arise during text entry. One way to increase the user's efficiency is to provide appropriate feedback following each keystroke, which includes displaying the most likely word spelling following each keystroke, and in cases where the current keystroke sequence does not correspond to a completed word, displaying the most likely stem of a yet uncompleted word.
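The per-keystroke feedback described above can be sketched as follows: after each key, the sketch shows the most likely complete spelling if one matches, and otherwise the most likely stem of a longer spelling. The lexicon `LEXICON`, its frequency counts, and the function names are hypothetical, and the keypad layout is assumed to be the conventional telephone layout.

```python
# Assumed letter assignment (standard telephone keypad layout).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}

# Hypothetical spellings with made-up frequency counts.
LEXICON = {"ni": 90, "mi": 40, "nin": 30, "ming": 50, "hao": 80}


def key_sequence(spelling):
    """Return the key presses that would be used to type a spelling."""
    return "".join(LETTER_TO_KEY[ch] for ch in spelling.lower())


def feedback(keys):
    """Return (text, is_complete) to display for the keys typed so far,
    or None when nothing in the lexicon matches."""
    # Prefer the most frequent spelling that matches the keys exactly.
    complete = [
        (freq, spelling)
        for spelling, freq in LEXICON.items()
        if key_sequence(spelling) == keys
    ]
    if complete:
        return max(complete)[1], True

    # Otherwise accumulate frequency per stem of the longer spellings.
    stems = {}
    for spelling, freq in LEXICON.items():
        if len(spelling) > len(keys) and key_sequence(spelling).startswith(keys):
            stem = spelling[:len(keys)]
            stems[stem] = stems.get(stem, 0) + freq
    if stems:
        return max(stems, key=stems.get), False
    return None


if __name__ == "__main__":
    for typed in ("6", "64", "646", "42"):
        print(typed, feedback(typed))
```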