1. (Field of the Invention)
This invention relates to a Chinese character conversion apparatus for converting an input row of phonetic symbols into Chinese characters in a computer system.
2. (Description of the Prior Art)
The total number of traditional Chinese characters in the Big5 code used in computer systems is 13051. The key to popularizing Chinese-language computer systems is the ability to input Chinese characters quickly and correctly. Presently, a Chinese character can be input into a computer system by way of speech recognition, script recognition or keyboard operation. Keyboard input remains the most reliable and the most popular method. A user can input a Chinese character through the keyboard in accordance with either the pronunciation or the form of the character. Although input according to the form of the character has the advantage of faster input speed, it is difficult for the user to remember the large number of rules used to decompose a Chinese character into a plurality of parts for input. Therefore, in some places such as Taiwan, most computer users prefer to use Chinese phonetic symbols to input Chinese characters, especially because they have been trained in Chinese phonetics since their elementary school years.
The Chinese phonetic symbols taught in elementary schools in Taiwan are shown below. They can be grouped into four major groups, i.e. the consonants, transition vowels, vowels and tones. In the present invention, the symbol `. . . ` represents the first tone; it must be noted, however, that there is actually no symbol for the first tone.
Consonants: {character pullout}(b) {character pullout}(p) {character pullout}(m) {character pullout}(f) {character pullout}(d) {character pullout}(t) {character pullout}(n) {character pullout}(l) {character pullout}(g) {character pullout}(k) {character pullout}(h) {character pullout}(j) {character pullout}(q) {character pullout}(x) {character pullout}(zh) {character pullout}(ch) {character pullout}(sh) {character pullout}(r) {character pullout}(z) {character pullout}(c) {character pullout}(s)
Transition vowels: {character pullout}(i) {character pullout}(u) {character pullout}(v)
Vowels: {character pullout}(a) {character pullout}(o) {character pullout}(e) {character pullout}(e) {character pullout}(ai) {character pullout}(ei) {character pullout}(ao) {character pullout}(ou) {character pullout}(an) {character pullout}(en) {character pullout}(ang) {character pullout}(eng) {character pullout}(er)
Tones: . . . (First tone), {character pullout}(Second tone), {character pullout}(Third tone), {character pullout}(Fourth tone), {character pullout}(light tone)

a phonetic symbol memory unit for storing a plurality of symbols for consonants, transition vowels, vowels and tones;
a dictionary for storing a plurality of syllable streams and the corresponding Chinese characters and phrases;
a syllable severing unit, said syllable severing unit severing the phonetic symbols from the input phonetic symbol row to form syllable according to the tone symbol or space key, if no tone symbol or space key being inputted, said syllable severing unit severing the phonetic symbols from the input phonetic symbol row to form syllable according to the order rule of the arrangement of the consonant, transition vowel and vowel in said phonetic symbol memory unit;
a conversion initializing unit for setting the conversion starting location and the conversion length according to the syllable obtained from said syllable severing unit and the syllable stream constituted by the syllable obtained from said syllable severing unit and the previously input syllable;
a conversion processing unit for repeatedly adjusting the syllable stream constituted by the conversion starting location and the conversion length according to the set conversion starting location and the conversion length;
a dictionary searching unit for searching said dictionary for Chinese character with the syllable stream from said conversion processing unit as the searching key;
a syllable editing unit, said syllable editing unit being operated by the user to amend the Chinese character searched from the dictionary due to the mistake of syllable severance; and
a homonymous character/phrase selecting unit, said homonymous character/phrase selecting unit being operated by the user to select a correct Chinese character other than the Chinese character searched from the dictionary due to the mistake of the determination of the homonym.
Each syllable, or character sound, is constituted by the consonant, the transition vowel, the vowel or the tone. In addition to the tone, any one, two or all of the consonant, the transition vowel and the vowel may be simultaneously contained in one syllable. For example,
Chinese character    Syllable (Tone included)
{character pullout}    {character pullout}{character pullout}{character pullout}. . .
{character pullout}    {character pullout}{character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}. . .
{character pullout}    {character pullout}{character pullout}. . .
{character pullout}    {character pullout}{character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}{character pullout}. . .
{character pullout}    {character pullout}. . .
As shown by the above example, a so-called syllable stream is constituted by a plurality of successive syllables, which can be converted into a row of Chinese characters. Each syllable may have at least one corresponding homonymous character. Presently, according to the Mandarin Daily Newspaper Dictionary, the total number of reasonable syllables is 1364. A reasonable syllable must have at least one corresponding Chinese character, and its phonetic symbols must be arranged according to the acceptable arrangement sequence of the consonant, transition vowel, vowel and tone. For example, "{character pullout}{character pullout}{character pullout}{character pullout}" is not a reasonable syllable, since the arrangement sequence of the transition vowel and the vowel is exchanged. "{character pullout}{character pullout}{character pullout}{character pullout}" is also not a reasonable syllable, since it has no corresponding Chinese character, although the arrangement sequence of the phonetic symbols is correct.
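The two conditions for a reasonable syllable can be illustrated with a minimal Python sketch. It assumes each phonetic symbol is represented by the romanized key given in parentheses in the symbol lists, that tones are keyed by the hypothetical strings "1" to "5", and that `known_syllables` is a hypothetical stand-in for the set of syllables having at least one corresponding character. None of these names come from the original disclosure.

```python
# Symbol classes, keyed by the romanized letters shown in parentheses in the text.
# The text lists "(e)" for two distinct vowel symbols; this sketch merges them.
CONSONANTS = {"b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
              "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s"}
TRANSITION_VOWELS = {"i", "u", "v"}
VOWELS = {"a", "o", "e", "ai", "ei", "ao", "ou", "an", "en", "ang", "eng", "er"}
TONES = {"1", "2", "3", "4", "5"}  # hypothetical tone keys, not from the text


def symbol_rank(sym):
    """Return the position of a symbol's class in the acceptable sequence."""
    if sym in CONSONANTS:
        return 0
    if sym in TRANSITION_VOWELS:
        return 1
    if sym in VOWELS:
        return 2
    if sym in TONES:
        return 3
    raise ValueError(f"unknown phonetic symbol: {sym!r}")


def follows_order_rule(symbols):
    """True if the classes appear strictly in the order
    consonant, transition vowel, vowel, tone (each class at most once)."""
    ranks = [symbol_rank(s) for s in symbols]
    return bool(ranks) and all(a < b for a, b in zip(ranks, ranks[1:]))


def is_reasonable(symbols, known_syllables):
    """A syllable is reasonable if it obeys the order rule AND has
    at least one corresponding character (membership in known_syllables)."""
    return follows_order_rule(symbols) and tuple(symbols) in known_syllables
```

For instance, under these assumptions `follows_order_rule(["g", "a", "u", "1"])` is false because the vowel precedes the transition vowel, while a correctly ordered sequence is still rejected by `is_reasonable` when it is absent from `known_syllables`.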
Since there are many homonymous Chinese characters, the need to select the correct character from a plurality of homonymous characters after inputting the phonetic symbol(s) of each syllable reduces the input speed. Because the total number of homonymous Chinese phrases is less than the total number of homonymous Chinese characters, and because phrases outnumber single-character words in a text, the need for such selection is reduced if the phonetic symbols are input in the form of Chinese phrases. In recent years, by combining Chinese phrase input with semantic and syntactic processing, the percentage of correct Chinese characters obtained by the phonetic input method has reached 95% and above; that is, the need to select a correct Chinese character/phrase from a plurality of homonymous characters/phrases arises in no more than 5% of cases.
A conventional Chinese character conversion apparatus has been disclosed in ROC patent application Ser. No. 75105838. FIG. 5 is a block diagram of the conventional Chinese character conversion apparatus of that application. Reference numeral 100 denotes an input unit for inputting a row of phonetic symbols. Reference numeral 180 denotes a dictionary for storing a plurality of Chinese characters for conversion. Reference numeral 140 denotes an NCHAR register for storing the number of syllables in the input row of phonetic symbols. Reference numeral 120 denotes a PTR register for storing the conversion starting position in the input row of phonetic symbols. Reference numeral 130 denotes an NP register for storing the conversion length of the input row of phonetic symbols. Reference numeral 150 denotes a comparator unit for decreasing the value of the NP register by one after the conversion of characters of a certain length is completed, so as to maintain the principle of giving priority to the conversion of longer character strings. Reference numeral 160 denotes a conversion control unit. The conversion control unit 160 moves the setting position of the PTR register 120 in order, starting from the initial input position, and determines whether there is an already converted syllable. If there is no converted syllable and the dictionary 180 has the corresponding character, the conversion control unit 160 converts the syllable. Reference numeral 170 denotes a dictionary searching unit for searching the dictionary 180 with the syllable from the conversion control unit 160 as a searching key. Reference numeral 190 denotes an output unit for outputting the Chinese character produced by the conversion control unit 160.
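The longest-first principle of the conventional apparatus may be sketched in Python as follows. This is only one reading of the description above, not the implementation of the cited application: the loop order, the function name `convert_longest_first` and the toy dictionary are assumptions made for illustration, while the variable names mirror the NCHAR, NP and PTR registers.

```python
def convert_longest_first(syllables, dictionary):
    """Convert a row of syllables, giving priority to longer dictionary entries.

    syllables  -- list of syllable keys (hypothetical string keys)
    dictionary -- maps tuples of syllable keys to Chinese characters/phrases
    """
    nchar = len(syllables)        # NCHAR: number of input syllables
    result = [None] * nchar       # converted text, stored at the span's first slot
    converted = [False] * nchar   # marks syllables that are already converted

    # NP: conversion length, decreased by one after each pass (comparator unit).
    for np_len in range(nchar, 0, -1):
        # PTR: conversion starting position, moved in order from the start.
        for ptr in range(0, nchar - np_len + 1):
            span = range(ptr, ptr + np_len)
            if any(converted[i] for i in span):
                continue          # span overlaps an already converted syllable
            key = tuple(syllables[ptr:ptr + np_len])
            if key in dictionary:  # dictionary search with the span as the key
                result[ptr] = dictionary[key]
                for i in span:
                    converted[i] = True

    # Output: converted entries in order; unconverted syllables pass through.
    output = []
    for i in range(nchar):
        if result[i] is not None:
            output.append(result[i])
        elif not converted[i]:
            output.append(syllables[i])
    return output


# Toy dictionary for illustration only; keys and entries are not from the text.
TOY_DICTIONARY = {
    ("zhong1", "wen2"): "中文",
    ("zhong1",): "中",
    ("wen2",): "文",
    ("zi4",): "字",
}
```

With the toy dictionary, the two-syllable entry is matched before either of its single-syllable parts, so the phrase is preferred over two separate characters.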
The phonetic input method used in the aforementioned conventional Chinese character conversion apparatus is the same as the traditional phonetic input method. Although it is consistent with the habits of traditional users, it has the following drawback:
The tone used in pronunciation can be inconsistent with the tone given by the phonetic symbols. For example, the phonetic symbols for "{character pullout}{character pullout}" are "{character pullout}{character pullout}{character pullout}v {character pullout}{character pullout}{character pullout}v". The phonetic symbols for each Chinese character contain a tone symbol for the third tone. However, in speaking, the third tone of the former Chinese character "{character pullout}" must be changed to the second tone, that is, the phonetic symbols for "{character pullout}{character pullout}" in speaking are "{character pullout}{character pullout}{character pullout}{character pullout}{character pullout}{character pullout}{character pullout}v". This may result in an incorrect tone symbol being input. Furthermore, when the user cannot correctly distinguish the tone of a Chinese character, the correct Chinese character cannot be input by the phonetic symbol input method. Therefore, if the tone symbol can be omitted from the input of the phonetic symbols, the aforementioned problem can be avoided. Furthermore, the total number of key operations by the user can also be reduced. For example, if the input phonetic symbols for the Chinese characters "{character pullout}{character pullout}" are reduced to "{character pullout}{character pullout}{character pullout}{character pullout}{character pullout}{character pullout}", the Chinese characters "{character pullout}{character pullout}" can still be converted. This is because "{character pullout}" is a vowel while "{character pullout}" is a consonant, so that the two syllables "{character pullout}{character pullout}{character pullout}" and "{character pullout}{character pullout}{character pullout}" can be easily severed according to the acceptable arrangement sequence of the consonant, transition vowel, vowel and tone. However, when the tone symbol is omitted, some syllables are not easily severed from one another.
For example, "{character pullout}{character pullout}{character pullout}" can be recognized as a single syllable and can be converted into a Chinese character "{character pullout}". However, "{character pullout}{character pullout}{character pullout}" can also be recognized as two successive syllables, "{character pullout}{character pullout}" and "{character pullout}", and can be converted into a Chinese phrase "{character pullout}{character pullout}". In this case, the present invention determines it to be a single syllable, so that the Chinese character "{character pullout}" is converted. If the user recognizes that this result is wrong, that is, that "{character pullout}{character pullout}{character pullout}" should contain two successive syllables, a special symbol "'" may be added between the phonetic symbols that represent the two syllables, that is, "{character pullout}{character pullout}{character pullout}", so that the Chinese phrase "{character pullout}{character pullout}" can be correctly converted. It should be noted that the total number of homonymous characters may be increased in the present invention. This is because many Chinese characters having the same consonant, transition vowel and vowel but different tones become homonymous characters of each other when the tone symbol is omitted. For example, when the phonetic symbols "{character pullout}{character pullout}{character pullout}" are input, "{character pullout}" and "{character pullout}" become homonymous characters of each other. When the phonetic symbols "{character pullout}{character pullout}{character pullout}{character pullout}{character pullout}" are input, "{character pullout}{character pullout}" and "{character pullout}{character pullout}" become homonymous phrases of each other. The total number of syllables without the tone is 409.
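The severing behaviour described above, including the special symbol "'", can be sketched in Python as follows. The sketch assumes the romanized keys given in parentheses in the symbol lists, hypothetical tone keys "1" to "5", and a hypothetical function name `sever`. The keys "x", "i" and "an" in the usage note are an illustrative choice and are not taken from the elided examples of the text.

```python
CONSONANTS = {"b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
              "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s"}
TRANSITION_VOWELS = {"i", "u", "v"}
VOWELS = {"a", "o", "e", "ai", "ei", "ao", "ou", "an", "en", "ang", "eng", "er"}
TONES = {"1", "2", "3", "4", "5"}   # hypothetical tone keys
BREAKS = {"'", " "}                 # special symbol "'" and the space key


def rank(sym):
    """Position of the symbol's class in the acceptable arrangement sequence."""
    for position, group in enumerate((CONSONANTS, TRANSITION_VOWELS, VOWELS, TONES)):
        if sym in group:
            return position
    raise ValueError(f"unknown phonetic symbol: {sym!r}")


def sever(symbols):
    """Split a row of phonetic symbols into syllables.

    A syllable ends at a tone symbol, at an explicit break ("'" or space),
    or, when neither is present, as soon as the next symbol would violate
    the order consonant -> transition vowel -> vowel.
    """
    syllables, current, last_rank = [], [], -1
    for sym in symbols:
        if sym in BREAKS:
            if current:
                syllables.append(current)
            current, last_rank = [], -1
            continue
        r = rank(sym)
        if r <= last_rank:          # order rule broken: start a new syllable
            syllables.append(current)
            current = []
        current.append(sym)
        last_rank = r
        if sym in TONES:            # a tone symbol always closes the syllable
            syllables.append(current)
            current, last_rank = [], -1
    if current:
        syllables.append(current)
    return syllables
```

Under these assumptions, `sever(["x", "i", "an"])` keeps the three symbols together as one syllable, whereas `sever(["x", "i", "'", "an"])` yields two syllables, which corresponds to the user inserting the special symbol to correct the severance.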
The number of times the user must select a correct character/phrase from a plurality of homonymous characters/phrases can be reduced by enhancing the Chinese syntactic and semantic processing so that the character/phrase with top priority is selected automatically. Since the present invention allows the tone symbol to be omitted rather than forbidding it, the user can choose to input the tone symbol when there are many homonymous characters/phrases, and choose not to input it when the tone of the character is ambiguous or a reduction of key operations is desired.
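The trade-off between omitting and supplying the tone symbol can be illustrated with a small Python sketch. The dictionary entries, the tone keys and the names `build_toneless_index` and `candidates` are assumptions made for illustration and do not appear in the original disclosure.

```python
from collections import defaultdict

# Illustrative toned dictionary: (toneless syllable, tone key) -> characters.
# These four entries are a toy example, not data from the text.
TONED_DICTIONARY = {
    ("ma", "1"): ["媽"],
    ("ma", "2"): ["麻"],
    ("ma", "3"): ["馬"],
    ("ma", "4"): ["罵"],
}


def build_toneless_index(toned):
    """Merge entries that differ only in tone under one toneless key."""
    index = defaultdict(list)
    for (base, _tone), chars in toned.items():
        index[base].extend(chars)
    return dict(index)


TONELESS_INDEX = build_toneless_index(TONED_DICTIONARY)


def candidates(base, tone=None):
    """Return candidate characters for a syllable.

    Without a tone, every character sharing the toneless syllable is a
    candidate (more homonyms, fewer keystrokes); with a tone, only the
    characters of that tone are returned.
    """
    if tone is None:
        return list(TONELESS_INDEX.get(base, []))
    return list(TONED_DICTIONARY.get((base, tone), []))
```

In this toy case, omitting the tone returns four candidates for the same key sequence, while supplying the tone narrows the list to one, which is the choice left to the user in the passage above.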
The following is an example of phonetic symbols input without tone symbols.
Chinese character    Syllable (Tone not included)
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}{character pullout}{character pullout}{character pullout}
{character pullout}    {character pullout}
From the above example, it is found that one syllable can easily be distinguished from another. On the other hand, "{character pullout}{character pullout}{character pullout}" can be separated into "{character pullout}" and "{character pullout}{character pullout}", "{character pullout}{character pullout}{character pullout}" can be separated into "{character pullout}{character pullout}" and "{character pullout}", "{character pullout}{character pullout}{character pullout}" can be separated into "{character pullout}{character pullout}" and "{character pullout}", "{character pullout}{character pullout}" can be separated into "{character pullout}" and "{character pullout}", "{character pullout}{character pullout}" can be separated into "{character pullout}" and "{character pullout}", and "{character pullout}{character pullout}{character pullout}" can be separated into "{character pullout}{character pullout}" and "{character pullout}". However, it seems correct to determine each group of the aforementioned successive phonetic symbols as one syllable, as is done according to the present invention.