1. Field of the Invention
This invention relates generally to an encoding method and is more particularly directed to a method for encoding ideographic characters which permits machine retrieval from storage and reproduction of the characters.
2. Description of the Prior Art
The large number of characters involved in an ideogram-based language, such as Chinese, makes the reproduction and transmission of the language difficult. For example, to communicate in written Chinese with a moderate degree of proficiency, between three thousand and eight thousand different characters may be required, with each character or combination representing a word or expression. Consequently, prior art devices for typing or printing Chinese have been complex mechanisms requiring extensive training before a person could operate them with any degree of proficiency. A typewriter, patterned after the hand-setting of type in printing, may include several removable compartmented trays or galleys containing the thousands of individual characters. One character at a time is selected from the trays, struck against paper carried by the typewriter, and returned to its original storage position before the next character is selected and the process repeated. Examples of such typewriters are disclosed in U.S. Pat. Nos. 1,245,633, issued to Sugimoto, and 4,064,983, issued to Inose et al. Another machine, as disclosed in U.S. Pat. No. 2,534,330 to Wong, has a keyboard with numerous keys, each of which controls two characters to provide a limited vocabulary of approximately two thousand characters. To efficiently operate these machines, a typist must be familiar with the language and must memorize the locations of the thousands of characters.
To simplify the typing or other reproduction of Chinese characters, methods have been developed to categorize all of the commonly used characters according to strokes, groups of strokes, or portions of characters which occur repetitively and can thus be used as indices. By using such classification techniques, the large number of characters can be categorized into smaller groups, thus permitting easier and faster location of the desired characters. Examples of devices which employ classification techniques are disclosed in U.S. Pat. Nos. 2,613,794 and 2,613,795, both issued to Yutang; 2,950,800, to Caldwell; 3,319,816, to Brown; and 3,325,786, to Shashoua et al. Again, the proficient use of these devices requires extensive training of an operator, who must already be familiar with the language, both in the operation of the machine and in the use of the specific technique employed to classify the characters in the vocabulary.
With the increasing availability of computers with large-capacity data storage capabilities, apparatuses have been developed which couple this storage capability with some classification technique to electronically store, retrieve, and transmit and/or reproduce Chinese characters with greater ease than was previously available. Examples of such apparatuses are disclosed in U.S. Pat. Nos. 3,820,644, issued to Yeh; 4,096,934, issued to Kirmser et al.; 4,187,031, to Yeh; 4,144,405, to Wakamatsu; and 4,228,507, to Leban. In U.S. Pat. No. 3,820,644, each Chinese character is represented by a hexadecimal digital code and stored in a master file within a direct access storage apparatus. The characters are grouped according to the order of the Chinese phonetic alphabet, and a keyboard with numerous keys is used to select the desired character. The 8,000-plus characters in wide use are classified into groups according to frequency of use, and each group is further divided into sections, with each section having fifteen characters of the same Standard Chinese Phonetic Syllable. The sections are arranged in alphabetical order according to the Chinese phonetic alphabet.
To retrieve or select the desired characters using the apparatus described in U.S. Pat. No. 3,820,644, several key strokes are required to select the appropriate group and section and then the correct character in the section. Reference charts of the appropriate classification groups and sections are used to assist the operator. The hexadecimal code of the retrieved character may be transmitted to another system, or may be converted to a binary code and displayed or printed on the appropriate equipment.
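The group, section, and character selection just described can be pictured with a short Python sketch. It is illustrative only: the table contents, the code values, and the names MASTER_FILE, select_character, and hex_to_binary are hypothetical, since the data layout of U.S. Pat. No. 3,820,644 is not reproduced here.

```python
# Hypothetical miniature master file: group -> section -> list of hex codes.
# In the described apparatus a section holds fifteen characters sharing one
# phonetic syllable; the short lists and code values here are placeholders.
MASTER_FILE = {
    1: {  # group 1 (assumed to hold the most frequently used characters)
        1: ["0A01", "0A02", "0A03"],
        2: ["0B01", "0B02"],
    },
    2: {
        1: ["1A01"],
    },
}


def select_character(group, section, position):
    """Return the hex code chosen by three successive selections (1-based)."""
    try:
        codes = MASTER_FILE[group][section]
    except KeyError:
        raise LookupError(f"no group {group} / section {section}") from None
    if not 1 <= position <= len(codes):
        raise LookupError(f"no character at position {position}")
    return codes[position - 1]


def hex_to_binary(hex_code):
    """Convert a hex code to a fixed-width binary string, 4 bits per digit."""
    return format(int(hex_code, 16), f"0{4 * len(hex_code)}b")


code = select_character(1, 2, 1)   # group 1, section 2, first character
binary = hex_to_binary(code)       # form assumed suitable for display/printing
```

The nested lookup mirrors the several key strokes the operator must make, which is the burden the reference charts are meant to ease.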
The method used in U.S. Pat. No. 4,096,934 for encoding and retrieving characters involves the use of phonetic symbols in accordance with a complex set of rules that relate to the category into which the character is grouped and the physical shape of the character. Since phonetic symbols based upon the Mandarin dialect are used, problems arise when another dialect is involved.
In U.S. Pat. No. 4,187,031 a keyboard has keys which correspond to elements of the Korean alphabet and to a set of physical forms into which the characters are grouped, with each key stroke having a unique binary code. The alphabet elements and character forms are stored in hexadecimal form, and a computer links the keyboard and a storage device. To retrieve a character, the form of the character is first provided and then the alphabet elements are provided in proper sequence, with the computer being programmed to size and position the alphabet elements according to the word form.
In U.S. Pat. No. 4,228,507, a combination of letters is assigned to a grouping of arbitrarily selected strokes used in writing Chinese characters, and numbers are used to designate the gross form of the character. The letters and numbers are assembled into an input code for a computer programmed to operate a plotter to reproduce the character.
From the foregoing, it is apparent that in attempting to provide techniques for computer-assisted reproduction or transmission of Chinese characters which are based upon the characteristic strokes used in writing Chinese, the prior art solutions have introduced complexities of their own, and have generally required that the user have at least a working knowledge of the written language.