1. Technical Field
The present invention relates in general to text string representations in data processing systems and in particular to representing text strings in a manner permitting character recognition by a user not familiar with the character set of a default language for the text string. Still more particularly, the present invention relates to automatically capturing pronunciation information for a text string containing an ideograph composed by phonetic spelling.
2. Description of the Related Art
Multinational companies often run information system (IS) networks which span multiple countries spread around the globe. To maximize the usefulness of such networks, operations within each country tend to run in the local language of the country. Where possible, names of abstract objects in user applications are in the local language and match the local language organization, city, or human names which the abstract objects represent. In the case of system management software, often abstract objects would represent each of a global enterprise's local offices.
Central management of such a global network may be difficult or impossible when abstract object names utilize the local language and the local language's underlying character set. For offices located in Egypt, abstract objects would most naturally be named in Arabic; for those in Japan, objects would be named in Japanese. A problem arises, however, when an enterprise's headquarters IS staff attempts to examine these objects. The IS staff at the multinational headquarters located in the United States is unlikely to be able to read Arabic or Japanese.
Japanese, for example, is a logosyllabic or ideographic language which does not have an alphabet representing simple sounds, but instead has a very large character set with symbols ("ideographs") corresponding to concepts and objects rather than simple sounds. For instance, the Joyo Kanji List (Kanji for Daily Use) adopted for the Japanese language in 1981 includes 1945 symbols. Users unfamiliar with the Kanji characters will have difficulty identifying a particular abstract object named in Japanese, as well as difficulty even discussing such abstract objects over the telephone with an English- and Japanese-speaking counterpart.
Additionally, merely seeing an ideograph may provide no clue as to the correct meaning or pronunciation since, in Japanese, the same character may have multiple meanings or pronunciations. For instance, the character depicted in FIG. 6A may mean either "West" or "Spain"; the symbol depicted in FIG. 6B may be pronounced either "hayashi" or "rin" (or "in"); and the characters depicted in FIG. 6C may be pronounced "suga no," "suga ya," "kan no," or "kan ya." This circumstance is based in part on the history of the Japanese language, in which the Kanji characters were adopted from the Chinese language. Thus, for example, the "rin" symbol depicted in FIG. 6B is On-Yomi, basically a simulation of the Chinese pronunciation when the character was imported to Japan, while "hayashi" is Kun-Yomi, a Japanese word assigned to the character which has the same meaning.
It would be desirable, therefore, to capture and retain contextual meaning and pronunciation information associated with a text string for users unfamiliar with the character set employed by the language in which the text string was entered. It would further be advantageous to automatically capture such meaning and pronunciation information during composition of the characters entered into the text string.
It is therefore one object of the present invention to provide an improved method and apparatus for text string representations in data processing systems.
It is another object of the present invention to provide a method and apparatus for representing text strings in a manner permitting character recognition by a user not familiar with the character set of a default language for the text string.
It is yet another object of the present invention to provide a method and apparatus for automatically capturing and retaining pronunciation information for a text string containing an ideograph composed by phonetic spelling.
The foregoing objects are achieved as is now described. During composition of an ideograph which is entered into a text string in a data processing system by phonetic spelling on a typical data processing system keyboard, the keystrokes entered by the user are automatically captured and stored in a second field of a multi-field data packet into which the text string is being entered. The captured keystrokes thus provide a phonetic representation of the text string for users unfamiliar with the character set of the text string language. Intermediate representations, such as hiragana or katakana representations of a Japanese Kanji character, may also be automatically captured and stored in a third field within the multi-field data packet for other purposes. By switching the field displayed for the multi-field data packet containing the text string, a user may utilize the alternative representations to determine the correct meaning and pronunciation for the text string.
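By way of illustration only, the capture of keystrokes into a multi-field data packet described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names MultiFieldText, Composer, key, commit, and display are hypothetical, the selection of the ideograph is assumed to be performed by an existing input method and is simply passed in by the caller, and the sample reading "hayashi" is used only as an example.

```python
from dataclasses import dataclass


@dataclass
class MultiFieldText:
    """Hypothetical multi-field data packet holding one text string."""
    text: str = ""          # first field: the composed characters
    phonetic: str = ""      # second field: keystrokes captured during composition
    intermediate: str = ""  # third field: e.g. hiragana or katakana form

    def display(self, field: str = "text") -> str:
        # Switching the displayed field exposes an alternative representation.
        if field not in ("text", "phonetic", "intermediate"):
            raise ValueError(f"unknown field: {field}")
        return getattr(self, field)


class Composer:
    """Hypothetical composition session that records keystrokes as typed."""

    def __init__(self, packet: MultiFieldText) -> None:
        self.packet = packet
        self._keys = []  # keystrokes for the ideograph being composed

    def key(self, ch: str) -> None:
        # Capture each keystroke of the phonetic spelling.
        self._keys.append(ch)

    def commit(self, ideograph: str, intermediate: str = "") -> None:
        # Assumption: the input method has already chosen the ideograph.
        self.packet.text += ideograph
        self.packet.phonetic += "".join(self._keys)
        self.packet.intermediate += intermediate
        self._keys.clear()


# Usage: compose one ideograph by phonetic spelling.
packet = MultiFieldText()
composer = Composer(packet)
for ch in "hayashi":
    composer.key(ch)
composer.commit("林", intermediate="はやし")

print(packet.display("text"))      # the ideograph as entered
print(packet.display("phonetic"))  # the captured keystrokes
```

In this sketch the keystroke buffer is cleared on each commit, so that the second field accumulates the phonetic spelling of the whole text string one ideograph at a time.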
The above as well as additional objects, features, and advantages of the present invention will become apparent in the following detailed written description.