Companies that offer web services and service-oriented architecture to international businesses provide services that translate foreign languages into English. Automated translation tools can provide real-time translation capabilities. Asian languages such as Chinese, however, pose special problems for automated translation into English. For example, current automated techniques for translation from Chinese to English do not take into account the idiomatic differences between the languages. Consequently, automated literal translation of Chinese to English bears little resemblance to normal, everyday English, and proper translations can be achieved only through manual translation by Chinese language experts.
However, a large body of Chinese literature has already been translated into English, and vice versa, by experts who took idiom into account. Moreover, the technology exists to capture and store these translations digitally so that the data can be searched at high speed. For example, the availability of 64-bit computing and very large memories allows for efficient high-speed searching of such captured translations. A variety of methods for converting Chinese literature to digital format are known to persons skilled in the art.
Chinese is a well-structured language with specific character orders. Chinese has no spaces between words, and each character carries a specific meaning, but the language does use commas to separate clauses and periods to end sentences. These characteristics raise the possibility of using electronic files of translated literature for automated Chinese to English translation, because the chance of finding a match in the wealth of literature is very high. In addition, the fact that Chinese has no real tenses, gender, cases or plurals reduces the variety of different sentence structures considerably.
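The clause structure described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the captured expert translations are held in an in-memory dictionary keyed by Chinese clause. The names TRANSLATION_MEMORY, split_clauses and lookup_clauses and the sample entries are hypothetical and do not come from any particular system.

```python
import re

# Hypothetical translation memory: Chinese clause -> expert English rendering.
# The entries are illustrative only.
TRANSLATION_MEMORY = {
    "你好": "Hello",
    "我很高兴见到你": "I am very glad to meet you",
}

# Clause and sentence delimiters: the full-width comma and period.
_DELIMITERS = re.compile(r"[，。]")


def split_clauses(text):
    """Split Chinese text into clauses at commas and periods."""
    return [clause for clause in _DELIMITERS.split(text) if clause]


def lookup_clauses(text, memory=TRANSLATION_MEMORY):
    """Pair each clause with its stored translation, or None if no match."""
    return [(clause, memory.get(clause)) for clause in split_clauses(text)]
```

A clause with no exact match is returned with None, which marks where a fallback method would be needed.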
Even if such a search could be made, several problems would remain. First, proper nouns may be translated in several ways. For example, the English name for the main Chinese river is “Yangtze.” However, in Chinese this refers to a fish that lives at the mouth of the river. The river itself is called the “Chang Jiang,” which literally means “long river.” Another example is “Peking” versus “Bei Jing,” or the “Imperial Palace.” Although Chinese has no capital letters, proper nouns can easily be identified from the structure of the sentence.
Second, modern Chinese as used today contains many embedded Western characters, such as numbers, names and web addresses; numbers are particularly common. Embedded Western characters do not require translation, yet their positioning within the Chinese text to be translated can lessen the likelihood of a match. The accuracy of the translation can be improved if the search is made independent of embedded Western characters.
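One way to make a search independent of embedded Western characters is to replace each run of such characters with a fixed placeholder before matching, then reinsert the original runs into the matched English text. The following is a minimal sketch under that assumption. The function names, the placeholder string and the character class that defines a Western run are all hypothetical choices, and the restore step assumes the runs appear in the same order in the English text.

```python
import re

# A run of embedded Western characters: starts with an ASCII letter or digit,
# then continues through characters common in numbers and web addresses.
_WESTERN_RUN = re.compile(r"[A-Za-z0-9][A-Za-z0-9./:@_%\-]*")

PLACEHOLDER = "<W>"  # hypothetical marker; any string absent from the text works


def mask_western(text, placeholder=PLACEHOLDER):
    """Replace each Western run with a placeholder; return (masked, runs)."""
    runs = _WESTERN_RUN.findall(text)
    masked = _WESTERN_RUN.sub(placeholder, text)
    return masked, runs


def restore_western(translation, runs, placeholder=PLACEHOLDER):
    """Reinsert the saved runs into a translation, left to right."""
    for run in runs:
        translation = translation.replace(placeholder, run, 1)
    return translation
```

With this masking, two sentences that differ only in an embedded number reduce to the same search key, so a single stored translation can serve both.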
What is needed is a tool or method to increase the accuracy of automated translation of Chinese into English by taking advantage of existing expert translations. In addition, a need exists for further improvements in accuracy by addressing the impact of proper nouns and embedded Western characters on such a search.