1. Field of the Invention
The present invention relates to a computer system, and deals more particularly with translating identified information in structured documents into different languages.
2. Description of the Related Art
Companies have long recognized the desirability of “globalizing” or “internationalizing” computer software products. The globalization process is also sometimes known as providing “national language support” or “NLS”. A formalized definition of globalization is that it combines processes known as “internationalization” and “localization”. Internationalization is sometimes referred to as “NLS enablement”, and localization is sometimes referred to as “NLS implementation”. Internationalization is the process of producing a product such that it is independent of any particular language, script, culture, and/or coded character set, and localization then adapts the internationalized product for a specific language, script, culture, and/or coded character set.
For example, if a software product displays menus to users, a globalized version of the product provides for translating the text (or at least some portion of the text) on the menus into the particular language preferred by the user. Similarly, software products that generate text messages for recording in an error log may be globalized such that the text messages will be recorded in a preferred language.
Early globalization efforts were focused on identifying and externalizing the text strings produced by a software product. That is, in order to translate the text strings into multiple languages efficiently, it was recognized that those strings should not be embedded inline within the code of the software product. Instead, tables (such as message tables) were defined to store the strings, and software products were written to use mnemonics or numeric identifiers which then could be used to index into the tables. Having the text strings externalized in this manner made translation easier, as a translator could simply substitute an appropriate version of each string in place within the table (or provide replacement tables in different languages), and the software would then access the translated text using the original mnemonic or numeric identifier.
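By way of illustration only, the externalized message table approach described above might be sketched as follows in Python. The table contents, the mnemonics (such as FILE_NOT_FOUND), and the function name get_message are hypothetical and are not drawn from any particular software product; the fallback to a default language is likewise an assumption made for the sketch.

```python
# Hypothetical message tables, one per language, indexed by mnemonic.
# All mnemonics and strings here are illustrative assumptions.
MESSAGE_TABLES = {
    "en": {
        "FILE_NOT_FOUND": "File not found: {name}",
        "SAVE_OK": "Document saved.",
    },
    "fr": {
        "FILE_NOT_FOUND": "Fichier introuvable : {name}",
        # "SAVE_OK" is deliberately left untranslated to show the fallback.
    },
}

DEFAULT_LANGUAGE = "en"


def get_message(mnemonic, language=DEFAULT_LANGUAGE, **values):
    """Look up a message by mnemonic in the table for the given language.

    Falls back to the default-language table when the language or the
    mnemonic is missing, then fills in any named placeholders.
    """
    default_table = MESSAGE_TABLES[DEFAULT_LANGUAGE]
    table = MESSAGE_TABLES.get(language, default_table)
    template = table.get(mnemonic)
    if template is None:
        # Raises KeyError if the mnemonic is unknown in every table.
        template = default_table[mnemonic]
    return template.format(**values)


if __name__ == "__main__":
    # The calling code uses only the mnemonic; the text comes from the table.
    print(get_message("FILE_NOT_FOUND", "fr", name="rapport.xml"))
    print(get_message("SAVE_OK", "fr"))
```

In this sketch the program code refers only to the mnemonic, so a translator may add or replace a language table without any change to the calling code.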
Many of today's software products are written to produce and consume information which is represented using structured documents encoded in markup languages. Use of structured documents has also become increasingly prevalent in recent years as a means for exchanging information between computers in distributed networking environments. The Hypertext Markup Language, or “HTML”, as one example, is a markup language which is widely used for encoding the content of structured documents which represent Web pages. The Web page content can be transmitted between computers of the public Internet for rendering to users, and may also be used for other purposes (and in other environments such as private intranets and extranets). The Extensible Markup Language, or “XML”, is another markup language which has proven to be extremely popular for encoding structured documents. XML is very well suited for encoding document content covering a broad spectrum, not only for transmission between computers but also, in some cases, to enable automated processing of document content. XML has also been used as a foundation for many other derivative markup languages which are adapted for specialized use, such as VoiceXML, MathML, and so forth.
In view of the vast amount of content being encoded in structured documents today, and the increasing tendency to distribute such content throughout the world over distributed computing networks, techniques are needed for efficient and reliable globalization of content encoded in structured documents.