Web sites with static content, or static Web pages, are easy to translate into a foreign language, as may be desirable when localizing the Web site for display in a foreign country or to a linguistically distinct sub-market. The content that the user will view is easy to collect for translation: because the content does not change, it is simple to go to a page and copy the content. The translated content is also easy to present: the user is redirected to a page with the translated content. However, Web sites with dynamic Web pages, i.e., Web pages with dynamically generated content, present a greater challenge. The content the user will view varies depending on the user's actions. Often, content that should be translated (such as text-based instructions) is mixed with content that should not be translated (such as product names).
A typical e-commerce Web site makes extensive use of dynamically generated content. Consider, for example, a Web site that provides financial services such as Internet banking, i.e., services that enable banking transactions, bill payments, and the like over the Internet through, for example, a financial institution's secure Web site. Any page on the site might present many different permutations of text and other content, depending on the configuration options selected by the financial institution and its customers (the end users), the end user's permissions, the end user's accounts, the application in use, and the end user's actions (including making errors). The challenge in translating the dynamically generated content into a different language is to find all the different permutations of content that could be presented to the user, translate them accurately, and then provide the translated text appropriately.
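To illustrate how quickly the permutations described above can multiply, the following minimal sketch enumerates the variants of a single page. The factor names and their values are hypothetical and chosen only for illustration; an actual Internet banking site would have its own, typically much larger, set.

```python
from itertools import product

# Hypothetical factors that influence what a single page displays.
# None of these names or values come from a real system.
FACTORS = {
    "institution_config": ["bill_pay_on", "bill_pay_off"],
    "user_permission": ["view_only", "full_access"],
    "account_type": ["checking", "savings", "loan"],
    "user_action": ["success", "validation_error", "timeout"],
}


def page_variants(factors):
    """Return every combination of factor values as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


variants = page_variants(FACTORS)
print(len(variants))  # 2 * 2 * 3 * 3 = 36 variants for one page
```

Even with only four small factors, one page yields 36 distinct variants, each of which may display different text that would have to be found and translated.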
One possible solution is to use automatic translation (or machine translation) programs. The page with dynamic text, once generated, could be passed through such a translator. However, the automatic translators known in the art have recognized shortcomings. Automatic translators are sometimes good enough to convey the sense of a page, provided that all of the source text has only one possible translation and contains no colloquialisms. Automatic translations of more complex content, however, are often confusing or ungrammatical, and sometimes incorrect. In addition, automatic translation cannot recognize specific content that should be translated in a specific way, or content that should be neither translated nor stored. Automatic translation programs therefore are not preferred for real-time translation of content where accuracy and clarity are important considerations, such as in Internet banking applications.
Other concerns surface in connection with the process of collecting text for translation and maintaining or updating the database of translated text. It is possible, for example, to collect some portion of the Web site content during development, either by batch scanning static source code or object code files, or by intentionally viewing individual Web pages to identify translatable content. However, this process is time-consuming and cumbersome, and it is unlikely to capture all possible permutations of Web content on highly dynamic Web sites.
Additional concerns arise in connection with user flexibility and security. For example, many Web pages will contain some content that should be translated (e.g., the instructions on a data input page) and some content that should not be translated. Sometimes content should not be translated because it is the end user's own words. In an Internet banking checking application, for example, the end user's description of the payee and the memo for a particular check are the end user's own words and should not be translated. Sometimes the content should not even be collected for translation, because the data may be confidential or proprietary and therefore should not be stored in a potentially insecure database or exposed to third-party contractors such as translators. Data such as account numbers and Social Security numbers fall into this category.
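The three kinds of content distinguished above (content to translate, content to leave in the end user's own words, and content that should not be collected at all) can be sketched as a simple per-field policy. This is an illustrative sketch only: the field names, the `Handling` enum, and the `collect_for_translation` function are hypothetical, and treating unrecognized fields as not collectable is an assumed, conservative default.

```python
from enum import Enum


class Handling(Enum):
    TRANSLATE = "translate"            # e.g., instructions on a data input page
    KEEP_AS_IS = "keep_as_is"          # the end user's own words
    DO_NOT_COLLECT = "do_not_collect"  # confidential or proprietary data


# Hypothetical field names for an Internet banking check page.
FIELD_POLICY = {
    "instructions": Handling.TRANSLATE,
    "payee_description": Handling.KEEP_AS_IS,
    "memo": Handling.KEEP_AS_IS,
    "account_number": Handling.DO_NOT_COLLECT,
    "social_security_number": Handling.DO_NOT_COLLECT,
}


def collect_for_translation(page_fields):
    """Return only the text that may be stored and sent to translators."""
    collected = []
    for name, text in page_fields.items():
        # Assumption: fields without an explicit policy are never collected.
        policy = FIELD_POLICY.get(name, Handling.DO_NOT_COLLECT)
        if policy is Handling.TRANSLATE:
            collected.append(text)
    return collected


page = {
    "instructions": "Enter the payee and amount.",
    "payee_description": "Landlord",
    "memo": "March rent",
    "account_number": "000123",
}
print(collect_for_translation(page))
```

Only the instruction text is returned for this page; the payee description and memo are left untouched, and the account number never leaves the page data.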