With the unbelievable pace at which the Internet has grown over the last ten years, the business world has become integrally “wired” through e-commerce. As the reach and technology of the Internet increased, e-commerce became a natural extension of the traditional business world. With this extension, the ability to market and sell goods and services across borders using the Internet became commonplace. However, as global e-commerce grows, new issues have emerged regarding the extension of an American-centric Internet to non-English-speaking countries. In today's global village, web developers face the task of creating a new paradigm for the globalization of the Internet.
Much of the current discussion concerning web pages centers on the format in which electronic documents and programs are written. For example, Hypertext Mark-Up Language (HTML) is one of the languages that a programmer or web developer uses to design and present the format of a web site or web page. HTML is a tag-based, format-related language, meaning that specialized tags are used to “mark up” or format the content of the web pages (e.g., “<b> This text is bold. </b>” would trigger the browser to format the enclosed text in bold-face type). When the HTML document is communicated to a desktop browser, such as Microsoft's INTERNET EXPLORER™, Netscape's NETSCAPE NAVIGATOR™, or the like, the browser generally knows by reading the HTML tags how to render and configure the appearance of the information. HTML tags include content tags, which describe the format of the tagged content, and containment tags, which delimit areas of containment in the HTML document. Developers also use other markup languages, such as Wireless Markup Language (WML), which is used mostly for wireless devices like cell phones.
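The way a browser reads tags to decide how to treat the enclosed text can be illustrated with a minimal sketch. It uses the `html.parser` module from the Python standard library; the class name `BoldCollector` and the function `bold_spans` are hypothetical names chosen for illustration, and the sketch only collects the text that a browser would render in bold rather than rendering anything.

```python
from html.parser import HTMLParser


class BoldCollector(HTMLParser):
    """Collects the text enclosed in <b>...</b> tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self._bold_depth = 0   # how many <b> tags are currently open
        self.bold_text = []    # text chunks found inside <b> tags

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._bold_depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self._bold_depth > 0:
            self._bold_depth -= 1

    def handle_data(self, data):
        # Text is recorded only while at least one <b> tag is open.
        if self._bold_depth > 0:
            self.bold_text.append(data)


def bold_spans(html_text):
    """Return the whitespace-trimmed text chunks that appear inside <b> tags."""
    parser = BoldCollector()
    parser.feed(html_text)
    parser.close()
    return [chunk.strip() for chunk in parser.bold_text]


if __name__ == "__main__":
    print(bold_spans("Plain text. <b> This text is bold. </b> More plain text."))
```

The tags themselves carry no information about what the text means; they only tell the reader of the document how the enclosed text is to be presented.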
HTML is a sub-application of a much more extensive meta language (i.e., a language that uses meta data or tags to describe or mark-up data), Standard Generalized Mark-up Language (SGML). SGML was designed to be a standard way of marking up data and was used extensively in large document management systems. However, because of its intended universal application, SGML is a very complicated language and, consequently, not generally suitable for data interchange across the Internet. Its complexity typically requires large parsers which would not be efficient or compact enough for effective Internet use. HTML was developed to capture the information display aspect of SGML in a much more compact and efficient package.
In developing multinational web sites, web developers typically design web pages in a single “spoken” language. This is due to file encoding restrictions. If a developer writes a document in Japanese and saves that document using an encoding intended for English text, such as Latin-1, the Japanese character codes are often altered, and the characters may not be properly displayed on subsequent accesses to that file. Therefore, documents written in a language that Latin-1 encoding does not cover should generally be saved according to the encoding requirements of that language. Because of these encoding restrictions, one web page must typically be created and stored for each different language desired.
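The encoding restriction described above can be reproduced in a few lines. The following is a minimal sketch, assuming Shift_JIS as the Japanese encoding and Latin-1 as the encoding intended for English text; the sample string and the function names are hypothetical and chosen only for illustration.

```python
SAMPLE = "日本語"  # a short Japanese sample string


def save_and_reload(text, encoding):
    """Simulate saving text to a file and reading it back with one encoding."""
    return text.encode(encoding).decode(encoding)


def lossy_save(text, encoding):
    """Simulate a save that replaces every unencodable character with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


def can_encode(text, encoding):
    """Report whether the encoding can represent every character in text."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def misread(text, saved_as, read_as):
    """Simulate saving with one encoding and later reading with another."""
    return text.encode(saved_as).decode(read_as)


if __name__ == "__main__":
    print(save_and_reload(SAMPLE, "shift_jis"))      # round-trips intact
    print(can_encode(SAMPLE, "latin-1"))             # Latin-1 cannot hold it
    print(lossy_save(SAMPLE, "latin-1"))             # characters are lost
    print(misread(SAMPLE, "shift_jis", "latin-1"))   # bytes shown as wrong glyphs
```

In this sketch the Japanese text survives only when it is both saved and read with an encoding that covers it, which is the reason a separate file per language is typically kept.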
In the early stages of the Internet, web sites and web pages typically delivered only static content (i.e., information that did not change on a regular basis). Information and format were simply hard-coded in pure HTML to be presented to and displayed by the browser. However, with the expansion of the Internet and e-commerce, most information is now delivered dynamically. For example, a web site may include a product catalog, a shopping cart, or a section of new product descriptions that may require continual update of the underlying information. Depending on the nature of the business, the information may change anywhere from every month, to as often as every hour or less. Dynamic/distributed systems were developed to facilitate the flexibility of such web sites by allowing the dynamic information to be placed into a database accessible by the web/application server and the web developer. These dynamic/distributed web pages are coded with HTML and may include executable code that would facilitate accessing the databases for the necessary dynamic information exchange. Such web pages are often referred to as server pages. Server pages, which reside and are executed on the server-side, typically comprise HTML or similar format-sensitive mark-up code with embedded executable source code, such as Sun's JAVA™, Macromedia's COLDFUSION™ Mark-up Language (CFML), or the like. The embedded source code is executed by the server to provide processing or database interaction. The processed data is then filled into the server page which is eventually constructed into the displayed web page using the HTML format-descriptive code.
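The server page flow described above (format-descriptive markup, embedded code that queries a database, and processed data filled into the page) can be sketched as follows. This is an illustrative approximation only: it uses Python with an in-memory SQLite database in place of the JAVA™ or CFML technologies named above, and the table name `products`, the template `PAGE`, and the functions `make_demo_db` and `render_catalog` are hypothetical.

```python
import sqlite3
from html import escape
from string import Template

# The format-descriptive part of the page; $title and $rows are placeholders
# that the server-side code fills in on each request.
PAGE = Template("<html><body><h1>$title</h1><ul>\n$rows</ul></body></html>")


def make_demo_db():
    """Create an in-memory database holding the dynamic catalog data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products (name, price) VALUES (?, ?)",
        [("Widget", 9.99), ("Gadget <Pro>", 24.5)],
    )
    conn.commit()
    return conn


def render_catalog(conn):
    """Query the database and fill the results into the page template."""
    rows = conn.execute(
        "SELECT name, price FROM products ORDER BY name"
    ).fetchall()
    # Escape the data so that it cannot be mistaken for markup.
    items = "".join(
        f"<li>{escape(name)}: {price:.2f}</li>\n" for name, price in rows
    )
    return PAGE.substitute(title="Product Catalog", rows=items)


if __name__ == "__main__":
    print(render_catalog(make_demo_db()))
```

Updating a row in the database changes the next rendered page without any edit to the markup, which is the flexibility that dynamic/distributed systems were developed to provide.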
With the proliferation of dynamic/distributed web sites, another SGML-subset meta language is seeing more application. Extensible Mark-up Language (XML) was created with the same purpose in mind as SGML, but without much of the same complexity. XML allows data to be tagged or marked so as to describe and/or give structure to the data, as opposed to simply effecting the formatting of the data, as in HTML. XML is generally used to increase the functionality of the dynamic Internet. More and more web applications are taking advantage of XML's power and flexibility by using it to facilitate data interaction.
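The difference between tags that describe data and tags that format data can be shown with a short sketch using `xml.etree.ElementTree` from the Python standard library. The element names `product`, `name`, and `price`, the attributes `sku` and `currency`, and the function `read_product` are hypothetical and serve only as an example of descriptive markup.

```python
import xml.etree.ElementTree as ET

# HTML-style markup says only how the text should look.
HTML_FRAGMENT = "<b>Widget</b> <i>9.99</i>"

# XML-style markup says what each piece of data is.
XML_FRAGMENT = (
    '<product sku="W-1">'
    "<name>Widget</name>"
    '<price currency="USD">9.99</price>'
    "</product>"
)


def read_product(xml_text):
    """Extract the described fields from a product record by tag name."""
    root = ET.fromstring(xml_text)
    price = root.find("price")
    return {
        "sku": root.get("sku"),
        "name": root.findtext("name"),
        "price": float(price.text),
        "currency": price.get("currency"),
    }


if __name__ == "__main__":
    print(read_product(XML_FRAGMENT))
```

Because each value is labeled by its tag, a receiving application can locate the price or the name directly, whereas the HTML fragment reveals only that one value is bold and the other italic.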
Another problem exists with the current systems and methods for providing multilingual web sites. Typically, as discussed above, a web site will be designed and developed in a single, primary “spoken” language. The web developers may code the entire web site in this primary language. Once the site has been developed, it is then typically given to a translator to be translated into each of the languages to be supported. The problem arises when the web site is very large. The translation of such a site requires a considerable amount of time. Furthermore, because the translator has full access to the web site source code, there is a chance that the translator may unintentionally damage or destroy some of the source code. These problems may delay the rollout of the web site, depending on which languages are desired and/or whether any damage is done to the source code.
A common solution for this problem is to involve the translator in the development process at the early stages. However, because developers will typically continue to change and refine the content, design, and layout of web pages, the translator would typically need to continually update the translations. Translators usually charge based on the number of words translated. Therefore, requiring multiple re-translations during development may not be the most cost-effective means of obtaining multiple language support. Furthermore, it is currently not possible, in the traditional web development process, to preserve previous translations so as to minimize subsequent translation work.