The Internet, and in particular, the World Wide Web (WWW), is a large collection of computers operated under a client-server computer network model. In a client-server computer network, a client computer requests information from a server computer. In response to the request, the server computer provides the requested information to the client computer. Client computers are typically operated by individuals. Server computers are typically operated by large information providers, such as commercial organizations, government entities and universities.
To ensure the interoperability of the potentially different computers and computer operating systems in a client-server computer network, various protocols are observed. For example, the Hypertext Transfer Protocol (“HTTP”) is used for transporting hypertext files over the Internet. In addition, the WWW observes a number of standards for organizing and presenting information, such as the Hypertext Markup Language (“HTML”) and the Extensible Markup Language (“XML”).
The HTTP protocol, in particular, supports a feature known as “dynamically-generated customized pages.” A dynamically generated customized page comprises a set of information in a particular format. The same set of information can be presented in various ways, depending upon whether a particular format is desired, and supported, by the requesting client computer. For example, a first client computer may support the ability to present information in columns, while a second client computer may instead support the ability to present information in the form of a table. As a further example, the first client computer may be operated by a user in a Spanish-speaking locale, while the second client computer is operated by a user in an English-speaking locale. A server computer receiving an information request from the first client computer may dynamically generate the requested content in a column format and in the Spanish language, while responding to a request from the second client computer by dynamically generating the requested content in English and in the form of a table. Thus, two different versions of the requested content can be created to represent the same information.
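The column/table and Spanish/English example above can be illustrated with a minimal sketch. The label dictionary, the sample rows, and the `render_page` function are hypothetical names introduced only for illustration; the sketch merely shows one set of information being rendered in two versions.

```python
from html import escape

# Hypothetical labels and data; none of these names come from the source text.
LABELS = {
    "en": {"item": "Item", "price": "Price"},
    "es": {"item": "Artículo", "price": "Precio"},
}
ROWS = [("Widget", "9.99"), ("Gadget", "24.50")]


def render_page(language: str, layout: str) -> str:
    """Render the same rows either as an HTML table or as two columns."""
    labels = LABELS.get(language, LABELS["en"])  # fall back to English
    item_label = escape(labels["item"])
    price_label = escape(labels["price"])
    if layout == "table":
        header = f"<tr><th>{item_label}</th><th>{price_label}</th></tr>"
        body = "".join(
            f"<tr><td>{escape(name)}</td><td>{escape(price)}</td></tr>"
            for name, price in ROWS
        )
        return f"<table>{header}{body}</table>"
    # Any other layout value is treated as the column format.
    items = "".join(f"<p>{escape(name)}</p>" for name, _ in ROWS)
    prices = "".join(f"<p>{escape(price)}</p>" for _, price in ROWS)
    return (
        f'<div class="col"><h3>{item_label}</h3>{items}</div>'
        f'<div class="col"><h3>{price_label}</h3>{prices}</div>'
    )


# Two versions of the same information for two different clients.
spanish_columns = render_page("es", "columns")
english_table = render_page("en", "table")
```

Both results carry the same rows; only the language of the labels and the layout differ.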
Computer executable instructions are used to dynamically generate customized content. U.S. Pat. Ser. No. 5,740,430, entitled “Method and Apparatus for Server Independent Caching of Dynamically-generated Customized Pages,” issued on Apr. 14, 1998, to Rosenberg, et al. (the “Caching Application”), discloses a method and apparatus to efficiently respond to a large number of requests for customized content. In particular, the Caching Application discloses a method and apparatus for operating a client-server computer network such that a server computer dynamically generates and then stores customized pages requested from a client computer. Subsequent requests for previously generated customized pages (content) are responded to by retrieving the requested content from a cache in the server computer. Since previously generated customized pages need not be regenerated, computational overhead is reduced. The Caching Application is hereby incorporated by reference in its entirety.
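The general idea of serving previously generated customized pages from a cache can be sketched as follows. This is a generic memoization example, not the method disclosed in the Caching Application; the `PageCache` class, the cache key of path, language, and layout, and the `slow_generate` stand-in are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]  # (path, language, layout), an assumed key shape


class PageCache:
    """Store each generated page so repeat requests skip regeneration."""

    def __init__(self, generate: Callable[[str, str, str], str]) -> None:
        self._generate = generate
        self._pages: Dict[CacheKey, str] = {}
        self.generated = 0  # counts how many pages were actually generated

    def get(self, path: str, language: str, layout: str) -> str:
        key = (path, language, layout)
        if key not in self._pages:
            # Cache miss: generate the customized page once and keep it.
            self._pages[key] = self._generate(path, language, layout)
            self.generated += 1
        return self._pages[key]


def slow_generate(path: str, language: str, layout: str) -> str:
    # Hypothetical stand-in for an expensive page generator.
    return f"<html lang='{language}'>{path} as {layout}</html>"


cache = PageCache(slow_generate)
first = cache.get("/catalog", "es", "columns")
second = cache.get("/catalog", "es", "columns")  # served from the cache
other = cache.get("/catalog", "en", "table")     # a different version
```

Because the key includes the language and layout, each customized version is cached separately, and only the first request for a given version incurs the generation cost.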
Internet standards that govern web interactions, such as HTML (a content language) and HTTP (a transfer protocol), are derived from an ASCII-based (American Standard Code for Information Interchange) environment. When using only ASCII, language is primarily restricted to English, or ASCII derivatives of Western European languages. Therefore, most meta information associated with content that comes across a network in HTTP is intended to be ASCII. Meta information is typically encoded information transmitted along with the main data in a data transfer to provide additional information associated with the main data, such as creation date, authorship, formatting, locale information, language, etc. However, with the proliferation of Internet use, Internet content providers are faced with the need to support, among others, multi-lingual website visitors. The problem, however, is that there is no clear way for a multi-lingual website visitor to announce to a content provider his or her language preference. In fact, the problem goes beyond determining a user's language preference and is a problem of determining a user's locale preferences. A user's locale can indicate not only a user's language preferences, but also other locale-specific information, such as the user's time zone, which can be used to indicate relative time differences between the user and the content provider. For example, a time indicator can indicate whether the user's locale supports daylight saving time, which can be important in performing time calculations for the timing of events.
Further, it is important to content providers to be able to provide content to a website user in a format that is useful and familiar to the user. For example, date/time formats, currency formats, monetary symbols, the use of dashes, commas and periods, etc., can vary greatly from locale to locale. Even within a locale, language and format variances can occur. For example, Spanish has two sorting orders and Chinese has five. A content provider, therefore, has a need to know a variety of demographic (locale-specific) information about a website user. Related U.S. patent application Ser. No. 09/931,228 entitled “A Method and System for Determining a Network User's Locale,” which was filed on Aug. 16, 2001 (the “Locale Detection Application”), discloses a method and system for automatically determining a network user's locale by various methods, including by the use of headers in the HTTP standard, by default assignment of locale, and by form posting. The Locale Detection Application is hereby fully incorporated by reference.
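As a rough illustration of locale detection from HTTP headers, the sketch below ranks the language tags in an `Accept-Language` request header by their quality values. The text above does not say which headers the Locale Detection Application relies on, so the choice of `Accept-Language`, the function name, and the `en-US` default are assumptions.

```python
from typing import List


def parse_accept_language(header: str, default: str = "en-US") -> List[str]:
    """Return language tags ordered by preference, or [default] if none."""
    candidates = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip()
        if not tag or tag == "*":
            continue  # skip empty entries and the wildcard
        quality = 1.0  # a tag without an explicit q value has weight 1
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as unacceptable
        if quality > 0:
            # Sort by descending quality, then by original position.
            candidates.append((-quality, position, tag))
    candidates.sort()
    return [tag for _, _, tag in candidates] or [default]


preferred = parse_accept_language("es-MX,es;q=0.8,en;q=0.5")
```

The default assignment mirrors the fallback described above: when the header is missing or empty, the server simply assumes a configured locale.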
An HTML form post via HTTP is a primary means for website visitors to submit information to a content provider, yet it provides one of the most formidable problems in locale-specific data handling. In order for Internet content providers to correctly interpret user submitted form data, the encoding of the form data must be made known to the content provider's server side programs. Unfortunately, HTML version 3.2 form tags do not supply sufficient information about the encoding of form submitted data to a content provider's servers. The issues surrounding HTML form post data handling are critical issues that must be resolved to correctly capture user form inputs in a multi-lingual website.
A form post is a documented HTTP call to transmit selected form data from a user to a content provider's web server so that the web server can receive and process the form contents. For example, when a user (e.g., via a web browser) is presented with a form, such as an address form, the user can input his or her first name, last name, street address, etc., into the form. The user's web browser can collect the user's keystrokes into special fields (e.g., name fields) and perform the form post once the user submits his or her data (e.g., by pressing the “enter” key). The user's web browser may have JavaScript, for example, running locally in the user's client computer to verify that entries have been made into each field, but the web browser will not process the data. The processing will instead happen at the content provider's server(s).
However, current HTML versions cannot adequately handle form posts for a locale-sensitive environment because HTML v3.2 form tags do not supply sufficient encoding information for the submitted data. Thus, when a user at a client computer is entering data, before he or she sends the data to a server, a content provider must be able to determine the encoding of the entered data and transmit the encoding information (e.g., in the form of a marker) to its servers along with the submitted data. Further, a content provider's server must be able to detect the marker that is transmitted along with the encoded data to indicate the encoding. The encoding marker can indicate to the server whether, for example, the data was entered in Shift-JIS or in some other character encoding. Thus, current form post methods and systems cannot properly process data in locale-sensitive form posts because they cannot provide a means to indicate the data's encoding at the client computer, nor can they properly determine form post data encoding at the content provider's web server. A content provider using such current methods thus cannot accurately serve locale-specific content to a user in response to a form post. Instead, an explicit registration process may be required for a user to indicate his or her locale preferences.
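The need for an encoding marker can be illustrated with a small sketch in which the posted form body carries an extra ASCII field naming the encoding of the other fields. The field name `_charset_marker` and both helper functions are hypothetical; the source describes a marker only in general terms and does not prescribe this mechanism.

```python
import codecs
from typing import Dict
from urllib.parse import parse_qs, urlencode

MARKER_FIELD = "_charset_marker"  # hypothetical marker field name


def build_form_body(fields: Dict[str, str], encoding: str) -> str:
    """Stand-in for the client: encode the fields and attach the marker."""
    payload = dict(fields)
    payload[MARKER_FIELD] = encoding
    return urlencode(payload, encoding=encoding)


def decode_form_body(body: str, fallback: str = "utf-8") -> Dict[str, str]:
    """Stand-in for the server: read the marker, then decode the fields."""
    # First pass with latin-1, which maps every byte, just to find the marker.
    first_pass = parse_qs(body, keep_blank_values=True, encoding="latin-1")
    encoding = first_pass.get(MARKER_FIELD, [fallback])[0]
    try:
        codecs.lookup(encoding)  # reject unknown encoding names
    except LookupError:
        encoding = fallback
    # Second pass decodes the percent-encoded bytes with the marked encoding.
    decoded = parse_qs(
        body, keep_blank_values=True, encoding=encoding, errors="strict"
    )
    return {
        name: values[0]
        for name, values in decoded.items()
        if name != MARKER_FIELD
    }


body = build_form_body({"last_name": "山田"}, "shift_jis")
fields = decode_form_body(body)
# Without the marker, a server guessing UTF-8 misreads the same bytes.
misread = parse_qs(body, encoding="utf-8")["last_name"][0]
```

The contrast between `fields` and `misread` shows the failure described above: the same posted bytes are recovered correctly only when the server knows which encoding the client used.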
However, casual visitors to a website may have concerns, for example, over on-line privacy, that may dissuade them from actively registering at a content provider's website. Many casual visitors may be reluctant to register, but may still desire to access locale-specific content, or at least locale-specific navigation. Automatic locale detection, such as disclosed in the Locale Detection Application, along with a means to accurately detect and forward the encoding format of form data to a content provider's server side programs, can be used to provide locale-specific content even to casual website visitors.