This invention is related to browser-based character encoding schemes. Web-based applications are accessed by client users from different brands of web browsers, e.g., Internet Explorer®, Netscape Navigator®, etc., different numbered versions thereof, e.g., Internet Explorer Version 5 and Version 6, Netscape Versions 4 and 5, etc., and different native language versions of the browsers, e.g., Netscape 4.7 English Version, Japanese Version, French Version, etc. Additionally, there are different choices of web servers, e.g., Apache®, Microsoft IIS®, etc., that run on many different operating systems, e.g., Windows NT®, Linux, Windows NT Japanese version, Windows NT French version, etc.
In a typical client/web server scenario, communication therebetween includes the client user sending requests and data from the client web browser to the application on the web server. The web-based application responds to the client requests and sends data back to the client web browser for presentation to the user. The communication that occurs to facilitate this exchange of information utilizes one or more encoding/decoding standards that are supposed to work transparently to the user so that the information presented to the user is recognizable. Conventionally, however, this is not always the case.
There are many different character-encoding schemes used by client web browsers, operating systems, and web servers to accommodate the many different alphabets and languages becoming more pervasive on the Internet. Incompatibilities in encoding/decoding schemes produce unrecognizable character strings (i.e., “gibberish” data) when the client browser and the web server do not use the same encoding scheme.
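The mismatch described above can be reproduced in a few lines. The following is a minimal sketch, not a model of any particular browser or server: the function name `transmit` and the choice of ISO-2022-JP and Latin-1 as the two sides' encodings are assumptions made only for illustration.

```python
def transmit(text: str, sender_encoding: str, receiver_encoding: str) -> str:
    """Encode text as the sender would, then decode it as the receiver would."""
    wire_bytes = text.encode(sender_encoding)
    # Bytes the receiver cannot decode are replaced rather than raising an error.
    return wire_bytes.decode(receiver_encoding, errors="replace")


message = "日本語"

# Both sides agree on the encoding: the text survives the round trip.
matched = transmit(message, "iso2022_jp", "iso2022_jp")

# The receiver assumes a different encoding: the result is gibberish.
mismatched = transmit(message, "iso2022_jp", "latin-1")

print(matched == message)     # True
print(mismatched == message)  # False
```

The bytes on the wire are identical in both calls; only the receiver's assumption about how to interpret them changes.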
The only compatible character set for all operating systems is 7-bit ASCII; however, this encoding scheme does not have a sufficient number of code points to represent the complete character sets utilized in many foreign languages.
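The limitation of 7-bit ASCII can be checked directly. This is a small sketch; the helper name `fits_in_ascii` and the sample strings are hypothetical and chosen only for illustration.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text is representable in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_ascii("reports"))   # True
print(fits_in_ascii("フォルダ"))   # False
```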
When a user submits, for example, Japanese characters for a folder name in a Japanese encoding mode in a conventional process for creating a folder on a web server by a client, the client web browser translates the Japanese characters into its own encoding scheme (if charset=ISO-2022-JP, it will be JIS, a multi-byte character encoding), and the request is submitted to the web server. If the web server uses an English OS (operating system), the web server will not understand the JIS character set as a folder name, and thus the folder will not be created. As a result, the user cannot create a folder in his or her native language.
When the user submits Japanese characters in the English encoding mode, the client web browser translates the Japanese characters into its own encoding scheme (most likely UTF-8) and submits the request to the web server. If the web server uses the English OS, the web server does understand the characters and creates the requested folder in UTF-8. However, since UTF-8 uses the characters “&” and “#”, which have a special meaning on the web, the requested folder name is displayed in characters unrecognizable by the viewer when ultimately processed by the web server and displayed. Thus, a user still cannot use a folder name in his or her native language.
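One plausible reading of the “&” and “#” remark is that the submitted characters arrive as numeric character references of the form `&#NNNNN;`, which an HTML renderer interprets but a file system stores literally. That reading is an assumption and is not stated in the text above. The sketch below uses Python's standard `xmlcharrefreplace` error handler and `html.unescape` to show the two forms; the variable names are hypothetical.

```python
import html

folder_name = "日本"

# Assumption: characters outside the target charset are submitted as
# numeric character references, which consist of "&", "#", digits, and ";".
escaped = folder_name.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(escaped)  # &#26085;&#26412;

# A component that does not interpret the references sees only the literal
# "&#...;" text. An HTML-aware component can recover the original characters.
restored = html.unescape(escaped)
print(restored == folder_name)  # True
```

Under this reading, a folder created from the escaped string carries the literal reference text as its name, which is why it would appear unrecognizable when listed.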
When a first user submits Japanese characters in the Japanese encoding mode in a conventional process for storing the data inside a text file on the web server, the client web browser translates the Japanese characters using its own encoding scheme (if charset=ISO-2022-JP, it will be JIS, a multi-byte character encoding) and submits the encoded request to the web server. If the web server uses an English OS, the web server receives the data encoded in JIS, and stores the data in JIS to a text file.
When a second user requests the data inside the text file, and the second user uses the same type of web browser and has the same browser settings (i.e., JIS), the second user sees the characters presented correctly, since the browser is capable of decoding the JIS-encoded text. However, if the browser of the second user utilizes a different encoding scheme (i.e., not JIS), the second user cannot view the data correctly, since his or her browser does not understand the JIS encoding scheme. Furthermore, if the data is stored in UTF-8, the same problem exists that is described hereinabove with respect to folder names, since the special characters “&” and “#” have a special meaning for web use.
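The two-user scenario above can be sketched with a file that stores raw bytes and carries no record of the encoding used to write it. The helper names `store_text` and `read_text`, the file name, and the use of Shift_JIS as the second user's differing setting are all assumptions made for illustration.

```python
import os
import tempfile


def store_text(path: str, text: str, encoding: str) -> None:
    """Write text as raw bytes; the file itself records no charset label."""
    with open(path, "wb") as f:
        f.write(text.encode(encoding))


def read_text(path: str, encoding: str) -> str:
    """Decode the stored bytes using the reader's own encoding setting."""
    with open(path, "rb") as f:
        return f.read().decode(encoding, errors="replace")


original = "日本語のメモ"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    store_text(path, original, "iso2022_jp")       # first user's setting
    same_setting = read_text(path, "iso2022_jp")   # second user, same setting
    other_setting = read_text(path, "shift_jis")   # second user, different setting

print(same_setting == original)   # True
print(other_setting == original)  # False
```

Because the stored file does not say how it was encoded, a correct display depends entirely on the second user happening to share the first user's setting.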