Client-side scripting languages such as JAVASCRIPT® by Netscape Communications of Dulles, Va. and VBSCRIPT® or JSCRIPT® by Microsoft Corporation of Redmond, Wash. do not include functionality for composing character values from encoded byte sequences. Rather, these scripting languages treat each character within a string as an atomic entity.
This handling of character values becomes an issue because there is a mismatch between the way the C programming language handles character data and the way these client-side scripting languages handle it. C is the implementation language for many web servers, such as APACHE HTTPD® maintained by the Apache Software Foundation of Forest Hill, Md. Within the C programming language, character strings are represented as arrays of small integer values (typically 8 bits each, although 16 bits per character is also possible). The C programming language relies upon a standard library to provide interpretation and rendering of character data. Within the language itself, however, character data is just binary data. The same problem exists for web servers implemented in C++ and similar programming languages that handle character data as arrays of integers.
In contrast, scripting languages like JAVASCRIPT® do not compose character values from encodings. When using a Unicode Transformation Format (UTF)-8 encoding of characters outside the range 0 to 127, client-side programmers must be careful to handle the encoding and decoding correctly, or the JAVASCRIPT® programs may generate strings with inappropriate encodings. For example, the character “π” corresponds to the code point U+03C0 in Unicode 2.0. In UTF-8, this is represented by two bytes of data, 0xcf and 0x80 (in hexadecimal representation). A string containing the character “π” can be constructed by passing the value 0x3c0 to the String.fromCharCode( ) method in JAVASCRIPT®, but passing the bytes 0xcf, 0x80 will result in a two-character string Ï <pad>, where the second character is actually a control character.
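The difference described above can be reproduced with a few lines of script. The following is a minimal sketch, not a complete decoder: the helper name composeTwoByteUtf8 is hypothetical, and it assumes a well-formed two-byte UTF-8 sequence with no validation.

```javascript
// The code point U+03C0 passed directly yields one character, "π".
const fromCodePoint = String.fromCharCode(0x3c0);

// The UTF-8 bytes passed as if they were character values yield two
// characters: U+00CF ("Ï") followed by U+0080 (a control character).
const fromBytes = String.fromCharCode(0xcf, 0x80);

// Hypothetical helper (not part of any standard library): composes one
// character from a two-byte UTF-8 sequence of the form 110xxxxx 10xxxxxx.
// Longer sequences and error checking are deliberately omitted.
function composeTwoByteUtf8(b1, b2) {
  const codePoint = ((b1 & 0x1f) << 6) | (b2 & 0x3f);
  return String.fromCharCode(codePoint);
}

const composed = composeTwoByteUtf8(0xcf, 0x80);
```

Here fromCodePoint and composed both hold the single character “π”, while fromBytes holds the two-character string that results when the bytes are not composed.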
This scenario creates an additional problem in the context of cryptography, because most cryptographic algorithms operate on binary data without regard to character encodings. The cryptographic algorithms rely on external systems to manage character data appropriately. However, these external management systems do not exist in scripting languages like JAVASCRIPT®. As a result, enciphering data at a server to be deciphered by a scripting language at a client becomes unreliable when character data is involved.
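The unreliability can be illustrated with a short sketch. Everything here is an assumption made for illustration: xorBytes is a hypothetical stand-in for a real cipher (a repeating-key XOR, which is not secure), decodeUtf8Short is a hypothetical helper that handles only one-byte and two-byte UTF-8 sequences, and the key and plaintext are arbitrary.

```javascript
// Hypothetical stand-in for a real cipher: XOR each byte with a repeating key.
// It operates on bytes only and knows nothing about character encodings.
function xorBytes(bytes, key) {
  return bytes.map((b, i) => b ^ key[i % key.length]);
}

// Hypothetical helper: decodes UTF-8 sequences of one or two bytes only,
// with no validation of the input.
function decodeUtf8Short(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    if (b < 0x80) {
      out += String.fromCharCode(b);
    } else {
      out += String.fromCharCode(((b & 0x1f) << 6) | (bytes[++i] & 0x3f));
    }
  }
  return out;
}

const key = [0x5a, 0xa5];                          // arbitrary illustrative key
const plaintextBytes = [0x32, 0xcf, 0x80];         // UTF-8 bytes for "2π"
const cipherBytes = xorBytes(plaintextBytes, key); // as enciphered at a server
const recovered = xorBytes(cipherBytes, key);      // as deciphered at a client

// Treating each deciphered byte as a character gives three characters.
const naive = String.fromCharCode(...recovered);

// Composing characters from the bytes gives the intended two characters.
const decoded = decodeUtf8Short(recovered);
```

The deciphering step recovers the original bytes exactly, yet naive is a three-character string while decoded is the intended “2π”. The cipher itself is not at fault; the missing piece is the character-composition step that the scripting language does not supply.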