Worldwide globalization has affected many industries, providing both tremendous opportunity and overwhelming problems. Many of these problems are related to the complex cultural differences between nations and their people, while some lie at a more basic level. One such basic problem is that of language. One industry in particular is acutely aware of this basic problem as language forms the basis for its operational systems and products. This industry is the computer industry.
While the computer industry has enjoyed a tremendous growth in the global market, problems relating to multi-language data input, processing, display, interchange, and printing have proved to be serious impediments to the realization of the potential growth of this industry on a worldwide scale. While humans are adept at mastering multiple languages, computer application programs and hardware drivers are generally written in one language to serve a primary market. Extension of these programs and drivers to other countries and other languages requires extensive redesign and re-coding, which delays the availability and increases the cost of such programs and drivers.
This problem exists because of the way that computers identify the various glyphs of worldwide languages. While computer programs operate internally on a binary basis, the requirement of a human interface forces the computers to display and print glyphs which are understandable to humans.
To allow for a readable human interface with the binary operation of the computer, various standards have been established to allow a computer to print and display human readable glyphs. One such standard is the American Standard Code for Information Interchange (ASCII) which utilizes a 7-bit code and 8-bit extensions to identify either 128 or 256 different glyphs respectively. While such a standard is adequate to display and print glyphs utilized in the English language, it does not include provision for many international characters used around the world. Therefore, to allow for display and printing of characters utilized in other languages, various other standards, such as the ISO International Register of Character Sets, the ISO/IEC 6937 and ISO/IEC 8859 families of standards, as well as the ISO/IEC 8879 (SGML) standards, were developed. Other national and industry standards were also developed (including code pages and character sets from Adobe, Apple, Fujitsu, Hewlett Packard, IBM, Lotus, Microsoft, NEC, WordPerfect, and Xerox).
Unfortunately, these various national and international standards cannot utilize common coding of their glyphs because only 256 separate glyphs can be addressed with 8 bits. This lack of a common representation for a given glyph code presents serious problems when international exchange of data through, for example, e-mail is considered. Specifically, if a user generates an e-mail message utilizing one national standard and transmits that message to a user in a different country whose computer operates on a different national standard, the characters displayed to the recipient of the e-mail message will quite likely be garbled. This is because the recipient's graphics device interface (GDI) will interpret the glyph codes differently than the application program from which the message was generated.
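The garbling described above can be reproduced in a few lines. The following is a minimal sketch, not part of the original disclosure: it assumes Python's built-in codecs, and the two 8-bit code pages used (Windows-1252 and Windows-1251) are chosen purely as examples of differing national standards. The helper name decode_with is hypothetical.

```python
# Illustration only: the same 8-bit codes name different glyphs
# under two different code pages (both chosen as examples).

def decode_with(code_page: str, data: bytes) -> str:
    """Interpret raw 8-bit glyph codes according to the given code page."""
    return data.decode(code_page)

# The sender encodes a German word using a Western European code page.
message = "Grüße".encode("cp1252")

# A recipient using the same code page sees the original text.
print(decode_with("cp1252", message))

# A recipient using a Cyrillic code page sees different glyphs
# for every byte above the 7-bit ASCII range.
print(decode_with("cp1251", message))
```

Only the two bytes above the 7-bit range change meaning here; the ASCII letters survive because both code pages share the lower 128 codes.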
To further complicate the international computing language problem, many of the Far East languages utilize character sets which have well over 20,000 glyphs which must be displayed and printed. Unfortunately, a single byte coding of these characters will only be able to address a maximum of 256 of these over 20,000 glyphs by using all 8 bits in a single byte. Recognizing this problem, application program developers and computer hardware manufacturers have developed yet another coding standard which utilizes 2 bytes (16 bits) to identify these Far Eastern characters. However, as described above, the requirement of a separate interface type function requires additional programming, increased cost, and delayed availability of many programs originally developed for western application.
Recognizing this problem, the Unicode Consortium was formed in 1988 to develop a true global character identification standard. The goal of this consortium was to develop a standard which would allow the unique identification of all of the world's characters for every modern and many ancient languages. As a result of its efforts, the Unicode Consortium has developed the Unicode Standard, now in version 2.1, available from Addison-Wesley Developers Press 1997, with updates and modifications available on the Internet at http://www.unicode.org. This standard is hereby incorporated by reference.
The Unicode Standard utilizes a double byte system (16 bits) which allows the unique identification of 65,536 separate characters. While this number is anticipated to be more than sufficient to individually identify characters from all the world's languages, one million additional characters are accessible through the surrogate extension mechanism, where two 16-bit code values represent a single character. While full implementation of the Unicode Standard is anticipated to overcome the problems described above, a vast majority of the computer hardware and software available and in use today does not recognize the Unicode double byte character identification standard. Much of the hardware and software existing in the western world currently only understands single byte characters, and therefore will continue to require additional coding to allow utilization on an international scale.
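The surrogate extension mechanism can be illustrated with a short calculation. The sketch below is an illustration only; the function name to_surrogate_pair and the constant SURROGATE_CAPACITY are hypothetical. It assumes the surrogate ranges defined by the Unicode Standard (high surrogates 0xD800 to 0xDBFF, low surrogates 0xDC00 to 0xDFFF), each of which holds 1,024 values, giving 1,024 x 1,024 = 1,048,576 additional characters.

```python
from typing import Tuple

# 1,024 high surrogates times 1,024 low surrogates: roughly one million.
SURROGATE_CAPACITY = 1024 * 1024

def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a character beyond the 16-bit range into two 16-bit code values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the surrogate-addressable range")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # lower 10 bits
    return high, low

high, low = to_surrogate_pair(0x10400)
print(f"{high:04X} {low:04X}")
```

Characters within the basic 16-bit range need no such split and are identified by a single 16-bit code value.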
One such piece of computer equipment in widespread use is the computer printer. While a computer printer can print any glyph, English, international, as well as Far Eastern glyphs, the mechanism by which a majority of these international characters are printed greatly handicaps the printing performance of the device. This is because the printing of international characters is accomplished by the computer interpreting the international character as a bitmap graphic, and transmitting the bitmap image data to the printer to allow the printer to draw the international character as a bitmap picture of the character. This typically requires approximately 2,000 bytes of data to be transmitted to the printer to print a single international character. Even with the high speed, sophisticated equipment available today, this transfer of bitmap data to allow a printer to draw an international character greatly slows the printing performance of the device.
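The cost of bitmap transfer relative to sending a glyph code can be estimated with simple arithmetic. The following sketch is an illustration under stated assumptions: the 128 x 128 pixel, 1-bit-per-pixel character cell is an assumed value chosen because it lands near the figure of approximately 2,000 bytes given above, and the function name bitmap_bytes is hypothetical.

```python
import math

def bitmap_bytes(width_px: int, height_px: int, bits_per_pixel: int = 1) -> int:
    """Bytes needed to send one character as a bitmap, with rows padded to whole bytes."""
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px

# Assumed character cell: 128 x 128 pixels, monochrome.
per_char_bitmap = bitmap_bytes(128, 128)

# A device font needs only the identifying code: 1 or 2 bytes per character.
per_char_code = 2

print(per_char_bitmap, per_char_bitmap // per_char_code)
```

Under these assumptions, each character sent as a bitmap costs on the order of a thousand times more data than the same character identified by a 2-byte code.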
To allow for enhanced printing performance, most printers include device fonts which are resident within the printer itself and require only the unique identification of a glyph to be transferred to the printer to allow printing of that glyph. This simple transfer of a unique identifying code typically uses one of the above-identified 8-bit standards such as, e.g., ASCII. A font is a complete assortment of characters that have common design and size. A typical font supports more than 256 characters. So, the characters within a font must be grouped into multiple symbol sets each having only 256 characters. Therefore, the symbol set identifies a specific collection of symbols provided by the font, with each symbol set being defined with a specific application in mind. For example, a German language symbol set will have German language specific symbols, while an English language symbol set will have only English symbols. Unfortunately, current printer drivers are unable to switch between symbol sets, requiring that separate drivers be provided to allow the enhanced printing performance of device fonts for each separate language. That is to say, a computer operating in America requires a different printer driver than a computer operating in, e.g., Germany to be able to use device fonts to print. As discussed above, this increases the development time, increases the cost, and delays the availability of these programs on an international basis. This problem is acute when importation into Far East countries is considered. As described above, the sheer number of characters which must be printed requires 2 bytes to identify each character. Therefore, the development time and cost are greatly increased for release of these programs in the Far Eastern countries.
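The grouping of a font's characters into 256-character symbol sets, and the switching between them that current drivers lack, can be sketched as follows. This is a hypothetical illustration, not a description of any actual printer language or driver: the names build_symbol_sets and encode_for_printer and the SELECT_SET and CODE command labels are invented here, and real symbol sets are defined per application rather than by slicing the font in order as this sketch does.

```python
from typing import Dict, List, Tuple

SYMBOL_SET_SIZE = 256  # at most 256 glyphs are addressable with one 8-bit code

def build_symbol_sets(font_chars: str) -> Dict[str, Tuple[int, int]]:
    """Map each character of a font to a (symbol set index, 8-bit code) pair."""
    return {ch: divmod(i, SYMBOL_SET_SIZE) for i, ch in enumerate(font_chars)}

def encode_for_printer(text: str, table: Dict[str, Tuple[int, int]]) -> List[Tuple[str, int]]:
    """Emit glyph codes, inserting a symbol set selection whenever the set changes."""
    commands: List[Tuple[str, int]] = []
    current_set = None
    for ch in text:
        set_id, code = table[ch]
        if set_id != current_set:
            commands.append(("SELECT_SET", set_id))  # hypothetical switch command
            current_set = set_id
        commands.append(("CODE", code))
    return commands

# A hypothetical font with 512 characters, which therefore needs two symbol sets.
font = "".join(chr(c) for c in range(0x20, 0x220))
table = build_symbol_sets(font)
print(encode_for_printer("A\u0120A", table))
```

Each character still travels as a single 8-bit code; only a change of symbol set adds an extra command to the data stream.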
In addition to the tradeoff between the speed of utilizing device fonts, with the associated increased development and support costs, and the reduced printing performance of printing international characters as bitmaps, there is a problem of character spacing. Since a font describes not only the height, pitch, style, stroke, weight, typeface, and orientation of a character, but also its spacing, reporting the widths of the characters precisely to the application is very important for correct formatting. Specifically, most western characters are printed on a proportional basis whereby the space a character takes up on the printed page is proportional to its character width. That is to say, an “i” is given less space on a page than a “W”. Other languages, Far Eastern languages in particular, utilize fixed spacing for their characters, i.e. all characters occupy the same width regardless of the width of the individual character. If English were printed in a fixed pitch, both the “i” and “W” would be given the same width space on the printed page, regardless of the fact that the “i” has a much narrower width than the “W”. Unfortunately, current printer drivers are unable to switch between these printing formats, which results in the printed document being different than the same document viewed on a display. This is because, e.g., in Far Eastern countries documents containing both English and Far Eastern characters are all printed in fixed pitch.
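The difference between fixed and proportional pitch, and its effect on the reported line width, can be shown with a small calculation. The sketch below is an illustration only: the width values are invented units, the function names are hypothetical, and the use of the Unicode East Asian Width property (via Python's unicodedata.east_asian_width) to decide which characters receive fixed spacing is an assumption made for this example rather than a method stated in the text.

```python
import unicodedata
from typing import Dict

def advance_width(ch: str, proportional: Dict[str, int], full_width: int) -> int:
    """Fixed pitch for wide East Asian characters, proportional pitch otherwise."""
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return full_width
    return proportional[ch]

def mixed_pitch_width(text: str, proportional: Dict[str, int], full_width: int) -> int:
    """Line width when pitch is switched per character within one document."""
    return sum(advance_width(ch, proportional, full_width) for ch in text)

def fixed_pitch_width(text: str, full_width: int) -> int:
    """Line width when every character is forced into the same fixed cell."""
    return full_width * len(text)

widths = {"i": 3, "W": 10}   # hypothetical proportional widths
cell = 12                    # hypothetical fixed cell width

print(mixed_pitch_width("iW\u6f22", widths, cell))
print(fixed_pitch_width("iW\u6f22", cell))
```

When the two results differ, the line breaks computed by the application no longer match what the printer produces, which is the display-versus-print discrepancy described above.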
There is, therefore, a need for a printer driver which is able to interpret, and therefore take advantage of, the new Unicode Standard, but which is also able to utilize the device fonts of existing printers. There is also a need for a printer driver which supports multiple symbol sets to allow full support of all characters in a font thereby precluding the need for separate printer drivers to be developed for particular countries' applications, and which is able to switch between fixed and proportional pitch printing for eastern and western characters in the same document.