1. Field of the Invention
This invention relates to the presentation of graphic symbols on a display in a data processing system, and more particularly to the means for parsing the data stream that represents the graphic symbols to be displayed.
2. Background Art
In a processing system, such as the IBM RT PC, having a monochrome display, a display manager regulates the output to the monochrome display. The display manager in the processing system interprets the data stream that is sent to the display using a fixed syntax. There is a character generator within the processing system which displays alphanumeric characters on the display according to this fixed syntax. In this type of system, there is no way either to vary the syntax used in interpreting the data stream, or to vary the representation of the displayed alphanumeric characters created by the character generator. The representation on the display can be changed only by sending a different data stream to the display manager.
Similarly, in all points addressable (APA) displays, the data stream goes into the display manager where it is decoded by the fixed syntax in the display manager. However, after the data stream has been processed in the display manager, it can be displayed in various ways through different interchangeable fonts. The user can specify which font to use to display a data stream. Through these different fonts, a user can display different type styles such as italics or bold, and/or different sizes. Various other displayable aspects can also be interchanged. At this point, because fonts are being changed, it is possible to change the interpretation of a code point within a given data stream.
For example, if the code point hexadecimal 41 is an "A", which is the way it is defined in the ASCII (American National Standard Code for Information Interchange, published by the American National Standards Institute (ANSI)) standard, in the monochrome display, not only is it displayed as an "A", but it is a specific embodiment of an "A". It is an "A" having a certain size, slant, and design. Specific picture elements, pels, are turned on to represent the "A" which can't be changed.
Through the use of interchangeable fonts in APA displays, the code point hexadecimal 41 may be varied to be a different design of an "A" such as italic, or bold, or different size, etc. Also, by selecting a completely different font, the user can decide that the code point hexadecimal 41 is not an "A" at all, but is another graphical symbol.
A data stream is made up of code points that all have the same bit width. A bit width which can be used in standard ASCII and which will be used in the description of this invention is eight bits, although other bit widths may also be employed such as sixteen bits, thirty-two bits, etc. Each byte that makes up a data stream is referred to as a code point. Because a byte is made up of eight bits, there are 256 code points from 0-255. With these 256 code points, one can express up to 256 different displayable graphical symbols.
The term "graphical symbol" includes ordinary alphanumeric characters along with other symbols. Displayable graphical symbols are referred to as "glyphs". An illustration of these 256 codes for a set of graphical symbols is shown in FIG. 1A. However, not all of the 256 codes are used for displayable graphical symbols.
As shown in FIG. 1A, the first thirty-two code points 101-132 in the code page "P0" 100 are reserved for control codes 15. Control codes 15 are different from graphic codes 17. Some of the control codes that are embedded in the data stream affect the format of the displayable codes on a display or printer output. The control codes listed in the ANSI standard control format parameters such as backspace, horizontal tab, line feed, vertical tab, form feed, carriage return, shift out, shift in, and escape, etc. Escape is an important control code since it starts an escape or control sequence, which is a multi-byte sequence. An escape specifies the beginning of a longer control sequence, which is also defined in an orthodox way by the ANSI standard.
There are also communication controls such as acknowledge, no acknowledge, sync, cancel, start of header, and end of header. Not all control codes are supported by various manufacturers of processing systems. Without knowledge of which code points are control codes, the data stream cannot be adequately interpreted and formatted.
Other control codes are referred to as code page shift controls 115, 116, 129-132 (FIG. 1A). If a processing system has the capability of displaying more than 256 symbols, minus the code points required for control codes, then there is a display symbol range for a processing system. Typically, a full range of displayable symbols is divided into code pages, i.e., ranges of 256 symbols. A code page shifter is then needed to access these different code pages.
A code page is an organization of code points. One code page usually represents one set of 256 code points. For example, a first code page might say that a hexadecimal 41 is an "A". Another code page might say that a hexadecimal 41 is a "%". In the description of this invention, the standard ASCII code pages with some variations will be referenced as shown in FIGS. 1A, 1B, and 1C.
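The idea that a code page is simply an assignment of glyphs to code points can be illustrated with a minimal sketch. The two dictionaries below are hypothetical and hold only the single entry mentioned in the text; they are not the contents of FIGS. 1A, 1B, or 1C.

```python
# Hypothetical code pages: each maps a code point (0-255) to a glyph.
# Only the entry discussed in the text is filled in.
first_code_page = {0x41: "A"}
other_code_page = {0x41: "%"}


def glyph_for(code_page, code_point):
    """Return the glyph a code page assigns to a code point, or None."""
    return code_page.get(code_point)


# The same code point yields a different glyph under each code page.
print(glyph_for(first_code_page, 0x41))
print(glyph_for(other_code_page, 0x41))
```

The data stream byte is identical in both cases; only the code page used to interpret it differs.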
FIGS. 1A, 1B, and 1C represent three code pages. Code points hexadecimal 00 to hexadecimal 1F are control codes in all three of the code pages. That is, these code points lie outside the scope of the code pages: they are control codes regardless of which code page is being utilized.
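The split between control codes and graphic codes described above can be sketched as a simple range test. The names `CONTROL_LIMIT`, `is_control`, and `classify` are illustrative assumptions, not part of any display manager.

```python
# Assumed constant: code points hexadecimal 00-1F are control codes
# in every code page of FIGS. 1A-1C.
CONTROL_LIMIT = 0x20

# Code points left over for displayable symbols in one 256-entry code page.
GRAPHIC_CODES_PER_PAGE = 256 - CONTROL_LIMIT


def is_control(code_point):
    """True if the code point falls in the reserved control range."""
    return 0x00 <= code_point < CONTROL_LIMIT


def classify(data):
    """Label each byte of a data stream as 'control' or 'graphic'."""
    return ["control" if is_control(b) else "graphic" for b in data]


print(GRAPHIC_CODES_PER_PAGE)
print(classify(bytes([0x0E, 0x41, 0x1F, 0x20])))
```

Because the test does not consult any code page, the result is the same whichever code page is active.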
A version of the ASCII standard used in the RT PC, called RTASCII, allows for code page shifting. Since more than 256 displayable codes are available, a method was defined to shift into another code page. In the standard RTASCII, the method was to send in the data stream a multi-byte control which would set up a code page "P0" 100 (FIG. 1A) and a code page "P1" 150 (FIG. 1B). These escape sequences loaded two different logical slots. For example, for the "G0" logical slot, "P0" code page would be utilized. For the "G1" logical slot, "P1" code page would be utilized. Once these code pages were loaded by this multi-byte control, a user could use a Shift Out 115 (FIG. 1A) or Shift In 116 (FIG. 1A) code, which are single-byte control codes located in the "0E" and "0F" hexadecimal positions, respectively, in FIG. 1A. Then, if a Shift Out 115 were used in the data stream, the second code page would be utilized. Subsequent code points would then reference this second code page until a Shift In 116 code returned to the first code page. This is referred to as a locking shift since the subsequent code points are locked into the next code page until a subsequent shift code is sent.
For example, if code points hexadecimal 61, hexadecimal 62, hexadecimal 63 were sent in a data stream, they would be defined to be in the default code page "P0" 100 (FIG. 1A) and represented by the graphical symbols "a" 141, "b" 142, and "c" 143, respectively. If a Shift Out code 115 were received, it would be understood to go to the P1 150 (FIG. 1B) code page which would be the next 256 (minus the thirty-two control codes) symbols. If the code points hexadecimal 61, hexadecimal 62, hexadecimal 63 then followed the shift out code 115 in the data stream, the symbols 151, 152, 153 (FIG. 1B) would be represented.
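The locking shift just described can be sketched as a small state machine. This is a minimal illustration, assuming the "G0" and "G1" slots have already been loaded with "P0" and "P1"; the multi-byte control that loads them is not modeled, and the function name `parse_locking` is hypothetical.

```python
SHIFT_OUT = 0x0E  # Shift Out 115: lock into the G1 slot
SHIFT_IN = 0x0F   # Shift In 116: return to the G0 slot


def parse_locking(data):
    """Return (code_page, code_point) pairs for each graphic code."""
    slots = {"G0": "P0", "G1": "P1"}  # assumed to be loaded beforehand
    active = "G0"
    result = []
    for byte in data:
        if byte == SHIFT_OUT:
            active = "G1"
        elif byte == SHIFT_IN:
            active = "G0"
        elif byte < 0x20:
            continue  # other control codes are ignored in this sketch
        else:
            result.append((slots[active], byte))
    return result


# 61 in P0, then Shift Out, 61 and 62 in P1, Shift In, 63 back in P0.
print(parse_locking(bytes([0x61, 0x0E, 0x61, 0x62, 0x0F, 0x63])))
```

The shift state persists across any number of graphic codes, which is what makes the shift "locking".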
Another method of shifting code pages is called a non-locking shift or single shift. The single shifts are "SS1" 132, "SS2" 131, "SS3" 130, and "SS4" 129. When these codes are received, only the next eight bits are interpreted in the code page specified. A different code page is accessed for only the next eight bits, and then the original code page is once again used.
In a non-locking shift, generally, only one code page is used most of the time. A second code page is utilized for only one symbol. For example, in text that has an equation, there may be a symbol in the equation that appears in a second code page. This may be the only time that symbol is ever used in the text document. Instead of shifting out of the first code page and into the second code page, and then shifting back into the first code page, it is more efficient to continue the data stream and use the "SS1" control code point hexadecimal 1F to get to another range of display symbols. The non-locking shift tells the display manager to look at the next eight bits. Those next eight bits are displayable by the code page defined by "SS1". After this, the display manager goes back to the original code page for the following eight bits.
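For comparison, a non-locking shift can be sketched as follows. Since the text does not say which code page each single shift selects, the sketch labels each graphic code with the name of the shift that preceded it, or "base" for the original code page; `parse_single_shift` is a hypothetical name.

```python
# Single shift code points as given in the description.
SINGLE_SHIFTS = {0x1F: "SS1", 0x1E: "SS2", 0x1D: "SS3", 0x1C: "SS4"}


def parse_single_shift(data):
    """Return (selector, code_point) pairs for each graphic code."""
    result = []
    pending = None  # single shift waiting to be applied, if any
    for byte in data:
        if byte in SINGLE_SHIFTS:
            pending = SINGLE_SHIFTS[byte]
        elif byte < 0x20:
            continue  # other control codes are ignored in this sketch
        else:
            result.append((pending or "base", byte))
            pending = None  # the shift covers only one code point
    return result


# Only the code point right after SS1 is taken from the shifted page.
print(parse_single_shift(bytes([0x61, 0x1F, 0x62, 0x63])))
```

One control byte suffices here, whereas the locking approach would need a Shift Out before the symbol and a Shift In after it.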
The single shifts "SS1" to "SS4" occupy hexadecimal 1C to hexadecimal 1F, with "SS4" 129 at hexadecimal 1C and "SS1" 132 at hexadecimal 1F. Since hexadecimal 1C to hexadecimal 1F are less than hexadecimal 20, the processing system knows that these are control codes and not displayable codes. When these four single shift codes are used, the display manager knows they are single shift codes. Not only does the display manager know that they are shifter codes, the display manager also knows exactly where these codes shift to. The display manager knows that a certain code will shift the base pointer by 256 or 128 or whatever is needed to reach another code page. This is what is meant by saying that the syntax knowledge is contained in the display manager.
The locking shift and single shift are the two RTASCII defined methods of getting to more than 256 displayable symbols. With either of these methods, the display manager must recognize the predetermined codes that are being used for code page shifters 115, 116, 129, 130, 131, 132. The display manager examines each byte in the data stream coming in, and if it is a displayable graphic symbol, it displays the graphical symbol according to the font pattern for that code in the font file. The display manager knows both the multi-byte control sequences and the various types of single byte controls that cause it to shift to another code page.
For example, if a hexadecimal 1F code is received in the data stream, the display manager knows that the hexadecimal 1F is not displayable since it is a single-byte code page shifter "SS1" 132 (FIG. 1A). Therefore, the font is not accessed. The display manager stores the fact that there has been a code page shift. The display manager adjusts the base pointer which points to the beginning of the range of the display symbols which will be accessed by the next code point which, for correct processing, should be a graphic code. The next graphic code will be the offset from this base pointer.
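The base pointer arithmetic can be made concrete with a short sketch. It assumes, purely for illustration, that the font is a flat table of glyph entries, that each code page spans 256 entries, and that "SS1" selects the page starting at entry 256; none of these values is stated in the text.

```python
PAGE_SIZE = 256     # assumed span of one code page in the font table
SS1 = 0x1F          # single shift "SS1" 132
SS1_BASE = 1 * PAGE_SIZE  # assumed start of the page SS1 selects


def resolve_font_indices(data):
    """Return the font table index used for each graphic code."""
    base_pointer = 0
    indices = []
    for byte in data:
        if byte == SS1:
            # The shifter itself is not displayed; only the base moves.
            base_pointer = SS1_BASE
        elif byte >= 0x20:
            # The graphic code is an offset from the current base pointer.
            indices.append(base_pointer + byte)
            base_pointer = 0  # a single shift applies to one code only
    return indices


# 41 alone, then SS1 followed by 41, then 42 back in the original page.
print(resolve_font_indices(bytes([0x41, 0x1F, 0x41, 0x42])))
```

The same byte, hexadecimal 41, resolves to entry 65 or entry 321 depending on where the base pointer sits when it arrives.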
A processing system 25 (FIG. 2) known in the art is the IBM RT PC. Additional information on the RT PC can be found in IBM RT Personal Computer: General Information, Document Number GC23-0783-1. The processing system 25 which runs applications 21 has an operating system 22 such as AIX (AIX is a trademark of IBM). Additional information on the AIX operating system can be found in IBM RT Personal Computer: AIX Operating System Technical Reference, Document Number SC23-0808-0. The presentation of the display screen 23 is controlled through the display manager 28. The display manager 28 may receive input from the operating system 22, keyboard 26 or application 21 for display to the screen 23.
Previously, a processing system 25 was hard-coded, i.e. programmed with executable code, by the manufacturer of the processing system, to represent a processing model for a data stream. The term processing model 18 is used in the art to mean a set of rules that define which bytes in the data stream represent graphical symbols, and which bytes represent a control such as a code page shifter, etc. A processing model 18 essentially allows the processing system to differentiate the graphic codes from control codes for a particular code set. This was typically done in a display manager 28 which made hard-coded assumptions about the data stream that was sent to it.
For example, for a given standard data stream derived from ASCII, such as RTASCII, the hexadecimal codes 1C, 1D, 1E, 1F may be designated as code page shifters. The processing model in the display manager checks each byte in the data stream to see if it is one of these four control codes for page shifting.
If a different standard were used for the data stream, these same four hexadecimal codes might no longer represent control codes for page shifting, or additional codes might be considered to be code page shifters, as well. Therefore the display manager could not use the previous processing model for determining which codes are control codes and which codes are graphical symbols.
For example, the Japanese language is quite complex with over 6,000 graphical symbols. Consequently, more than four page shifters are needed. If there are four shifters, one can bump the base pointer to four different code pages. With over 6,000 displayable codes divided into units of 256, many more shifters are needed to reach all of the units. Therefore, in a version of the Japanese Industrial Standard (JIS) called Shifted-JIS, there are additional control codes which are different from the RTASCII standard in order to support the complexity of that language.
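A rough count shows why four shifters fall short. The figure of 6,000 symbols is the lower bound given in the text; the second calculation assumes, as in FIGS. 1A-1C, that thirty-two code points per page are reserved for control codes.

```python
import math

SYMBOLS = 6000  # lower bound on Japanese graphical symbols from the text

# Units of 256 needed if every code point in a page held a symbol.
pages_of_256 = math.ceil(SYMBOLS / 256)

# Units needed if 32 code points per page are reserved for control codes.
pages_of_224 = math.ceil(SYMBOLS / (256 - 32))

print(pages_of_256)  # 24
print(pages_of_224)  # 27
```

Either way, the number of code pages to be reached is several times larger than four single shifts can address.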
The written Japanese language includes Romaji, the Roman alphabet, Katakana and Hiragana, which are phonetic alphabets, and Kanji, which consists of ideographic forms. Shifted-JIS standards describe the Japanese graphic character set and code pages for the greater than 6,000 graphical symbols used in the written Japanese language. The Shifted-JIS standards are further described in the publications titled "IBM Registry Graphics Characters Sets and Code Pages" document number C-H 3-3220-050, and "IBM Japanese Graphic Character Set, KANJI" document number C-H 3-3220-024.
The two code page systems, RTASCII and Shifted-JIS are incompatible. They are incompatible because the page shifters are not the same in the different code pages. In the Shifted-JIS code page 170 (FIG. 7), there are control codes 15 where other standard code pages have graphical symbols 17. For example, codes hexadecimal 81 to hexadecimal 9F in Shifted-JIS (FIG. 7) are code page shifters. They are not displayable characters. In RTASCII (FIGS. 1A, 1B, 1C), which is used for U.S. and NLS (National Language Support) data streams, those same codes are displayable symbols. Therefore the display manager which understands the syntax of RTASCII would try to display those characters if given the Shifted-JIS data stream. This would result in an error since each one of these languages has a distinct data stream syntax. As a result, the code pages of Shifted-JIS are incompatible with the code pages of FIGS. 1A, 1B, and 1C.
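The incompatibility can be seen by classifying one data stream under both sets of rules. The sketch below is a simplification that uses only the ranges named in the text; treating code points below hexadecimal 20 as control codes under Shifted-JIS is an assumption made for the illustration.

```python
def rtascii_kind(code_point):
    """Classify a byte under the RTASCII rules described in the text."""
    return "control" if code_point < 0x20 else "graphic"


def shifted_jis_kind(code_point):
    """Classify a byte under the Shifted-JIS rules described in the text."""
    if code_point < 0x20:  # assumed to be control codes here as well
        return "control"
    if 0x81 <= code_point <= 0x9F:
        return "shifter"  # code page shifters, not displayable
    return "graphic"


stream = bytes([0x41, 0x81, 0x9F])
print([rtascii_kind(b) for b in stream])
print([shifted_jis_kind(b) for b in stream])
```

A display manager coded with the first function would attempt to draw hexadecimal 81 and 9F, while the data stream intended them as shifters.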
One approach is to build a Shifted-JIS processing system that is separate from the RTASCII NLS processing system. Separate processing systems would be needed to understand the different code pages and which different code points in each machine were control code shifters, and to understand how much each code shifter shifted the base pointer.
In order to handle a variety of data stream syntaxes that have different or additional control codes, such as the Japanese Industrial Standard (JIS), or the National Language Support (NLS), the display manager has to be recoded to now check for the newly specified control codes. In other words, a new processing model has to be created. As such, the same hard-coded (programmed) display manager cannot be used for different data streams having different code set representations.
It is known in the art for a manufacturer of a processing system to offer to its customers a processing system that allows a user to select a first or second data stream standard. In this case the manufacturer has programmed the display manager in two ways for two different processing models. If the user selects the first standard, the display manager invokes the first programmed routine representing a first processing model. If the user selects the second standard, the display manager invokes the second programmed routine representing a second processing model.
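This prior-art arrangement amounts to choosing between two pre-programmed routines. The following sketch is hypothetical: the routine names, the `MODELS` table, and the simplified rules inside each routine are assumptions used only to show the selection mechanism.

```python
def first_model(data):
    """Hard-coded routine: bytes below hexadecimal 20 are control codes."""
    return [b for b in data if b >= 0x20]


def second_model(data):
    """Hard-coded routine: also treats hexadecimal 81-9F as shifters."""
    return [b for b in data if b >= 0x20 and not 0x81 <= b <= 0x9F]


# Only the standards the manufacturer coded in advance are available.
MODELS = {"first": first_model, "second": second_model}


def displayable_codes(data, standard):
    """Run the whole data stream through the one selected routine."""
    return MODELS[standard](data)


stream = bytes([0x41, 0x81, 0x1C])
print(displayable_codes(stream, "first"))
print(displayable_codes(stream, "second"))
```

The selection is made once for the entire stream, and a standard absent from the table cannot be selected at all.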
This approach is limited in its usability. First, the user, i.e. customer, is limited to the data stream standards that the manufacturer has previously chosen, and for which the display manager has been coded to meet the requirements of the specific processing model for the chosen data stream standard. Second, the user can send for display a data stream that uses only one standard or code set at a time. For example, if a first code set had codes hexadecimal 1C to hexadecimal 1F as code page shifters, and a second code set had codes hexadecimal 81 to hexadecimal 9F as code page shifters, the display manager could not intermix the displayable symbols from both of these code sets at the same time.