This invention relates to providing glyphs in a text document.
A character is the smallest component of written language having semantic value. A character refers to an abstract meaning and/or shape, rather than a specific shape. A glyph is a representation of a character. A glyph image is the concrete image of a glyph after it has been rasterized or otherwise imaged onto a display surface.
An encoded character is a character that is associated with an encoding value, for example, a scalar value included in a character set standard such as ASCII (American Standard Code for Information Interchange) or Unicode. An encoding value maps to a set of character attributes defining semantic information of the character. Character set standards are defined by standards organizations: for example, the ASCII standard is defined by ANSI (American National Standards Institute), and the ISO 8859 standard is defined by ISO (International Organization for Standardization). Character set standards are generally revised from time to time. Typically, when a character set standard is defined, the encoding values are defined at the same time.
Character attributes can include one or more of the following: character case, character combining class, character directionality, character numeric value, mathematical character, character language, letter character, alphabetic character and ideographic character. Other character attributes are possible.
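As an illustration of how an encoding value maps to character attributes, the following minimal sketch queries Python's standard unicodedata module, which exposes part of the Unicode Character Database. The helper name character_attributes and the particular selection of attributes are assumptions made for this example; they are not defined by any character set standard.

```python
import unicodedata


def character_attributes(ch: str) -> dict:
    """Collect a few character attributes for one encoded character."""
    return {
        "encoding_value": ord(ch),                         # scalar value of the character
        "general_category": unicodedata.category(ch),      # e.g. "Lu" = uppercase letter
        "is_upper_case": ch.isupper(),                     # character case
        "combining_class": unicodedata.combining(ch),      # 0 for non-combining characters
        "directionality": unicodedata.bidirectional(ch),   # e.g. "L" = left-to-right
        "numeric_value": unicodedata.numeric(ch, None),    # None when the character has no numeric value
        "is_alphabetic": ch.isalpha(),
    }


if __name__ == "__main__":
    for ch in ("A", "5", "\u0301"):  # letter, digit, combining acute accent
        print(hex(ord(ch)), character_attributes(ch))
```

Note that none of these attributes describe appearance; they are semantic properties that remain the same regardless of the font used to display the character.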
A glyph can be associated with a set of glyph attributes defining appearance information for a representation of the corresponding character. Glyph attributes can include one or more of the following: glyph shape, glyph metrics, typeface name, glyph baseline and glyph kerning. Generally, glyph attributes provide the information necessary to render the glyph image.
A font is a collection of glyphs and a corresponding encoding mapping. A font is typically constructed to support a character set standard; that is, the font includes glyphs representing the characters included in the character set standard. When the character set standard is revised, the font manufacturer may need to revise the font to accommodate the changes, including the addition of new glyphs. In that case, the font is re-issued in a version conforming to the revised character set standard. Revising fonts is costly for the designer and inconvenient for users, who must track versions of the font and determine whether they have fonts supporting the latest character set standard.
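The description of a font as a collection of glyphs plus an encoding mapping can be sketched as a small data structure. This is a simplified model under stated assumptions, not the layout of any real font format: the class names Glyph and Font, the typeface name "ExampleSerif", the glyph identifiers and the metric values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Glyph:
    glyph_id: int
    shape: str          # stand-in for real outline data
    advance_width: int  # one glyph metric, in hypothetical font units


@dataclass
class Font:
    typeface_name: str
    glyphs: Dict[int, Glyph] = field(default_factory=dict)      # glyph id -> glyph
    encoding_map: Dict[int, int] = field(default_factory=dict)  # encoding value -> glyph id

    def add_glyph(self, encoding_value: int, glyph: Glyph) -> None:
        """Store the glyph and record which encoding value maps to it."""
        self.glyphs[glyph.glyph_id] = glyph
        self.encoding_map[encoding_value] = glyph.glyph_id

    def glyph_for(self, encoding_value: int) -> Optional[Glyph]:
        """Return the glyph for an encoding value, or None if the font lacks one."""
        glyph_id = self.encoding_map.get(encoding_value)
        return None if glyph_id is None else self.glyphs[glyph_id]


font = Font("ExampleSerif")
font.add_glyph(0x41, Glyph(glyph_id=1, shape="outline-A", advance_width=722))

# An encoding value the font was not built to cover has no glyph ...
missing_before = font.glyph_for(0x20AC)

# ... until the font itself is revised to add one.
font.add_glyph(0x20AC, Glyph(glyph_id=2, shape="outline-euro", advance_width=500))
```

In this model, supporting a newly added character means changing the font object itself, which mirrors the re-issue cost described above.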
Text documents typically include a text string made up of one or more encoding values that represent the characters in the text. An encoding value can map both to a character in a character set standard and to a glyph in a font constructed to support that standard. Thus, a text engine (e.g., a word processing application) processing an electronic document that includes a text string of encoding values can obtain character attribute information about an encoded character by mapping its encoding value to the character set standard. Using the same encoding value, the text engine can also render a representation of the character, that is, a glyph image, based on glyph attribute information obtained from a specified font. The associations between encoding values and attributes are typically available to a text engine through fixed, static tables indexed by encoding value. The attributes are not part of the document itself, but are usually built into the text engine or the operating system used by the application.
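The lookup described here can be sketched as two static tables indexed by the same encoding value, with the document holding nothing but the encoding values. The table names, the function resolve, the typeface name "ExampleSerif" and every table entry below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Static character attribute table, as might be built into a text engine.
CHARACTER_TABLE: Dict[int, dict] = {
    0x41: {"case": "upper", "alphabetic": True, "numeric_value": None},
    0x35: {"case": None, "alphabetic": False, "numeric_value": 5},
}

# Static glyph attribute tables, one per font, indexed by the same encoding values.
GLYPH_TABLES: Dict[str, Dict[int, dict]] = {
    "ExampleSerif": {
        0x41: {"shape": "outline-A", "advance_width": 722},
        0x35: {"shape": "outline-5", "advance_width": 500},
    },
}

Resolved = Tuple[int, Optional[dict], Optional[dict]]


def resolve(document_text: List[int], font_name: str) -> List[Resolved]:
    """Pair each encoding value with its character and glyph attributes."""
    glyph_table = GLYPH_TABLES[font_name]
    return [
        (value, CHARACTER_TABLE.get(value), glyph_table.get(value))
        for value in document_text
    ]


# The document stores only encoding values; all attributes come from the tables.
document_text = [0x41, 0x35, 0x20AC]
resolved = resolve(document_text, "ExampleSerif")
```

The last encoding value has no entry in either table, so both lookups yield None: the document can carry the value, but the engine has no attributes with which to process or render it.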
A character can be processed based on its character attributes and/or glyph attributes. For example, a layout engine that is setting text in vertical writing mode might handle numerals in a specialized way, or might handle a currency symbol differently from numerals in some contexts. As another example, attributes can be critical for input methods (e.g., a software agent used to assist in selecting Chinese or Japanese characters), as the user may need to choose a character based on the corresponding glyph's radical, stroke count or pronunciation. Thus, for a representation of a character (i.e., a glyph) to participate fully in an electronic document, the character attributes and the glyph attributes of the character and its corresponding glyph must be accessible to a text engine processing the electronic document.
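The layout example can be made concrete with a minimal sketch in which a layout step groups text into runs according to a character attribute, so that numerals and currency symbols can be given separate treatment. The function names and the three treatment labels are assumptions for this example; the sketch uses the Unicode general category from Python's standard unicodedata module and does not reproduce the behavior of any particular layout engine.

```python
import unicodedata
from itertools import groupby
from typing import List, Tuple


def vertical_treatment(ch: str) -> str:
    """Choose a (hypothetical) treatment label from the character's general category."""
    category = unicodedata.category(ch)
    if category == "Nd":   # decimal digit
        return "numeral"
    if category == "Sc":   # currency symbol
        return "currency"
    return "default"


def segment_for_vertical_layout(text: str) -> List[Tuple[str, str]]:
    """Split text into consecutive runs that share the same treatment label."""
    return [
        (label, "".join(run))
        for label, run in groupby(text, key=vertical_treatment)
    ]


if __name__ == "__main__":
    for label, run in segment_for_vertical_layout("Total: $1200 due"):
        print(label, repr(run))
```

The segmentation depends entirely on the attribute lookup succeeding: a character whose attributes the engine cannot access falls through to the default treatment, whatever its actual role in the text.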