The present invention relates to computer-implemented methods and apparatus for displaying text images.
A text document, such as one on a piece of paper, can be converted into a digital representation by digitization. A digital representation of a document can be divided into lexical units such as characters or words and each unit can be represented in a coded or noncoded representation.
A coded representation of text is character based; that is, it is a representation in which the text is represented as recognized characters. The characters are typically represented by character codes, such as codes defined by the ASCII or Unicode character encoding standards, but may also be represented by character names. The universe of characters in any particular context can include, for example, letters, numerals, phonetic symbols, ideographs, punctuation marks, diacritics, mathematical symbols, technical symbols, arrows, dingbats, and so on. A character is an abstract entity that has no inherent appearance. How a character is represented visually (e.g., as a glyph on a screen or a piece of paper) is generally defined by a font defining a particular typeface. In digital or computer-based typography applications, a digital font, such as any of the PostScript™ fonts available from Adobe Systems Incorporated of San Jose, Calif., generally includes instructions (commonly read and interpreted by rendering programs executing on computer processors) for rendering characters in a particular typeface. A coded representation can also be referred to as a character-based representation.
A noncoded representation of text is a primitive representation in which the text is represented as an image, not as recognized characters. A noncoded representation of text may include an array of picture elements (“pixels”), such as a bitmap. In a bitmap, each pixel is represented by one binary digit or bit in a raster. A pixel map (or “pixmap”) is a raster representation in which each pixel is represented by more than one bit.
Digitization of an image generally results in a primitive representation, typically a bitmap or pixel map. If the image contains text, the primitive representation can be interpreted and converted to a higher-level coded format such as ASCII through use of an optical character recognition (OCR) program. A confidence-based recognition system, such as the one described in commonly-owned U.S. Pat. No. 5,729,637 (the '637 patent), which is incorporated by reference herein, processes an image of text, recognizes portions of the image as characters, words and/or other units, and generates coded representations for any recognized units in the image. Some units may be recognized only with a low level of confidence or not recognized at all. When the image is displayed, low-confidence units may be displayed in their original image form, while those recognized with sufficiently high confidence are displayed as rendered bitmaps derived from their coded representations.
A digital representation of an image including both coded and noncoded units can be displayed on a raster output device such as a computer display or printer. This type of display, i.e., one containing both portions of the original bitmap or pixel map and rendered bitmaps, will be referred to as a hybrid display. The coded units are rendered (i.e., rasterized), which may be accomplished in a variety of ways, such as by retrieving an output bitmap stored in memory for a code or by computing an output bitmap according to a vector description associated with a code. The result will be referred to as rasterized text. The noncoded units are displayed in their original image form, which will be referred to as a text pixmap. Typically, whole words are represented either as rasterized text or as a text pixmap for display on raster output devices.
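By way of illustration only, the selection between rasterized text and a text pixmap for each lexical unit in a hybrid display might be sketched as follows. The names `LexicalUnit`, `display_form`, and `plan_hybrid_display`, the `pixmap_id` handle, and the threshold value of 0.9 are assumptions introduced for this sketch; they are not taken from the '637 patent or from any particular recognition system.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical cutoff: the text above does not specify a confidence threshold.
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class LexicalUnit:
    pixmap_id: str          # hypothetical handle to the unit's original image region
    text: Optional[str]     # coded representation, or None if the unit was not recognized
    confidence: float       # recognition confidence, assumed to lie in [0, 1]


def display_form(unit: LexicalUnit, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Choose how a unit is shown in the hybrid display."""
    if unit.text is not None and unit.confidence >= threshold:
        return "rasterized_text"   # render from the coded representation
    return "text_pixmap"           # show the original image form


def plan_hybrid_display(units: List[LexicalUnit]) -> List[Tuple[str, str]]:
    """Return (form, payload) pairs: the text to render or the pixmap to show."""
    plan = []
    for unit in units:
        form = display_form(unit)
        payload = unit.text if form == "rasterized_text" else unit.pixmap_id
        plan.append((form, payload))
    return plan


units = [
    LexicalUnit("img0", "hello", 0.98),
    LexicalUnit("img1", "w0rld", 0.41),
    LexicalUnit("img2", None, 0.0),
]
print(plan_hybrid_display(units))
```

In this sketch a whole unit is assigned one form or the other, consistent with the observation that whole words are typically represented either as rasterized text or as a text pixmap.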
Text pixmaps often exhibit color variation effects that result from improper color registration during digitization. This color variation may appear as “edge effects” (fringes or halos around the edges of characters when a text pixmap is displayed), as shown in FIG. 3A, and may not be aesthetically pleasing. Such color variation effects may be especially noticeable when a text pixmap is displayed with rasterized text, which typically does not exhibit such effects.
Text pixmaps may also exhibit a “ghosting” effect when displayed, resulting from the local background on which a text pixmap is typically displayed, as shown in FIG. 4A. When a text pixmap is to be displayed against a global background, such as where the pixmap is to be displayed with rasterized text in a hybrid display, this local background may not match the global background of the hybrid display on which the text pixmap and rasterized text are to be rendered. For example, the global background of a hybrid display may be assigned a color that is uniform over the entire global background. By contrast, the color of the local background of the text pixmap may vary over the pixmap. When the text pixmap is displayed against the global background of the hybrid display, ghosting may appear as a result of the color mismatch between the local background of the pixmap and the global background. As shown in FIG. 4A, this ghosting effect can be quite noticeable and can be aesthetically unpleasant.
In general, in one aspect, the invention provides a method and apparatus, including computer program apparatus, implementing techniques for processing an image including one or more lexical units each representing a unit of text, in which each lexical unit is defined by a number of image pixels and a number of background pixels. The techniques include generating at least one text mask distinguishing between at least the text pixels and the background pixels of at least one lexical unit in the image and storing the text mask in an electronic document representing the image.
Implementations of the invention include one or more of the following features. The text mask can include a raster representation of the pixels of at least one lexical unit, including a first set of pixels corresponding to the text pixels of the lexical unit and a second set of pixels corresponding to at least the background pixels of the lexical unit. Generating the text mask can include assigning the first set of pixels in the raster representation a first pixel value and assigning the second set of pixels in the raster representation a second pixel value. Generating the text mask can include reversing the pixel value of each pixel in the raster representation. Generating the text mask can include assigning an intermediate pixel value to pixels in the raster representation at a boundary between the first and second set of pixels in the raster representation. The text mask can include a vector representation of the text. If the image includes more than one page, a separate text mask may be stored for each page of the image. If the image includes lexical units representing units of text in more than one color, a separate text mask may be stored for the lexical units representing units of text in each color. If the image includes more than one lexical unit, a separate text mask may be stored for each lexical unit. The text mask can be stored in a hybrid data structure. The electronic document can include a page description language representation of at least one lexical unit in the image. Generating the text mask can include identifying the text pixels in an OCR process.
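As a non-limiting sketch of several of the features listed above, a raster text mask might be generated from a grayscale pixmap as shown below. The fixed darkness threshold, the mask values 1 (text) and 0 (background), the intermediate value 0.5, and the choice of a 4-neighbor test to locate boundary pixels are all assumptions made for this example; the summary above does not prescribe how text pixels are identified (for instance, they may instead be identified in an OCR process).

```python
from typing import List

Gray = List[List[int]]      # rows of 8-bit gray levels: 0 = black, 255 = white
Mask = List[List[float]]    # rows of mask values


def make_text_mask(pixmap: Gray, threshold: int = 128) -> List[List[int]]:
    """Assign a first value (1) to text pixels and a second value (0) to the rest.

    Assumption: dark pixels (below the threshold) are text pixels.
    """
    return [[1 if p < threshold else 0 for p in row] for row in pixmap]


def reverse_mask(mask: List[List[int]]) -> List[List[int]]:
    """Reverse the pixel value of each pixel in a binary mask."""
    return [[1 - m for m in row] for row in mask]


def soften_boundary(mask: List[List[int]], intermediate: float = 0.5) -> Mask:
    """Give background pixels that touch a text pixel an intermediate value."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    out = [[float(m) for m in row] for row in mask]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 0:
                continue  # only background pixels are candidates
            neighbors = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if any(
                0 <= ny < height and 0 <= nx < width and mask[ny][nx] == 1
                for ny, nx in neighbors
            ):
                out[y][x] = intermediate
    return out


pixmap = [
    [250, 240, 245],
    [235, 20, 250],
    [245, 250, 240],
]
mask = make_text_mask(pixmap)
print(mask)
print(reverse_mask(mask))
print(soften_boundary(mask))
```

Intermediate values at the boundary act as partial coverage, which may soften the stair-stepped edge of a strictly two-valued mask when the text is later rendered.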
In general, in another aspect, the invention provides a method and apparatus, including computer program apparatus, implementing techniques for processing an image including one or more lexical units each representing a unit of text, each lexical unit being defined by a plurality of image pixels and a plurality of background pixels. The techniques include generating at least one text mask distinguishing between at least the text pixels and the background pixels of at least one lexical unit in the image and using the text mask to generate a representation of the lexical unit for display on an output device.
Implementations of the invention include one or more of the following features. The background pixels can represent a local background, and using the text mask can include generating a representation of the lexical unit for displaying the unit of text without displaying the local background. The text mask can include a raster representation of the pixels of at least one lexical unit, including a first set of pixels corresponding to the text pixels of the lexical unit and a second set of pixels corresponding to at least the background pixels of the lexical unit. If the image includes more than one page, a separate text mask may be used to generate representations of one or more lexical units on each page of the image for display. If the image includes lexical units representing units of text in more than one color, a separate text mask may be used to generate representations for display of the lexical units representing units of text in each color. If the image includes more than one lexical unit, a separate text mask may be used to generate representations for display of each lexical unit. Using the text mask to generate a representation of the lexical unit for display can include assigning each of the first set of pixels a pixel value corresponding to a text color, and can also include assigning each of the second set of pixels a pixel value corresponding to a background color. Using the text mask to generate a representation of the lexical unit for display can include rendering the text on a global background. The image can include a background defined by a number of image background pixels and the global background can have a background color that is a function of one or more pixel values of the image background pixels, such as an average of the pixel values of the image background pixels.
The text mask can be used to generate a representation of the lexical unit for display on a raster-output device, such as a printer, a raster scan display, or a digital typesetter. The text mask can be stored, such as in a hybrid data structure. The electronic document can include a page description language representation of at least one lexical unit in the image.
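For illustration, the use of a text mask to generate a displayable representation of a lexical unit might be sketched as follows. The function names, the RGB tuple representation, and the per-channel integer mean are assumptions of this example; averaging is only one of the functions of the image background pixel values contemplated above.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def average_color(pixels: Sequence[RGB]) -> RGB:
    """Per-channel integer mean of the given pixels (one possible background function)."""
    if not pixels:
        raise ValueError("at least one background pixel is required")
    n = len(pixels)
    return (
        sum(p[0] for p in pixels) // n,
        sum(p[1] for p in pixels) // n,
        sum(p[2] for p in pixels) // n,
    )


def render_with_mask(
    mask: List[List[int]], text_color: RGB, background_color: RGB
) -> List[List[RGB]]:
    """Assign the text color to text pixels (mask value 1) and the background color elsewhere.

    The unit's original local background is never copied to the output,
    so it cannot clash with the global background.
    """
    return [
        [text_color if m == 1 else background_color for m in row]
        for row in mask
    ]


# Hypothetical sample data: two image background pixels and a 2x2 mask.
image_background_pixels = [(250, 250, 250), (240, 244, 248)]
global_background = average_color(image_background_pixels)
mask = [[0, 1], [1, 0]]
print(render_with_mask(mask, (0, 0, 0), global_background))
```

Because every text pixel receives the same text color and every other pixel receives the global background color, a rendering of this kind carries over neither the color fringes nor the local background of the original text pixmap.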
In general, in another aspect, the invention provides a method and apparatus, including computer program apparatus, implementing techniques for processing an image including one or more lexical units each representing a unit of text, each lexical unit being defined by a plurality of image pixels and a plurality of background pixels. The techniques include receiving an electronic document representing the image, including at least one text mask distinguishing between at least the text pixels and the background pixels of at least one lexical unit in the image and using the text mask to generate a representation of the lexical unit for display on an output device.
In general, in another aspect, the invention provides an electronic document representing an image including one or more lexical units each representing a unit of text, each lexical unit being defined by a plurality of image pixels and a plurality of background pixels. The electronic document includes a representation of the image in a page description language including a coded representation of at least one recognized lexical unit in the image and at least one text mask distinguishing between at least the text pixels and the background pixels of at least one unrecognized lexical unit in the image. In particular implementations, the page description language can be PDF, PostScript, RTF, HTML, SGML, XML or the like.
In general, in another aspect, the invention provides a method and apparatus, including computer program apparatus, implementing techniques for processing an image including one or more lexical units each representing a unit of text, in which each lexical unit is defined by a plurality of image pixels and a plurality of background pixels and the image includes one or more recognized lexical units representing a recognized unit of text and one or more unrecognized lexical units representing an unrecognized unit of text. The techniques include generating at least one text mask distinguishing between at least the text pixels and the background pixels of at least one unrecognized lexical unit in the image and generating an electronic document representing the image. The electronic document includes a recognized text description representing the recognized lexical units and an unrecognized text description derived from the at least one text mask. In particular implementations of the invention, the electronic document can include a representation of the image in a page description language. The page description language can be PDF, PostScript, RTF, HTML, SGML, XML or the like.
Advantages that can be seen in implementations of the invention include one or more of the following. The use of a text mask to render text with a uniform text color can reduce the occurrence of color variation effects including edge effects. The use of a text mask to mask off a local text background can reduce the occurrence of ghosting effects. Using a text mask to represent text can require less storage space than using a pixel map.
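The storage advantage can be illustrated with simple arithmetic. The sketch below assumes uncompressed rasters with rows padded to whole bytes, a 24-bit pixel map, a one-bit-per-pixel text mask, and three additional bytes for a single text color; the 200 by 40 pixel word size is hypothetical, and actual savings would depend on the compression and encoding used.

```python
def pixmap_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Uncompressed size of a pixel map, with each row padded to a whole byte."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


def mask_bytes(width: int, height: int, color_bytes: int = 3) -> int:
    """Uncompressed size of a 1-bit text mask plus one stored text color."""
    row_bytes = (width + 7) // 8
    return row_bytes * height + color_bytes


# Hypothetical word image, 200 pixels wide by 40 pixels high.
width, height = 200, 40
print(pixmap_bytes(width, height))  # 24000 bytes
print(mask_bytes(width, height))    # 1003 bytes
```

Under these assumptions the mask occupies roughly one twenty-fourth of the space of the pixel map for the same lexical unit.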
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features and advantages of the invention will become apparent from the description, the drawings, and the claims.