The present invention relates generally to the display of digitally stored and/or processed images, and more particularly to a method and apparatus for displaying images on raster display devices such as laser printers and computer monitors.
Digital images can be efficiently stored, edited, printed, reproduced, and otherwise manipulated. It is therefore often desirable to convert an image, such as one printed on a piece of paper, into a digital representation of the image by a process known as digitization. Digital representations of an image can be primitive and non-coded (e.g., an array of picture elements or "pixels") or may contain higher level descriptive coded information (e.g., ASCII character codes) from which a primitive representation may be generated. Generally, high level coded digital representations are more compact than primitive non-coded ones.
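The difference in compactness can be illustrated with rough arithmetic. The following Python sketch is illustrative only: the page size, the 300 dpi resolution, the one-bit-per-pixel depth, and the sample text are assumptions chosen for the example and are not taken from this description. It compares the storage needed for an uncompressed primitive representation of a page with the storage needed for the same text held as ASCII character codes.

```python
def bitmap_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Bytes needed for an uncompressed raster page (assumed parameters)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


def coded_bytes(text):
    """Bytes needed to hold the same text as ASCII character codes."""
    return len(text.encode("ascii"))


# Hypothetical page: 60 repetitions of a 45-character sentence.
page_text = "The quick brown fox jumps over the lazy dog. " * 60

primitive = bitmap_bytes(8.5, 11, 300)  # letter-size page, 1 bit per pixel
coded = coded_bytes(page_text)

print(f"primitive (non-coded): {primitive} bytes")
print(f"coded (ASCII):         {coded} bytes")
```

Under these assumptions the primitive representation is several hundred times larger than the coded one, although the coded form carries no font, size, or layout information unless such information is stored alongside it.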
Optical character recognition (OCR) encompasses digitization and a method for transforming text in bitmap representation to a high level coded representation, such as ASCII character codes. In OCR digitization, text characters on a printed surface such as a sheet of paper are typically scanned by an optical scanner, which creates a bitmap of the pixels of the image. A pixel is a fundamental picture element of an image, and a bitmap is a data structure including information concerning each pixel of the image. Bitmaps, if they contain more than on/off information, are often referred to as "pixel maps."
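The distinction between a pixel map and an on/off bitmap can be sketched in a few lines of Python. The sketch below is a minimal illustration under stated assumptions: the gray values, the cutoff of 128, the convention that a dark pixel is "on" (1), and the function names threshold and pack_row are all hypothetical and are not part of this description.

```python
def threshold(pixel_map, cutoff=128):
    """Reduce a grayscale pixel map to an on/off bitmap.

    Assumes 0 is black and 255 is white, and treats dark pixels as "on".
    """
    return [[1 if value < cutoff else 0 for value in row] for row in pixel_map]


def pack_row(bits):
    """Pack a row of 0/1 values into bytes, most significant bit first."""
    packed = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


# A tiny 5 x 3 pixel map: a dark horizontal bar above a dark vertical stem.
pixel_map = [
    [10, 20, 15, 12, 30],
    [240, 250, 25, 245, 255],
    [250, 245, 18, 250, 240],
]

bitmap = threshold(pixel_map)
packed_rows = [pack_row(row) for row in bitmap]

for row in bitmap:
    print("".join("#" if bit else "." for bit in row))
```

Each pixel of the pixel map holds one of many gray levels, whereas each pixel of the thresholded bitmap holds a single bit, so a row of up to eight pixels packs into one byte.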
Other types of processes can also digitize real-world images. Devices such as digital cameras can be used to directly create bitmaps corresponding to a captured image. A computer system can recreate the image from the bitmap and display it on a computer display or send the bitmap to a printer to be printed. Bitmap generators can be used to convert other types of image-related inputs into bitmaps which can be manipulated and displayed. Incoming facsimile (fax) data includes low-resolution bitmaps that can be manipulated, recognized, printed, etc.
Once a bitmap is input to a computer, the computer can perform recognition on the bitmap so that each portion or object of the input bitmap, such as a character or other lexical unit of text, is recognized and converted into a code in a desired format. The recognized characters or other objects can then be displayed, edited, or otherwise manipulated using an application software program running on the computer.
There are several ways to display a recognized, coded object. A raster output device, such as a laser printer or computer monitor, typically requires a bitmap of the coded object which can be inserted into a pixel map for display on a printer or display screen. A raster output device creates an image by displaying an array of pixels arranged in rows and columns from the pixel map. One way to provide the bitmap of the coded object is to store an output bitmap in memory for each possible code. For example, for codes that represent characters in fonts, a bitmap can be associated with each character in the font and for each size of the font that might be needed. The character codes and font size are used to access the bitmaps. However, this method is very inefficient in that it tends to require a large amount of peripheral and main storage. Another method is to use a "character outline" associated with each character code and to render a bitmap of a character from the character outline and other character information, such as size. The character outline can specify the shape of the character and requires much less memory storage space than the multitude of bitmaps representing many sizes. A commonly-used language to render bitmaps from character outlines is the PostScript® language by Adobe Systems, Inc. of Mountain View, Calif. Character outlines can be described in standard formats, such as the Type 1® format by Adobe Systems, Inc.
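The two approaches can be contrasted in a short Python sketch. It is a simplified illustration and not the PostScript or Type 1 rendering process: the outline is assumed to be a straight-sided polygon in a unit square rather than a set of curves, the glyph is filled by an even-odd test at pixel centers, and the names OUTLINES, inside, render_outline, and get_glyph are hypothetical. A dictionary keyed by character code and size stands in for the stored-bitmap method, and a single outline per character serves every size.

```python
# One outline per character code, in unit-square coordinates (assumed data).
OUTLINES = {
    "I": [(0.25, 0.0), (0.75, 0.0), (0.75, 1.0), (0.25, 1.0)],
}

# Stored-bitmap method: one entry per (character code, size) pair.
glyph_cache = {}


def inside(polygon, x, y):
    """Even-odd test: is the point (x, y) inside the closed polygon?"""
    hit = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def render_outline(outline, size):
    """Render a size x size bitmap by sampling the outline at pixel centers."""
    bitmap = []
    for row in range(size):
        y = (row + 0.5) / size
        bitmap.append(
            [1 if inside(outline, (col + 0.5) / size, y) else 0
             for col in range(size)]
        )
    return bitmap


def get_glyph(code, size):
    """Return a bitmap for the code, rendering from the outline on a miss."""
    key = (code, size)
    if key not in glyph_cache:
        glyph_cache[key] = render_outline(OUTLINES[code], size)
    return glyph_cache[key]


for size in (4, 8):
    print(f"size {size}:")
    for row in get_glyph("I", size):
        print("".join("#" if bit else "." for bit in row))
```

In this sketch the stored-bitmap table grows with the number of characters multiplied by the number of sizes, while the outline table grows only with the number of characters, which is the storage advantage described above.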
OCR processes are limited by, among other things, the accuracy of the digitized image provided to the computer system. The digitizing device (such as a scanner) may distort or add noise to the bitmap that it creates. In addition, OCR processes do not perfectly recognize bitmap images, particularly if they are of low resolution or are otherwise of low quality. For example, a recognizer might misread ambiguous characters, characters that are spaced too closely together, or characters of a font for which it had no information.
Imperfect recognition can present problems both at the time of editing a recognized image and when printing or displaying the image. Misrecognized images may be printed incorrectly, and images that are not recognized at all may not be printed at all, or may be printed as some arbitrary error image. This reduces the value of the OCR process, since the recognized document may require substantial editing.