The present invention relates generally to the display of digitally stored and/or processed images, and more particularly to a method and apparatus for displaying images on raster display devices such as laser printers and computer monitors.
Digital images can be efficiently stored, edited, printed, reproduced, and otherwise manipulated. It is therefore often desirable to convert an image, such as on a piece of paper, into a digital representation of the image by a process known as digitization. Digital representations of an image can be primitive and non-coded (e.g., an array of picture elements or “pixels”) or may contain higher level descriptive coded information (e.g., ASCII character codes) from which a primitive representation may be generated. Generally, high level coded digital representations are more compact than primitive non-coded ones.
Optical character recognition (OCR) encompasses digitization and a method for transforming text in bitmap representation to a high level coded representation, such as ASCII character codes. In OCR digitization, text characters on a printed surface such as a sheet of paper are typically scanned by an optical scanner, which creates a bitmap of the pixels of the image. A pixel is a fundamental picture element of an image, and a bitmap is a data structure including information concerning each pixel of the image. Bitmaps, if they contain more than on/off information, are often referred to as “pixel maps.”
Other types of processes can also digitize real-world images. Devices such as digital cameras can be used to directly create bitmaps corresponding to a captured image. A computer system can recreate the image from the bitmap and display it on a computer display or send the bitmap to a printer to be printed. Bitmap generators can be used to convert other types of image-related inputs into bitmaps which can be manipulated and displayed. Incoming facsimile (fax) data includes low-resolution bitmaps that can be manipulated, recognized, printed, etc.
Once a bitmap is input to a computer, the computer can perform recognition on the bitmap so that each portion or object of the input bitmap, such as a character or other lexical unit of text, is recognized and converted into a code in a desired format. The recognized characters or other objects can then be displayed, edited, or otherwise manipulated using an application software program running on the computer.
There are several ways to display a recognized, coded object. A raster output device, such as a laser printer or computer monitor, typically requires a bitmap of the coded object which can be inserted into a pixel map for display on a printer or display screen. A raster output device creates an image by displaying an array of pixels arranged in rows and columns from the pixel map. One way to provide the bitmap of the coded object is to store an output bitmap in memory for each possible code. For example, for codes that represent characters in fonts, a bitmap can be associated with each character in the font and for each size of the font that might be needed. The character codes and font size are used to access the bitmaps. However, this method is very inefficient in that it tends to require a large amount of peripheral and main storage. Another method is to use a “character outline” associated with each character code and to render a bitmap of a character from the character outline and other character information, such as size. The character outline can specify the shape of the character and requires much less memory storage space than the multitude of bitmaps representing many sizes. A commonly-used language to render bitmaps from character outlines is the PostScript® language by Adobe Systems, Inc. of Mountain View, Calif. Character outlines can be described in standard formats, such as the Type 1® format by Adobe Systems, Inc.
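The storage cost of the bitmap-per-size approach can be illustrated with a rough estimate. The following is a minimal sketch only: the function names, the one-bit-per-pixel packing, and the square character cell whose side equals the point size are all assumptions made for illustration and are not taken from the specification.

```python
def bitmap_bytes(width_px: int, height_px: int) -> int:
    """Bytes needed for a 1-bit-per-pixel bitmap with byte-aligned rows."""
    bytes_per_row = (width_px + 7) // 8
    return bytes_per_row * height_px


def font_bitmap_bytes(num_chars: int, point_sizes: list, dpi: int) -> int:
    """Total bytes to pre-store one bitmap per character per point size.

    Assumption (hypothetical): each character occupies a square cell whose
    side equals the point size converted to pixels (72 points per inch).
    """
    total = 0
    for pt in point_sizes:
        side = round(pt * dpi / 72)
        total += num_chars * bitmap_bytes(side, side)
    return total


if __name__ == "__main__":
    # Illustrative inputs only: 256 characters, five sizes, 300 dpi device.
    print(font_bitmap_bytes(256, [8, 10, 12, 18, 24], 300))
```

Under these assumptions the total grows with every additional size and resolution, whereas a single outline per character serves all sizes.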
OCR processes are limited by, among other things, the accuracy of the digitized image provided to the computer system. The digitizing device (such as a scanner) may distort or add noise to the bitmap that it creates. In addition, OCR processes do not perfectly recognize bitmap images, particularly if they are of low resolution or are otherwise of low quality. For example, a recognizer might misread ambiguous characters, characters that are spaced too closely together, or characters of a font for which it had no information.
Imperfect recognition can present problems both at the time of editing a recognized image and when printing or displaying the image. Misrecognized images may be printed incorrectly, and images that are not recognized at all may not be printed at all, or may be printed as some arbitrary error image. This reduces the value of the OCR process, since the recognized document may require substantial editing.
The present invention provides a method and apparatus for creating a hybrid data structure describing recognized and unrecognized objects. The invention is applicable to recognizing text or other objects from a bitmap provided by an optical scanner or other bitmap generator. Objects that are not recognized by the recognizer are stored and displayed using a portion of the original bitmap so that an apparently perfect recognized document is displayed.
The apparatus of the present invention includes a system for producing a raster image derived from a hybrid data structure including coded and non-coded portions from an input bitmap. The system includes a data processing apparatus and a recognizer for performing recognition on an input bitmap to detect identifiable objects within the bitmap. The system creates a hybrid data structure including coded portions derived from the identifiable objects. The hybrid data structure also includes non-coded portions derived from portions of the bitmap which do not correspond to the identifiable objects (non-identifiable objects). Finally, an output device, such as a printer, a plotter, or a computer display, develops a visually perceptible raster image derived from the hybrid data structure. The raster image includes newly-rendered raster images of the identifiable objects and scaled raster images of the non-identifiable objects. An input device, such as an optical scanner, a digital camera, or a bitmap generator, can be included to provide the input bitmap to the data processing apparatus.
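The treatment of non-identifiable objects at output time can be sketched as follows. This is an illustrative sketch, not the disclosed implementation: it assumes bitmaps are lists of rows of 0/1 pixels, that the output resolution is an integer multiple of the input resolution, and that nearest-neighbor replication is an acceptable scaling method. The names scale_bitmap and paste are hypothetical.

```python
from typing import List

Bitmap = List[List[int]]  # rows of 0/1 pixels (assumed representation)


def scale_bitmap(bitmap: Bitmap, factor: int) -> Bitmap:
    """Scale a bitmap by an integer factor using pixel replication."""
    scaled: Bitmap = []
    for row in bitmap:
        wide = [px for px in row for _ in range(factor)]
        # Repeat each widened row 'factor' times, copying to avoid aliasing.
        scaled.extend(list(wide) for _ in range(factor))
    return scaled


def paste(page: Bitmap, bitmap: Bitmap, top: int, left: int) -> None:
    """Copy a bitmap into the page pixel map, clipping at the page edges."""
    for r, row in enumerate(bitmap):
        for c, px in enumerate(row):
            y, x = top + r, left + c
            if 0 <= y < len(page) and 0 <= x < len(page[0]):
                page[y][x] = px


if __name__ == "__main__":
    page = [[0] * 8 for _ in range(4)]
    unrecognized = [[1, 0, 1]]  # portion of the original input bitmap
    paste(page, scale_bitmap(unrecognized, 2), top=1, left=1)
    for line in page:
        print("".join("#" if px else "." for px in line))
```

Identifiable objects would instead be rendered afresh from their codes (for example, from character outlines) and placed into the same page pixel map.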
The system preferably performs geometric correction to the input bitmap, which includes creating a distortion map of the bitmap and creating a layout correction transform from the distortion map and the bitmap. The identifiable objects of the hybrid data structure preferably include codes for recognized lexical units such as characters and words comprising the characters. The non-identifiable objects preferably correspond to unrecognized words which fall below a recognition threshold confidence level. Non-coded data is added to the hybrid data structure for the non-identifiable objects. The recognizer compares each of the identifiable objects with the portion of the input bitmap corresponding to the identifiable object to make size adjustments to the identifiable object if appropriate. The system preferably measures font attributes of the lexical units and assigns a typeface to each of the lexical units.
The present invention further includes a method for producing a hybrid data structure from a bitmap of an image. The bitmap includes identifiable objects and non-identifiable objects. The method, implemented on a digital processor, inputs a signal including a bitmap of an image and partitions the bitmap into a hierarchical structure of lexical units. Labels are assigned to a label list for each lexical unit of a predetermined hierarchical level, where each label in the label list has an associated confidence level. If a label in the label list for a lexical unit has a confidence level greater than a threshold confidence level, then that lexical unit is considered identifiable and is stored in a hybrid data structure as coded data. If no label in the lexical unit's label list has a confidence level greater than the threshold confidence level, then the lexical unit is considered non-identifiable and is stored as non-coded data. A non-identifiable object is preferably stored as a bitmap together with a location at which to display the bitmap. The predetermined hierarchical levels preferably include a character hierarchical level and a word hierarchical level, and a lexicon is searched to determine if a label is a valid label.
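The thresholding step of this method can be sketched as follows. The sketch is illustrative only: the class names, the 0.0 to 1.0 confidence scale, the default threshold of 0.9, and the choice to keep the label list alongside non-coded data are assumptions, not details given in the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Bitmap = List[List[int]]  # rows of 0/1 pixels (assumed representation)


@dataclass
class Label:
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


@dataclass
class CodedUnit:
    text: str                  # code chosen for the identifiable object
    labels: List[Label]
    position: Tuple[int, int]


@dataclass
class NonCodedUnit:
    bitmap: Bitmap             # portion of the original input bitmap
    labels: List[Label]        # kept here as an assumption, e.g. for searching
    position: Tuple[int, int]  # location at which to display the bitmap


def classify_unit(labels: List[Label], bitmap: Bitmap,
                  position: Tuple[int, int],
                  threshold: float = 0.9) -> Union[CodedUnit, NonCodedUnit]:
    """Store a lexical unit as coded data if any label exceeds the threshold."""
    best = max(labels, key=lambda lab: lab.confidence, default=None)
    if best is not None and best.confidence > threshold:
        return CodedUnit(best.text, labels, position)
    return NonCodedUnit(bitmap, labels, position)


if __name__ == "__main__":
    word = classify_unit([Label("cat", 0.95), Label("cot", 0.40)], [[1]], (0, 0))
    blob = classify_unit([Label("c1t", 0.50)], [[1, 0]], (3, 4))
    print(type(word).__name__, type(blob).__name__)
```

A lexicon check, as mentioned above, could be applied before the comparison to discard labels that are not valid words; it is omitted here for brevity.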
In yet another aspect of the present invention, a system for producing and manipulating a hybrid data structure includes a recognizer operating in a data processing apparatus that detects identifiable objects within the input bitmap. An analyzer creates and stores a hybrid data structure in memory of the data processing apparatus, where the data structure includes coded data derived from the identifiable objects and non-coded data derived from bitmap portions which do not correspond to the identifiable objects. Non-identifiable objects associated with the non-coded data are also stored in the hybrid data structure. A display device, such as a screen, develops and displays an image derived from the hybrid data structure. A display manager implemented on the data processing apparatus manipulates the image on the screen. The display manager includes an editor which permits the hybrid data structure and, thus, the image to be edited. The editor displays the coded and non-coded data and can be used to change a non-identified object into an identified object. The display manager also preferably includes a finder which searches the hybrid data structure for a specified object by searching the hybrid data structure for a label in the label list of each lexical unit that approximately corresponds to a search word or phrase.
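The approximate search performed by the finder might look like the following sketch. It is one possible interpretation rather than the disclosed method: it assumes each lexical unit is represented simply by its list of candidate label strings, and it substitutes the similarity ratio from Python's standard difflib module for whatever matching criterion the finder actually uses. The function name find_units and the 0.75 cutoff are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def find_units(label_lists: List[List[str]], query: str,
               min_ratio: float = 0.75) -> List[int]:
    """Return indices of lexical units having a label that roughly matches query.

    Every label in a unit's label list is considered, so a unit can be found
    even when its correct reading was not the top-ranked label.
    """
    target = query.lower()
    hits = []
    for index, labels in enumerate(label_lists):
        for label in labels:
            ratio = SequenceMatcher(None, target, label.lower()).ratio()
            if ratio >= min_ratio:
                hits.append(index)
                break  # one matching label is enough for this unit
    return hits


if __name__ == "__main__":
    units = [
        ["patcnt", "patent"],  # low-confidence alternatives for one word
        ["image"],
    ]
    print(find_units(units, "patent"))
```

Because the search runs over label lists rather than over displayed text, it can also locate words that are displayed as original bitmap portions.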
An advantage of the present invention is that unrecognized images within a body of recognized images are displayed as original bitmap portions instead of as misrecognized images or as error images. This allows a user to display a recognized image which appears to be virtually identical to the source image, yet store much of the information in a coded form.
Another advantage of this invention is that a hybrid data structure of codes for recognized images and bitmaps for unrecognized images is produced that can be searched, edited, manipulated, and displayed.
These and other advantages of the present invention will become apparent to those skilled in the art upon a reading of the following specification of the invention and a study of the several figures of the drawing.