The present invention relates generally to the display of digitally stored and/or processed images, and more particularly to a method and apparatus for displaying images on raster display devices such as laser printers and computer monitors.
Digital images can be efficiently stored, edited, printed, reproduced, and otherwise manipulated. It is therefore often desirable to convert an image, such as one on a piece of paper, into a digital representation of the image by a process known as digitization. Digital representations of an image can be primitive and non-coded (e.g., an array of picture elements or “pixels”) or may contain higher level descriptive coded information (e.g., ASCII character codes) from which a primitive representation may be generated. Generally, high level coded digital representations are more compact than primitive non-coded ones.
Optical character recognition (OCR) encompasses digitization and a method for transforming text in bitmap representation to a high level coded representation, such as ASCII character codes. In OCR digitization, text characters on a printed surface such as a sheet of paper are typically scanned by an optical scanner, which creates a bitmap of the pixels of the image. A pixel is a fundamental picture element of an image, and a bitmap is a data structure including information concerning each pixel of the image. Bitmaps, if they contain more than on/off information, are often referred to as “pixel maps.”
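The difference in compactness can be illustrated with a rough calculation. The following sketch is only an illustration: the page size, the 300 dpi resolution, the one-bit-per-pixel depth, and the figure of 3000 characters per page are assumptions chosen for the example and do not come from the specification.

```python
def bitmap_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Storage needed for an uncompressed, non-coded bitmap of a page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

# Assumed values: a letter-size page scanned at 300 dpi, 1 bit per pixel.
page_bitmap = bitmap_bytes(8.5, 11, 300)

# Assumed value: about 3000 characters of text, one byte per ASCII code.
page_ascii = 3000

print(page_bitmap, "bytes as a bitmap versus", page_ascii, "bytes as ASCII codes")
```

Under these assumptions the coded form is several hundred times smaller than the uncompressed bitmap, although compression would narrow the gap.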
Other types of processes can also digitize real-world images. Devices such as digital cameras can be used to directly create bitmaps corresponding to a captured image. A computer system can recreate the image from the bitmap and display it on a computer display or send the bitmap to a printer to be printed. Bitmap generators can be used to convert other types of image-related inputs into bitmaps which can be manipulated and displayed. Incoming facsimile (fax) data includes low-resolution bitmaps that can be manipulated, recognized, printed, etc.
Once a bitmap is input to a computer, the computer can perform recognition on the bitmap so that each portion or object of the input bitmap, such as a character or other lexical unit of text, is recognized and converted into a code in a desired format. The recognized characters or other objects can then be displayed, edited, or otherwise manipulated from the coded data using an application software program running on the computer.
There are several ways to display a recognized, coded object. A raster output device, such as a laser printer or computer monitor, typically requires a bitmap of the coded object which can be inserted into a pixel map for display on a printer or display screen. A raster output device creates an image by displaying an array of pixels arranged in rows and columns from the pixel map. A bitmap of a coded object can be provided by retrieving an output bitmap stored in memory for the code, where each possible code has an associated stored bitmap. For example, for codes that represent characters in fonts, a bitmap can be associated with each character in the font and for each size of the font that might be needed. The character codes and font size are used to access the bitmaps. Another, more efficient, method is to use a “character outline” associated with each character code and to render a bitmap of a character from the character outline and other character information, such as size. A commonly-used language to render bitmaps from character outlines is the PostScript® language by Adobe Systems, Inc. of Mountain View, Calif. Character outlines can be described in standard formats, such as the Type 1® format by Adobe Systems, Inc.
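The stored-bitmap approach described above can be sketched as a table lookup keyed by character code and font size. This is a minimal illustration only: the names GLYPHS, blit, and render, the tiny 3-by-5 glyphs, and the use of strings to stand for pixels are all hypothetical and are not taken from the specification.

```python
# Hypothetical glyph store: (character code, size) -> stored output bitmap.
# Each bitmap is a list of rows; '#' is a pixel that is on, '.' is off.
GLYPHS = {
    ("I", 5): ["###", ".#.", ".#.", ".#.", "###"],
    ("L", 5): ["#..", "#..", "#..", "#..", "###"],
}

def blit(page, glyph, top, left):
    """Copy a glyph bitmap into the page pixel map at (top, left)."""
    for r, row in enumerate(glyph):
        for c, px in enumerate(row):
            page[top + r][left + c] = px

def render(text, size, width):
    """Build a pixel map by retrieving one stored bitmap per character code."""
    page = [["."] * width for _ in range(size)]
    left = 0
    for ch in text:
        glyph = GLYPHS[(ch, size)]   # access by character code and font size
        blit(page, glyph, 0, left)
        left += len(glyph[0]) + 1    # one blank column between characters
    return ["".join(row) for row in page]

pixel_map = render("LI", 5, 7)
print("\n".join(pixel_map))
```

A table of this kind needs one entry per character per size, which is why rendering from a single outline per character is described as the more efficient method.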
OCR processes are limited by, among other things, the accuracy of the digitized image provided to the computer system. The digitizing device (such as a scanner) may distort or add noise to the bitmap that it creates. In addition, OCR processes do not perfectly recognize bitmap images, particularly if they are of low resolution or are otherwise of low quality. For example, a recognizer might misread ambiguous characters, characters that are spaced too closely together, or characters of a font for which it had no information.
Imperfect recognition can present problems both at the time of editing a recognized image and when printing or displaying the image. Misrecognized images may be displayed incorrectly, and images that are not recognized at all may not be displayed at all, or may be displayed as some arbitrary error image. This reduces the value of the OCR process, since the recognized document may require substantial editing.
The present invention provides a method and apparatus for creating a data structure describing coded objects and non-coded objects. The invention is applicable to recognizing text or other objects from a bitmap provided by an optical scanner or other bitmap generator. Both the objects that the recognizer recognizes and the objects that it does not recognize are stored in the data structure. An apparently perfectly recognized document is provided by displaying the original bitmap, which is associated with the coded objects in the data structure.
The apparatus of the present invention includes a system for producing an image which includes a data processing apparatus and a recognizer for performing recognition on an input bitmap to detect objects within the bitmap. The recognizer creates coded portions for both identifiable and non-identifiable objects. The system creates a data structure including coded portions corresponding to the identifiable objects and links to portions of the input bitmap that correspond to the identifiable objects. Coded portions of non-identifiable objects, and links to corresponding bitmap portions, are also preferably included in the data structure. An output device, such as a printer, a plotter, or a computer display, develops a visually perceptible image derived from the input bitmap. The image portrays the identifiable objects and the non-identifiable objects in their original bitmap form, so that no inaccurate images caused by misrecognition are displayed. An input device, such as an optical scanner, a digital camera, or a bitmap generator, can be included to provide the input bitmap to the data processing apparatus.
The objects of the bitmap that the recognizer can detect preferably include lexical units such as characters and words. The non-identifiable objects preferably correspond to unrecognized words which fall below a recognition threshold confidence level. The system preferably performs geometric correction to the input bitmap, which includes creating a distortion map of the bitmap and creating a layout correction transform from the distortion map and the bitmap.
The present invention further includes a method for producing a data structure from a bitmap of an image. The method, implemented on a digital processor, inputs a signal including a bitmap of an image and partitions the bitmap into a hierarchical structure of lexical units. At least one coded object is assigned to each lexical unit of a predetermined hierarchical level, where each coded object has an associated confidence level. Finally, a coded object is stored in the data structure together with link data that links the coded object to its corresponding lexical unit. If a coded object has a confidence level greater than a threshold confidence level, then that coded object is considered identifiable. If no coded object for a lexical unit has a confidence level greater than the threshold confidence level, then the lexical unit is considered non-identifiable, and the coded object having the highest confidence level for that lexical unit is stored for it. The predetermined hierarchical levels preferably include a character hierarchical level and a word hierarchical level.
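The storing step and the threshold test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the CodedObject record, the store_unit function, the bounding box used as link data, the candidate lists, and the threshold value of 0.80 are all hypothetical and are not specified in the text.

```python
from dataclasses import dataclass

@dataclass
class CodedObject:
    text: str           # coded form chosen for the lexical unit
    confidence: float   # confidence level associated with that coded form
    bbox: tuple         # assumed link data: (left, top, right, bottom) in the input bitmap
    identifiable: bool  # True if the confidence exceeds the threshold

def store_unit(candidates, bbox, threshold):
    """Keep the highest-confidence candidate and link it to its bitmap region."""
    best_text, best_conf = max(candidates, key=lambda c: c[1])
    return CodedObject(best_text, best_conf, bbox, best_conf > threshold)

THRESHOLD = 0.80  # assumed value, for illustration only

data_structure = [
    # One candidate exceeds the threshold, so the unit is identifiable.
    store_unit([("the", 0.97), ("tbe", 0.40)], (10, 10, 40, 22), THRESHOLD),
    # No candidate exceeds the threshold, so the unit is non-identifiable,
    # but its best guess is still stored.
    store_unit([("quick", 0.55), ("quiet", 0.52)], (46, 10, 90, 22), THRESHOLD),
]

for obj in data_structure:
    print(obj)
```

Keeping the best guess for a non-identifiable unit means the unit remains available to later searching and editing even though it was not confidently recognized.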
In yet another aspect of the present invention, a system for producing and manipulating a data structure includes a recognizer operating in a data processing apparatus that detects lexical units within the input bitmap. An analyzer creates and stores a data structure in memory of the data processing apparatus. The data structure includes coded identifiable objects and coded non-identifiable objects corresponding to lexical units within the input bitmap. A display device, such as a screen, develops and displays an image of at least a portion of the data structure by displaying the input bitmap. A display manager implemented on the data processing apparatus manipulates the image on the screen. The display manager includes an editor which permits the data structure and, thus, the image to be edited. The editor displays the coded data as rendered images and can be used to change a non-identifiable object into an identifiable object. The display manager also preferably includes a finder which searches the coded objects of the data structure to find an exact or approximate match to a search word or phrase. Lexical units which correspond to matched coded objects are preferably highlighted if they are currently being displayed.
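The finder described above can be sketched as a search over the coded objects that returns the bitmap regions to highlight. The specification does not name a matching technique; this sketch substitutes the similarity ratio from Python's standard difflib module, and the word list, the bounding boxes, the find function, and the 0.8 cutoff are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical coded objects, each linked to a region of the input bitmap.
words = [
    {"text": "raster", "bbox": (0, 0, 50, 12)},
    {"text": "dispIay", "bbox": (56, 0, 110, 12)},   # "display" misread with a capital I
    {"text": "device", "bbox": (116, 0, 160, 12)},
]

def find(words, query, min_ratio=0.8):
    """Return the regions whose coded text matches the query closely enough."""
    hits = []
    for w in words:
        ratio = SequenceMatcher(None, w["text"].lower(), query.lower()).ratio()
        if ratio >= min_ratio:
            hits.append(w["bbox"])   # region to highlight on the displayed bitmap
    return hits

approximate_hits = find(words, "display")      # tolerates the misread letter
exact_hits = find(words, "raster", 1.0)        # a ratio of 1.0 means identical text
print(approximate_hits, exact_hits)
```

An approximate match lets a search succeed on a word that was slightly misrecognized, while the highlight is drawn on the original bitmap, so the recognition error is never shown.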
In still another aspect of the present invention, a method for producing an image on a data processing apparatus includes performing recognition on an input bitmap to detect objects within the bitmap. A data structure is created to include coded portions corresponding to each of the objects and a non-coded portion, such as a word bitmap, corresponding to each of the coded portions. A visually perceptible image is then developed from the data structure. The image is derived from the non-coded portions of the data structure. Each of the objects preferably includes an associated confidence level, and non-identifiable objects correspond to unrecognized words which have a confidence level below a threshold confidence level. Objects having a confidence level below the threshold confidence level are displayed as said non-coded portions. Preferably, during said image developing step, the threshold confidence level is raised such that the confidence levels of all of the objects fall below the threshold confidence level, so that only the non-coded portions are displayed for all objects. Steps for searching the data structure for an inputted word or phrase and editing the coded portions of the data structure are also preferably included.
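The effect of raising the threshold during image development can be sketched as follows. This is an illustration only: the choose_display function, the dictionary layout of each object, the confidence values, and the use of an infinite threshold to stand for a threshold above every confidence level are assumptions, not details from the specification.

```python
def choose_display(objects, threshold):
    """Decide, per object, whether to show its bitmap or its rendered code."""
    plan = []
    for obj in objects:
        if obj["confidence"] < threshold:
            plan.append(("bitmap", obj["bbox"]))     # non-coded portion
        else:
            plan.append(("rendered", obj["code"]))   # coded portion
    return plan

# Hypothetical objects with assumed confidence levels and bitmap regions.
objects = [
    {"code": "the", "confidence": 0.97, "bbox": (10, 10, 40, 22)},
    {"code": "quick", "confidence": 0.55, "bbox": (46, 10, 90, 22)},
]

# With an ordinary threshold, only the low-confidence word is shown as a bitmap.
mixed = choose_display(objects, 0.80)

# With the threshold raised above every confidence level, every object is
# shown in its original bitmap form.
faithful = choose_display(objects, float("inf"))

print(mixed)
print(faithful)
```

The coded portions remain in the data structure in both cases; only the choice of what to draw changes.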
An advantage of the present invention is that objects within a digitized image are displayed in their original bitmap form instead of as recognized images. Misrecognized images therefore cannot produce display errors. The image displayed to the user is identical to the source image.
Another advantage of this invention is that the data structure includes coded data that can be searched, edited, and otherwise manipulated by a user.
These and other advantages of the present invention will become apparent to those skilled in the art upon a reading of the following specification of the invention and a study of the several figures of the drawing.