The invention relates to computing systems and, more particularly, to methods and apparatus for recognizing a background in a multicolor image.
Text recognition techniques, such as optical character recognition (OCR), can identify text characters or objects in an image (the "original image") stored as a pixelmap in a computer and convert the text into corresponding ASCII characters. An OCR program can differentiate between text objects and non-text objects (such as the background) in an image based on intensity differences between the text objects and the background. This can be accomplished when the text characters and the background are two distinct colors.
However, the task of recognizing text in a multicolor image is more difficult. For example, an image may include text characters, background, and non-text objects, such as graphical objects, having different colors. Furthermore, different blocks of text in the image may have different combinations of colors. For example, one text block may have red text against a white background and another text block may have yellow text against a black background.
Beyond text recognition, multicolor images present a further problem when attempting to reproduce the original image. Conventional OCR programs extract text from a pixelmap, and the remaining information is typically represented as a colored rectangle. Thus, a cyan page with black text would conventionally be reproduced as a cyan rectangle with black text rendered on top of the rectangle. The reason is that extracting the text may produce a text alignment for the rendered text that does not exactly match the original pixelmap. As such, to ensure no gaps are produced in the final rendered image, the reproduction of a pixelmap after OCR is typically limited to simple background rectangles. When operating on a multicolor image, conventional OCR programs typically reproduce the text over a colored rectangle without regard for gradients or patterns found in the background portion of the original image.
In general, in one aspect, the invention features a method for identifying and reproducing a background of a pixelmap that includes dividing the pixelmap into a grid of tiles, determining for each tile a background component and building a representation of a background in the pixelmap using the determined background component for each tile.
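By way of illustration only, the tile-based method of this aspect can be sketched as follows. The sketch assumes the pixelmap is a row-major list of rows of (r, g, b) tuples and, as a simplifying assumption not taken from this description, treats the most frequent color in a tile as that tile's background component. The function names and the tile_size parameter are hypothetical.

```python
from collections import Counter


def split_into_tiles(pixelmap, tile_size):
    """Yield (tile_row, tile_col, pixels) for each tile of a row-major pixelmap."""
    height = len(pixelmap)
    width = len(pixelmap[0])
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            pixels = [
                pixelmap[y][x]
                for y in range(top, min(top + tile_size, height))
                for x in range(left, min(left + tile_size, width))
            ]
            yield top // tile_size, left // tile_size, pixels


def tile_background(pixels):
    """Assumption: the most frequent color in a tile stands in for its background."""
    return Counter(pixels).most_common(1)[0][0]


def build_background(pixelmap, tile_size):
    """Return a low resolution pixelmap holding one background color per tile."""
    rows = (len(pixelmap) + tile_size - 1) // tile_size
    cols = (len(pixelmap[0]) + tile_size - 1) // tile_size
    background = [[None] * cols for _ in range(rows)]
    for r, c, pixels in split_into_tiles(pixelmap, tile_size):
        background[r][c] = tile_background(pixels)
    return background
```

In this sketch, a mostly cyan page with a few black text pixels yields a grid in which every tile is cyan, that is, a compact stand-in for the page background with the text removed.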
Aspects of the invention can include one or more of the following features. The step of determining a background component can include comparing the derived background component for a tile to the background component determined for one or more neighboring tiles, and if they do not match, adjusting the background component for the tile. The step of adjusting the background component can include determining if the tile is a picture tile or a text tile, and adjusting the background component of the tile to match neighboring picture or text tiles, respectively. The step of building a representation can include building a low resolution pixelmap for the background in the pixelmap. The step of determining a background component can include determining a background color. The step of determining a background component can include determining one or more background colors and a function defining a color transition in a given tile. The function can define a gradient of color distributed across the tile.
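The neighbor comparison described above can be sketched, purely as an example, as shown below. Several details are assumptions made for the sketch and are not specified here: the use of the four directly adjacent tiles as neighbors, a per-channel tolerance of 30 for deciding that two colors match, and a majority vote among neighbors of the same kind when an adjustment is needed. The names colors_match and smooth_backgrounds are hypothetical.

```python
from collections import Counter


def colors_match(a, b, tolerance=30):
    """Assumption: two colors match if every channel differs by at most tolerance."""
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))


def smooth_backgrounds(colors, kinds, tolerance=30):
    """Adjust each tile's background to agree with neighbors of the same kind.

    colors: grid of (r, g, b) background colors, one per tile.
    kinds:  grid of "text" or "picture" labels, one per tile.
    """
    rows, cols = len(colors), len(colors[0])
    result = [row[:] for row in colors]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                colors[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
                and kinds[nr][nc] == kinds[r][c]
            ]
            if not neighbors:
                continue  # no tile of the same kind to compare against
            if any(colors_match(colors[r][c], n, tolerance) for n in neighbors):
                continue  # already consistent with at least one neighbor
            # Mismatch: adopt the most common background among same-kind neighbors.
            result[r][c] = Counter(neighbors).most_common(1)[0][0]
    return result
```

A text tile whose derived background is red while all surrounding text tiles are white would, in this sketch, be adjusted to white; the same tile labeled as a picture tile would be left alone because it has no picture neighbors.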
The step of determining a background component can include analyzing color distributions for each of the tiles, identifying tiles having two main colors, grouping two-color tiles having similar colors into two-color zones and identifying a background component for each two-color zone. The method can further include mapping pixels in each tile to a three-dimensional color space, and defining, for each two-color tile, a cylinder that encloses the pixels. The cylinder has a height and a radius. The method can include classifying a tile as a text block if the ratio of radius to height is less than a predefined value. The building a representation of a background in the pixelmap step can build a representation for each text block using the determined background component for each tile. The predefined value can be approximately 0.35.
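One possible reading of the cylinder test is sketched below. The construction of the cylinder is an assumption made for illustration: its axis is taken to pass through the two most frequent colors of the tile, its height is the extent of the pixels along that axis, and its radius is the largest perpendicular distance of any pixel from the axis. Only the ratio test and the approximate value 0.35 come from the description above; the function names are hypothetical.

```python
import math
from collections import Counter


def enclosing_cylinder(pixels):
    """Return (height, radius) of a cylinder enclosing pixels in RGB space.

    Assumption: the axis joins the two most frequent colors in the tile.
    """
    common = Counter(pixels).most_common(2)
    if len(common) < 2:
        return 0.0, 0.0  # a single-color tile has no axis
    c0, c1 = common[0][0], common[1][0]
    axis = [b - a for a, b in zip(c0, c1)]
    length = math.sqrt(sum(v * v for v in axis))
    unit = [v / length for v in axis]
    along = []
    radius = 0.0
    for p in pixels:
        rel = [x - a for x, a in zip(p, c0)]
        t = sum(r * u for r, u in zip(rel, unit))   # position along the axis
        perp_sq = sum(r * r for r in rel) - t * t   # squared distance from the axis
        along.append(t)
        radius = max(radius, math.sqrt(max(perp_sq, 0.0)))
    return max(along) - min(along), radius


def is_text_block(pixels, threshold=0.35):
    """Classify a tile as text if radius / height is below the predefined value."""
    height, radius = enclosing_cylinder(pixels)
    return height > 0 and radius / height < threshold
```

Under these assumptions, a tile of black text on white with gray anti-aliased edge pixels gives a long, thin cylinder and is classified as text, while a tile that also contains saturated red and green pixels gives a wide cylinder and is not.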
In another aspect, the invention provides a computer-implemented method for recognizing and reproducing a background in a multicolor image stored in a computer. The method includes dividing the image into multiple blocks, analyzing color distributions for each of the blocks, identifying blocks having two main colors, grouping two-color blocks having similar colors into two-color zones, identifying a background color for each two-color zone and building a representation of the background using the determined background color for each two-color zone.
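The grouping of two-color blocks into zones can be illustrated with the following sketch. It assumes each two-color block has already been summarized as a pair of (color, pixel_count) entries, with None marking blocks that are not two-color. Adjacent blocks whose color pairs are similar are merged by a flood fill. Choosing the color that covers the most pixels as the zone background is an assumption of the sketch, as are the tolerance value and all function names.

```python
def similar(a, b, tolerance=30):
    """Assumption: colors are similar if every channel differs by at most tolerance."""
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))


def same_pair(p, q, tolerance=30):
    """True if two blocks share the same two main colors, in either order."""
    (a0, _), (a1, _) = p
    (b0, _), (b1, _) = q
    return (similar(a0, b0, tolerance) and similar(a1, b1, tolerance)) or (
        similar(a0, b1, tolerance) and similar(a1, b0, tolerance)
    )


def group_zones(blocks, tolerance=30):
    """Flood-fill adjacent two-color blocks with similar colors into zones."""
    rows, cols = len(blocks), len(blocks[0])
    zone_of = [[None] * cols for _ in range(rows)]
    zones = []
    for r in range(rows):
        for c in range(cols):
            if blocks[r][c] is None or zone_of[r][c] is not None:
                continue
            zone_id = len(zones)
            zone_of[r][c] = zone_id
            members, stack = [], [(r, c)]
            while stack:
                y, x = stack.pop()
                members.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < rows and 0 <= nx < cols
                        and blocks[ny][nx] is not None
                        and zone_of[ny][nx] is None
                        and same_pair(blocks[y][x], blocks[ny][nx], tolerance)
                    ):
                        zone_of[ny][nx] = zone_id
                        stack.append((ny, nx))
            zones.append(members)
    return zones


def zone_background(blocks, members, tolerance=30):
    """Assumption: the color covering the most pixels in a zone is its background."""
    first_y, first_x = members[0]
    ref0, ref1 = blocks[first_y][first_x]
    totals = [0, 0]
    for y, x in members:
        for color, count in blocks[y][x]:
            totals[0 if similar(color, ref0[0], tolerance) else 1] += count
    return ref0[0] if totals[0] >= totals[1] else ref1[0]
```

For the example given earlier, a block of red text on white next to a block of yellow text on black would fall into two separate zones, with white and black identified as the respective backgrounds.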
In another aspect, the invention provides a method for processing and reproducing a multicolor image represented as a pixelmap. The method includes dividing the pixelmap into a grid of tiles, determining for each tile a background component, building a representation of a background in the pixelmap using the determined background component for each tile, classifying each tile as one of either monochrome image or text tiles, processing the text tiles with an optical character recognition process to produce recognized text and reproducing the multicolor image. Reproducing the multicolor image includes rendering the representation of the background and rendering the recognized text. The step of rendering the recognized text can include overlaying the recognized text over the rendered background.
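The reproduction step can be sketched in simplified form as follows. The sketch assumes the background representation is a grid with one color per tile and expands it to full resolution by repeating each tile color. Font rasterization is outside the scope of the example, so the hypothetical glyph_pixels argument stands in for the pixel positions that rendering the recognized text would cover.

```python
def render_background(background, tile_size, width, height):
    """Expand a one-color-per-tile grid to a full resolution pixelmap."""
    return [
        [background[y // tile_size][x // tile_size] for x in range(width)]
        for y in range(height)
    ]


def overlay_text(canvas, glyph_pixels, text_color):
    """Return a copy of canvas with text pixels drawn over the background.

    glyph_pixels: iterable of (x, y) positions covered by the rendered text
    (a stand-in for real font rasterization).
    """
    result = [row[:] for row in canvas]
    for x, y in glyph_pixels:
        if 0 <= y < len(result) and 0 <= x < len(result[0]):
            result[y][x] = text_color
    return result
```

Because the background is rendered across the whole page before the text is drawn, a small shift in the position of the rendered text leaves no gaps in the output.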
In another aspect, the invention provides a data structure for a multicolor image and includes a file including a low resolution representation of the background of the multicolor image and a file containing recognized text characters located in the multicolor image.
In another aspect, the invention provides a data structure for a multicolor image and includes a file including a background portion that includes a low resolution representation of the background of the multicolor image and a text portion containing recognized text characters located in the multicolor image.
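As one hypothetical illustration of such a data structure, the sketch below keeps a background portion and a text portion in a single serialized file. The JSON encoding, the class and field names, and the per-item x and y positions are all assumptions of the example; no particular file format is prescribed here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RecognizedText:
    """Text portion entry: recognized characters and an assumed page position."""
    text: str
    x: int
    y: int


@dataclass
class MulticolorImageFile:
    """Single file holding a background portion and a text portion."""
    tile_size: int
    background: list  # rows of [r, g, b] lists, one per tile (low resolution)
    text: list = field(default_factory=list)  # list of RecognizedText

    def to_json(self):
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload):
        data = json.loads(payload)
        data["text"] = [RecognizedText(**item) for item in data["text"]]
        return cls(**data)
```

The two-file variant described in the preceding aspect could be obtained from the same classes by writing the background and text fields to separate files.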
In another aspect, the invention provides a method for creating a renderable representation of a multicolor image and includes scanning a multicolor image to produce a pixelmap, producing a low resolution representation of the background of the multicolor image from the pixelmap, recognizing text characters located in the pixelmap and storing the recognized characters as text along with the low resolution representation of the multicolor image.
In another aspect, the invention provides a method for reproducing a multicolor image and includes scanning a multicolor image to produce a pixelmap, producing a low resolution representation of the background of the multicolor image from the pixelmap, recognizing text characters located in the pixelmap, storing the recognized characters as text along with the low resolution representation of the multicolor image and reproducing the multicolor image including rendering the representation of the background and rendering the recognized text characters including overlaying the recognized text characters over the rendered background.
Among the advantages of the invention are one or more of the following. The background of a multicolor image is examined carefully and stored in a compact form for use after text recognition. Gradients and patterns in the background can be reproduced and rendered along with recognized text in support of an OCR process.
Other features and advantages of the invention will become apparent from the following description and from the claims.