The present disclosure relates to methods and apparatus for segmenting a page of image data into one or more windows and for classifying the image data within each window as a particular type of image data. Specifically, the present disclosure relates to apparatus and methods for differentiating background from document content.
Image data is often stored in the form of multiple scanlines, each scanline comprising multiple pixels. When processing such image data, it is helpful to know the type of image represented by the data. For instance, the image data may represent graphics, text, a halftone, continuous tone, or some other recognized image type. A page of image data may be all one type, or some combination of image types.
It is known in the art to separate the image data of a page into windows of similar image types. For instance, a page of image data may include a halftone picture with accompanying text describing the picture. It is further known to separate the page of image data into two or more windows, a first window including the halftone image, and a second window including the text. Processing of the page of image data may then be carried out by tailoring the processing of each area of the image to the type of image data being processed as indicated by the windows.
It is also known to separate a page of image data into windows and to classify and process the image data within the windows by making either one or two passes through the page of image data. Generally, images are presented to processing equipment and processed in a raster or other fashion such that at any given time, only a certain portion of the image data has been seen by the processing equipment, the remaining portion yet to be seen.
In a one pass system the image data is run through only once, whereas in a two pass system the image data is run through twice. The second pass does not begin until some time after the first pass has completed. A one pass method is generally quicker, but does not allow the use of “future” context to correct information that has already been generated. In a two pass method, information obtained for a third or fourth scanline may be used to generate or correct information on a first or second scanline, for example. In other words, during the second pass, “future” context may be used to improve the rendering of the image data because the image data was previously processed during the first pass.
During the first pass of a two pass method, pixels may be classified and tagged according to image type, and both the image video and the classification tag may be stored. Such tags and image video may be analyzed and the results used to associate pixels into larger windows. Statistics on the image video and classification tags may be gathered for each window, as well as for the area outside of all windows. After the first pass finishes but before the second pass begins, software may read the window and non-window statistics and may use calculations and heuristic rules to classify the delineated areas of windows and non-windows. During the second pass, the results of this classification, as well as the pixel tag and image video from the first pass, may be used to control optimized processing of the image video.
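The two pass flow described above may be illustrated by the following minimal sketch. It is not the method of any cited reference. The luminance thresholds, the tag names, the majority-vote rule, and the function names `tag_pixel`, `first_pass`, `classify_windows`, and `second_pass` are all hypothetical, and the window rectangles are assumed to be supplied rather than derived from the tags.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (top, left, bottom, right); bottom and right are exclusive. Hypothetical layout.
Window = Tuple[int, int, int, int]


def tag_pixel(value: int) -> str:
    """Toy per-pixel classifier; the thresholds are assumptions for illustration."""
    if value >= 230:
        return "background"
    if value <= 60:
        return "text"
    return "contone"


def first_pass(page: List[List[int]], windows: List[Window]):
    """Tag every pixel and gather per-window tag counts (the 'statistics')."""
    tags = [[tag_pixel(v) for v in row] for row in page]
    stats: Dict[int, Counter] = {i: Counter() for i in range(len(windows))}
    for i, (top, left, bottom, right) in enumerate(windows):
        for y in range(top, bottom):
            stats[i].update(tags[y][left:right])
    return tags, stats


def classify_windows(stats: Dict[int, Counter]) -> Dict[int, str]:
    """Between passes: label each window by its most common non-background tag."""
    classes = {}
    for i, counter in stats.items():
        content = {t: n for t, n in counter.items() if t != "background"}
        classes[i] = max(content, key=content.get) if content else "background"
    return classes


def second_pass(tags: List[List[str]], windows: List[Window],
                classes: Dict[int, str]) -> List[List[str]]:
    """Build a per-pixel processing plan: window class inside windows,
    the stored pixel tag everywhere else."""
    plan = [row[:] for row in tags]
    for i, (top, left, bottom, right) in enumerate(windows):
        for y in range(top, bottom):
            for x in range(left, right):
                plan[y][x] = classes[i]
    return plan


page = [
    [255, 255, 255, 255],
    [255, 20, 120, 255],
    [255, 30, 40, 255],
]
windows = [(1, 1, 3, 3)]
tags, stats = first_pass(page, windows)
classes = classify_windows(stats)
plan = second_pass(tags, windows, classes)
```

In this sketch the pixel at row 1, column 2 is tagged "contone" in the first pass but is handled as "text" in the second pass, because the window statistics gathered over later scanlines outvote the individual pixel tag.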
Typically, windows within a document image are detected as areas separated by white areas of the document. Exemplary methods and apparatus for classifying image data are discussed in U.S. Pat. Nos. 5,850,474 and 6,240,205 to Fan et al., each of which is incorporated herein by reference in its entirety. Such windowing methods generally depend heavily on the luminance and/or chrominance of the video to delineate the boundaries of windows.
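The dependence on luminance may be illustrated for a single scanline by the following minimal sketch, which is not taken from the cited patents. The cutoff `WHITE_THRESHOLD` and the function name `non_white_runs` are hypothetical. Pixels at or above the cutoff are treated as white, and each run of darker pixels between white areas is reported as a candidate window segment.

```python
from typing import List, Tuple

WHITE_THRESHOLD = 230  # hypothetical luminance cutoff on a 0-255 scale


def non_white_runs(scanline: List[int],
                   threshold: int = WHITE_THRESHOLD) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each run of
    non-white pixels in one scanline."""
    runs = []
    start = None  # index where the current non-white run began, if any
    for x, luminance in enumerate(scanline):
        is_white = luminance >= threshold
        if not is_white and start is None:
            start = x                 # a run opens at the first dark pixel
        elif is_white and start is not None:
            runs.append((start, x))   # a white pixel closes the open run
            start = None
    if start is not None:
        runs.append((start, len(scanline)))  # run reaches the scanline end
    return runs


segments = non_white_runs([255, 255, 10, 20, 255, 30, 30])
```

A fixed cutoff of this kind shows why such methods are sensitive to the background: on a page whose background is darker than the cutoff, the whole scanline would be reported as a single run.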