1. Field of the Invention
The invention relates to a method of segmenting a composite image of pixels into a number of fields corresponding to layout elements of the image, the pixels having a value representing the intensity and/or color of a picture element. The invention further relates to a device implementing the method, which device comprises an input unit for inputting an image, and a processing unit.
2. Discussion of Background Art
Several methods for segmenting a composite image, such as a document including text and figures, to identify fields corresponding to layout elements, are known in the art, and a common approach is based on processing the background. The image is represented by pixels that have a value representing the intensity and/or color of a picture element. This value is classified as background (usually white) or foreground (usually black, being printed space). The white background space that surrounds the printed regions on a page is analyzed.
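By way of illustration only, the classification of pixel values into background and foreground might be sketched as follows. The function name `classify_pixels`, the 8-bit grayscale value range, and the fixed threshold of 128 are assumptions made for this sketch and are not taken from any of the cited methods.

```python
def classify_pixels(image, threshold=128):
    """Return a grid of booleans that is True where a pixel counts as background.

    Assumption: pixel values are 8-bit intensities, with light values
    (at or above the threshold) treated as white background.
    """
    return [[value >= threshold for value in row] for row in image]


# A tiny 3x4 "page": a dark printed region surrounded by white space.
page = [
    [255, 255, 255, 255],
    [255,  10,  20, 255],
    [255, 255, 255, 255],
]
background = classify_pixels(page)
```

A fixed threshold is the simplest possible choice; a practical system would likely derive the threshold from the image itself.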
A method for page segmentation is known from the article “Image Segmentation by Shape-Directed Covers” by H. S. Baird et al. in Proceedings 10th International Conference on Pattern Recognition, Atlantic City, N.J., June 1990, pp. 820-825. According to this method, in an image to be analyzed, a set of maximal rectangles of background pixels is constructed, a maximal rectangle being a rectangle that cannot be enlarged without including a foreground pixel. Segmentation of the image into information-bearing fields, i.e. text columns, is achieved by covering the total image with a reduced set of the maximal rectangles. The remaining ‘uncovered’ area is considered foreground and may be used for further analysis. A problem of this method is that the fields are defined as areas in the pixel domain, which does not allow computationally efficient further processing.
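The notion of a maximal rectangle defined above can be made concrete with a short sketch. It tests whether a given rectangle of background pixels can be enlarged by one row or column on any side. The function names, the boolean grid representation, and the treatment of the image border as a limit to enlargement are assumptions of this sketch. It does not reproduce the construction used in the cited article.

```python
def all_background(bg, top, left, bottom, right):
    """True if the inclusive rectangle lies inside the image and holds only background."""
    if top < 0 or left < 0 or bottom >= len(bg) or right >= len(bg[0]):
        return False
    return all(bg[r][c]
               for r in range(top, bottom + 1)
               for c in range(left, right + 1))


def is_maximal(bg, top, left, bottom, right):
    """True if the rectangle is all background and cannot grow on any side."""
    if not all_background(bg, top, left, bottom, right):
        return False
    # One-pixel-wide strips bordering each side of the rectangle.
    strips = [
        (top - 1, left, top - 1, right),        # row above
        (bottom + 1, left, bottom + 1, right),  # row below
        (top, left - 1, bottom, left - 1),      # column to the left
        (top, right + 1, bottom, right + 1),    # column to the right
    ]
    # The rectangle can be enlarged if any bordering strip is pure background.
    return not any(all_background(bg, *strip) for strip in strips)


# True marks background; the two False pixels are a printed region.
bg = [
    [True, True,  True,  True],
    [True, False, False, True],
    [True, True,  True,  True],
]
top_row_is_maximal = is_maximal(bg, 0, 0, 0, 3)
partial_row_is_maximal = is_maximal(bg, 0, 0, 0, 2)
```

In this example the full top row is maximal, whereas its first three pixels are not, because that rectangle can still be extended one column to the right.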
U.S. Pat. No. 6,470,095 discloses a method of page segmentation in which text areas are first preprocessed in a number of processing steps to construct closed areas of black pixels, called “enclosure blobs”. In the remaining white spaces, bands of white space having a maximal length are constructed by suppressing bands of white space adjacent to a longer band. The final bands of white space, horizontal and vertical, are then replaced by their midlines. Finally, the junctions between horizontal and vertical midlines are detected, and loose ends are cut off. The remaining midline sections are used as delimiters of text fields. This known method involves a large number of processing steps and may in some instances give inaccurate results, namely when white spaces connect but their midlines do not.
Another method for page segmentation is known from the article “Flexible page segmentation using the background” by A. Antonacopoulos and R. T. Ritchings in Proceedings 12th International Conference on Pattern Recognition, Jerusalem, Israel, October 9-12, IEEE-CS Press, 1994, vol. 2, pp. 339-344. According to this method, the background white space is covered with tiles, i.e. non-overlapping areas of background pixels.
The contour of a foreground field in the image is identified by tracing along the white tiles that encircle it, such that the inner borders of the tiles constitute the border of a field for further analysis. A problem of this method is that the borders of the fields are represented by a complex description, which hinders efficient further analysis.
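To illustrate what covering the background with non-overlapping areas of background pixels can look like, the following sketch splits each pixel row into runs of consecutive background pixels. This row-wise decomposition, the function name `background_runs`, and the `(row, first_col, last_col)` tuple format are assumptions chosen for simplicity. The tiles of the cited article are constructed differently and are not reproduced here.

```python
def background_runs(bg):
    """Split the background into non-overlapping tiles, one per horizontal run.

    Each tile is reported as (row, first_col, last_col), with inclusive columns.
    """
    tiles = []
    for r, row in enumerate(bg):
        start = None  # column where the current background run began
        for c, is_bg in enumerate(row):
            if is_bg and start is None:
                start = c
            elif not is_bg and start is not None:
                tiles.append((r, start, c - 1))
                start = None
        if start is not None:  # run extends to the right edge of the image
            tiles.append((r, start, len(row) - 1))
    return tiles


# True marks background; the two False pixels are a printed region.
bg = [
    [True, True,  True,  True],
    [True, False, False, True],
    [True, True,  True,  True],
]
tiles = background_runs(bg)
```

Even in this small example, the printed region is bordered by four separate tiles, which hints at why a field border assembled from tile edges tends to become a lengthy description.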