1. Field of the Invention
The present invention relates to an image processing technique.
2. Description of the Related Art
With the recent progress of information digitization, systems that scan paper documents using a scanner or the like and then store the resulting electronic data or send it to another apparatus, instead of storing the paper documents as they are, are becoming widespread. These systems are required to compress digitized documents at a high ratio to reduce the sending cost. At the same time, the compressed electronic data is required to have high reusability, so that it can be partially edited, and high image quality, so that it does not degrade under either enlargement or reduction.
However, if document data containing both a text area and a photo area undergoes compression suitable for the text area, the compression ratio is low although the image quality is high. Conversely, if the data undergoes compression suitable for the photo area, the quality of characters degrades although the compression ratio is high. A method has been proposed to solve this problem. First, digitized document data (a document image) is separated into a text area and a photo area. The text area, for which reusability and image quality are important, is converted into vector data. The remaining areas, such as the photo area, which cannot easily be reproduced by vectorization, are compressed by JPEG. The compression results of the respective areas are combined and output, thereby obtaining a document image with high compressibility, reusability, and image quality (Japanese Patent Laid-Open No. 2004-265384).
Conventionally, a means has been proposed for vectorizing an image that is separated from a document image and includes many lines (to be referred to as a “line drawing image” hereinafter) (Japanese Patent No. 3049672). Proposed methods include, for example, an outline vector method of binarizing an image and approximating coarse contours extracted from closed lines in the binary image, and a line art method of approximating the cores of closed lines.
Another method has been proposed which expands the range of vectorization targets to improve the compressibility, reusability, and image quality of the document image (Japanese Patent Laid-Open No. 2006-344069). Still another proposed method takes an image that is conventionally compressed as a photo by JPEG and vectorizes a specific image (illustration) within it (to be referred to as a “clipart image” hereinafter), which is clear, as if it had edges on the contours of an object, and has only a limited number of appearing colors. This method divides an image into color areas based on color similarity, approximates the contour of each color area by the outline method, adds color information, and outputs vector data.
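The division into color areas based on color similarity described above can be illustrated with a minimal sketch. This is not the procedure of the cited publication: the greedy clustering rule, the function names `color_distance_sq` and `divide_into_color_areas`, and the `max_dist` parameter are all assumptions made for illustration, and the subsequent contour approximation step is omitted.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]


def color_distance_sq(a: Color, b: Color) -> int:
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def divide_into_color_areas(
    pixels: List[List[Color]], max_dist: int = 40
) -> Tuple[List[List[int]], List[Color]]:
    """Label each pixel with the index of a color cluster.

    Hypothetical greedy rule: a pixel joins the first cluster whose
    representative color lies within max_dist; otherwise it starts a
    new cluster with itself as the representative.
    """
    representatives: List[Color] = []
    labels: List[List[int]] = []
    limit = max_dist ** 2
    for row in pixels:
        label_row: List[int] = []
        for color in row:
            for idx, rep in enumerate(representatives):
                if color_distance_sq(color, rep) <= limit:
                    label_row.append(idx)
                    break
            else:
                # No existing cluster is similar enough: open a new one.
                representatives.append(color)
                label_row.append(len(representatives) - 1)
        labels.append(label_row)
    return labels, representatives


# A tiny image with slightly noisy red and blue regions.
image = [
    [(255, 0, 0), (250, 5, 3), (0, 0, 255)],
    [(252, 2, 1), (0, 3, 250), (2, 1, 254)],
]
labels, reps = divide_into_color_areas(image)
print(labels)  # [[0, 0, 1], [0, 1, 1]]
print(reps)    # [(255, 0, 0), (0, 0, 255)]
```

In this sketch a limited number of appearing colors, as in a clipart image, yields a small number of clusters, each of which could then be passed to a contour-approximation step.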
However, when image data is vectorized, whether an input image is a clipart image or a line drawing image has so far been determined by a developer based on experience, and the vectorization method to apply has been switched accordingly. The conventional technique cannot automatically determine whether an input image is a clipart image or a line drawing image and switch the vectorization method based on that determination. As for means for separating a document image into attribute areas such as a text area and a photo area, many methods have been proposed for determining whether a pixel is a text pixel or a photo pixel. However, these methods do not determine whether an overall document image to be processed is a clipart image or a line drawing image.
Various proposals have also been made regarding binary line drawing vectorization processing. For color line drawing vectorization processing, for example, a method has been proposed which generates three single-color multilevel images from an input image, extracts the shapes of isodensity lines of each multilevel image, and vectorizes them (Japanese Patent Laid-Open No. 11-069145).
However, generation of single-color multilevel images of a color line drawing is largely affected by thresholds, and the method cannot deal with scan noise well.
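The threshold sensitivity noted above can be illustrated with a small sketch. It does not reproduce the isodensity line extraction of the cited publication; it only splits a color image into three single-channel multilevel images and shows how a small change in threshold alters the detected line. The function names `split_channels` and `ink_mask`, the sample pixel values, and the threshold levels are assumptions chosen for illustration.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]


def split_channels(pixels: List[List[Color]]) -> List[List[List[int]]]:
    """Return three single-channel multilevel images in R, G, B order."""
    return [[[px[c] for px in row] for row in pixels] for c in range(3)]


def ink_mask(channel: List[List[int]], level: int) -> List[List[int]]:
    """Mark pixels darker than the given level as ink (1), others as 0."""
    return [[1 if v < level else 0 for v in row] for row in channel]


# One scanned row: white paper, a dark line, and two blurred edge pixels
# whose values sit close to mid-gray (hypothetical scan values).
scan = [[
    (250, 250, 250), (245, 246, 244), (130, 128, 131),
    (20, 22, 19), (125, 127, 126), (248, 249, 247),
]]
red, green, blue = split_channels(scan)

narrow = ink_mask(red, 128)
wide = ink_mask(red, 135)
print(narrow)  # [[0, 0, 0, 1, 1, 0]]
print(wide)    # [[0, 0, 1, 1, 1, 0]]
```

Here a shift of only a few levels in the threshold changes the line width from two pixels to three, because the blurred edge pixels introduced by scanning lie near the threshold.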