1. Field of the Invention
The present invention relates to an image processing technique. More specifically, the present invention relates to an image processing apparatus, an image processing method, a program, and a storage medium for eliminating blurring in a boundary part of an illustration in a scanned image.
2. Description of the Related Art
In recent years, with the progress of computerization, systems have become widespread that digitize a paper document with a scanner or the like, instead of preserving the paper document as it is, and transmit the resulting electronic data to another apparatus. To reduce the transmission cost, the digitized document is required to have high compressibility. At the same time, the compressed electronic data is required to offer reusability, meaning that the electronic data can be partially edited, and high image quality, meaning that image quality does not degrade even when the image is enlarged or reduced.
However, when a character region and a photograph region are mixed in document data, compression suitable for the character region provides high image quality but a low compression ratio, whereas compression suitable for the photograph region provides a high compression ratio but degrades the image quality of characters. Accordingly, the following method has been proposed. First, the digitized document data (document image) is separated into the character region and the photograph region, and the character region, where reusability and high image quality are important, is converted into vector data. Then, the other regions that are not easily vectorized, such as the photograph region, are compressed by JPEG, and the compression results of the respective regions are combined and output. This method is intended to realize high compressibility, reusability, and high image quality for a document image (refer to Japanese Patent Laid-Open No. 2004-265384).
A method has also been proposed for handling a graphic region (generally called an illustration, clip art, or line art), which is composed of several uniform colors and characterized by a clear outline, as an object of the vector processing in addition to the character region (refer to Japanese Patent Laid-Open No. 2006-344069). The method proposed there reduces the number of colors of an input image by using color similarity, then extracts the outline of each color region, approximates the outline by functions, and outputs vector data with color information added thereto.
To vectorize a scanned image, the scan noise contained in the input image must be reduced and the outline of the original image must be extracted, and for this purpose the number of colors needs to be reduced in preprocessing. When clustering is used, a method has been employed in which the number of colors is first narrowed down to some extent by the clustering, and more accurate color separation is then achieved by merging clusters having similar colors in post-processing. For example, Japanese Patent Laid-Open No. 2006-344069 proposes a method of eliminating, from the result of the clustering, any cluster whose number of pixels is smaller than a predetermined threshold value.
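The color reduction described above can be illustrated by the following minimal sketch. It is not the method of the cited publication, whose details are not given here; the greedy clustering rule, the reassignment of pixels from eliminated clusters to the nearest remaining cluster, and the names `cluster_colors`, `drop_small_clusters`, `color_threshold`, and `min_pixels` are all assumptions introduced for illustration only.

```python
from math import dist


def cluster_colors(pixels, color_threshold):
    """Greedy clustering (assumed rule): a pixel joins the nearest cluster
    if its color is within color_threshold, otherwise it starts a new one."""
    centers = []  # mean color of each cluster
    members = []  # pixel indices belonging to each cluster
    for i, p in enumerate(pixels):
        best = min(range(len(centers)),
                   key=lambda k: dist(p, centers[k]), default=None)
        if best is not None and dist(p, centers[best]) <= color_threshold:
            members[best].append(i)
            n = len(members[best])
            # Incremental update of the cluster's mean color.
            centers[best] = tuple(c + (v - c) / n
                                  for c, v in zip(centers[best], p))
        else:
            centers.append(tuple(float(v) for v in p))
            members.append([i])
    return centers, members


def drop_small_clusters(pixels, centers, members, min_pixels):
    """Eliminate clusters with fewer than min_pixels pixels. Their pixels are
    reassigned to the nearest kept cluster (an assumption of this sketch);
    the kept centers are not recomputed."""
    kept = [k for k in range(len(centers)) if len(members[k]) >= min_pixels]
    if not kept:
        return centers, members
    new_members = {k: list(members[k]) for k in kept}
    for k in range(len(centers)):
        if k in new_members:
            continue
        for i in members[k]:
            nearest = min(kept, key=lambda j: dist(pixels[i], centers[j]))
            new_members[nearest].append(i)
    return [centers[k] for k in kept], [new_members[k] for k in kept]


# Hypothetical data: two dominant colors plus one stray gray pixel (noise).
pixels = [(255, 255, 255)] * 5 + [(0, 0, 0)] * 4 + [(128, 128, 128)]
centers, members = cluster_colors(pixels, color_threshold=30)
centers2, members2 = drop_small_clusters(pixels, centers, members, min_pixels=2)
```

In this hypothetical input, the single gray pixel initially forms its own cluster and is absorbed into a neighboring cluster once clusters below the pixel-count threshold are eliminated.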
An illustration sometimes includes fine (thin) line art regions and strong edge components. Image processing of such an image causes blurring around these regions, and because the colors generated by the blurring differ from the colors of the input image, the blurring adversely affects the subsequent image processing. Further, vector data that is not originally needed is generated in the regions where the blurring has occurred, and as a result there arises a problem that the vector data size increases and data reusability is degraded.
Further, a renderer that supports a vector format sometimes has a lower limit on the line width it can render, depending on the performance limit of the renderer. When an image containing fine line-shaped noise is vectorized and output in a vector format subject to such a lower limit, there also arises a problem that the line-shaped noise is displayed thicker than it actually is, so that a line appears blurred and image quality appears degraded, as shown in FIG. 4. For example, if a part of one line image in an original document image is scanned as an image including two slightly different colors, that part is clustered into two parallel line-shaped regions, and vectorizing each of the line-shaped regions generates vector data expressing two neighboring fine lines. When the line width in this vector data is smaller than the lower limit of the line width that can be rendered, each line is widened by the rendering, and the part including the two lines therefore appears thicker and blurred. When displayed at 100%, as shown on the left side of FIG. 4, for example, the line width in the vector data is smaller than the renderable line width, and the line is therefore displayed as blurred. Note that, when the part appearing blurred is enlarged to 400%, it is displayed accurately according to the line width in the vector data, and the two lines in the vector data therefore sometimes appear as the original single line, as shown on the right side of FIG. 4.
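The effect of the lower limit of the line width can be illustrated numerically by the following sketch. The model is an assumption made only for illustration: each line is described by a center position and a width in document units, the renderer is assumed to clamp the drawn width to a hypothetical minimum of one device pixel, and the function name `rendered_extent` and all numeric values are not taken from FIG. 4.

```python
def rendered_extent(lines, zoom, min_width=1.0):
    """Total extent (in device pixels) covered by parallel lines when each
    line's drawn width is clamped to the renderer's assumed lower limit."""
    intervals = []
    for center, width in lines:
        w = max(width * zoom, min_width)  # clamp to the lower limit
        c = center * zoom
        intervals.append((c - w / 2, c + w / 2))
    left = min(lo for lo, _ in intervals)
    right = max(hi for _, hi in intervals)
    return right - left


# Hypothetical example: one line of width 1.0 in the original document ...
original = [(0.5, 1.0)]
# ... vectorized as two neighboring lines of width 0.5 in slightly different colors.
split = [(0.25, 0.5), (0.75, 0.5)]

for zoom in (1.0, 4.0):  # 100% and 400% display
    print(zoom, rendered_extent(original, zoom), rendered_extent(split, zoom))
```

Under these assumptions, at 100% the two half-width lines are each widened to the minimum width and together cover a wider extent than the original single line, whereas at 400% both representations cover the same extent, which is consistent with the behavior described for the left and right sides of FIG. 4.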