1. Field of the Invention
The present invention relates to apparatuses which generate vector information on the basis of character images or linear-drawing images included in drawings or document images and methods therefor.
2. Description of the Related Art
In recent years, there has been a growing need for paperless environments, and existing image data is therefore often digitized for reuse. One proposed method for reusing image data binarizes the image data, converts it into vector data using a vectorizing technique, and uses the vector data in CAD software or the like.
For example, U.S. Pat. No. 6,404,921 (JP Patent No. 3026592) discloses a method for vectorizing a binary image. Specifically, U.S. Pat. No. 6,404,921 discloses “a contour extraction method for raster-scanning an input image in a pixel matrix, and in accordance with a state of plural pixels in the pixel matrix, extracting a contour vector located at a boundary between a black pixel, wherein the contour vector is defined by a loop of connected vectors, wherein the coordinate of the extracted contour vector is registered in a first table and, when a vector flowing into or out of the contour vector is undetermined in the case that the state of the pixels represents a start portion in the main scanning or the sub-scanning, the vector is registered in a second table, and wherein the coordinate of the extracted contour vector is registered in the first table, and, when a vector flowing into or out of the contour vector by searching through the second table in the case that the state of the pixels represents an end portion in the main scanning or the sub-scanning, the vector is registered in the first table”. According to U.S. Pat. No. 6,404,921, because all contours in the image are extracted in a single raster scan and an image memory for storing all of the image data is not required, memory capacity can be reduced.
U.S. Pat. No. 5,878,161 (JP Patent No. 3049672) discloses an image processing apparatus that uses contour data of a binary image to obtain a high-quality image subjected to magnification-varying processing. Specifically, an outline vector is extracted from a binary image and subjected to magnification-varying processing to obtain an outline vector with a desired magnification, whereby a high-quality digital binary image with the desired magnification is obtained.
Contour data of a binary image may be generated by performing function-approximation processing using lines and Bezier curves.
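As an illustration of the curve primitive mentioned above, the following is a minimal Python sketch that evaluates a cubic Bezier segment from its four control points. It is not taken from either cited patent. The function name `cubic_bezier` and the choice of a cubic (rather than quadratic) segment are assumptions made for the example, and the fitting step that selects the control points from contour pixels is not shown.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier segment at parameter t in [0, 1].

    p0 and p3 are the end points; p1 and p2 are the control points.
    (Hypothetical helper for illustration only.)
    """
    u = 1.0 - t
    # Bernstein basis weights of degree 3; they sum to 1 for any t.
    b0 = u * u * u
    b1 = 3.0 * u * u * t
    b2 = 3.0 * u * t * t
    b3 = t * t * t
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


# An arch-shaped segment from (0, 0) to (1, 0).
arch = ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0))
start = cubic_bezier(*arch, 0.0)
middle = cubic_bezier(*arch, 0.5)
end = cubic_bezier(*arch, 1.0)
```

The segment passes through its end points at t = 0 and t = 1, so a contour approximated by a chain of such segments and straight lines remains a closed loop as long as consecutive segments share end points.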
In the techniques of generating outline vectors on the basis of contours disclosed in, for example, U.S. Pat. No. 6,404,921 and U.S. Pat. No. 5,878,161, vector data representing the contour of a line drawing including lines and curves is also generated. Such contour vector data is generated as loop-shaped vector data that follows the boundary (contour) between black pixels and white pixels. Contour vector data is therefore suitable when the drawing is reused as is, or when it is reused after being subjected only to magnification-varying processing.
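The following minimal Python sketch illustrates why a loop-shaped contour lends itself to magnification-varying processing. The list-of-points representation and the name `scale_loop` are assumptions for the example, not the data format of the cited patents.

```python
def scale_loop(loop, factor):
    """Scale every vertex of a closed contour loop about the origin.

    `loop` is a list of (x, y) corner points; the last point is
    implicitly joined back to the first. (Hypothetical helper.)
    """
    return [(x * factor, y * factor) for x, y in loop]


# Contour of a horizontal bar 10 units long and 2 units thick.
bar = [(0, 0), (10, 0), (10, 2), (0, 2)]
doubled = scale_loop(bar, 2)
```

Scaling the loop multiplies every dimension of the shape by the same factor. In this example both the length and the thickness of the bar double, so the shape is preserved.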
Meanwhile, there may be situations in which the drawing is reused after changing only the thickness of its lines or curves, or only their length or curvature. Although contour vector data is a suitable form for performing magnification-varying processing on the whole drawing while keeping its shape, it is not a suitable form when only the thickness or the length is changed. When only the thickness is changed, for example, linear drawings such as lines and curves are preferably represented by a linear vector indicating the center line together with line-thickness data, instead of by contour vector data.
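The difference between the two representations can be sketched in Python as follows. This is an illustrative sketch only: the `Stroke` type and the `stroke_to_contour` function are hypothetical names, and only a single straight segment is handled.

```python
from dataclasses import dataclass
import math


@dataclass
class Stroke:
    """Center-line form: two end points plus a line thickness (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float
    thickness: float


def stroke_to_contour(s):
    """Expand a center-line stroke into a closed four-point contour loop."""
    dx, dy = s.x1 - s.x0, s.y1 - s.y0
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("degenerate stroke: end points coincide")
    # Unit normal to the center line, scaled to half the thickness.
    nx = -dy / length * s.thickness / 2
    ny = dx / length * s.thickness / 2
    return [
        (s.x0 + nx, s.y0 + ny),
        (s.x1 + nx, s.y1 + ny),
        (s.x1 - nx, s.y1 - ny),
        (s.x0 - nx, s.y0 - ny),
    ]


thin = Stroke(0, 0, 10, 0, thickness=2)
thick = Stroke(0, 0, 10, 0, thickness=4)  # only the thickness field differs
thin_loop = stroke_to_contour(thin)
thick_loop = stroke_to_contour(thick)
```

In the center-line form, changing the thickness is a single field update, and the end points are untouched. In the contour form, the same change moves every vertex of the loop, which is why the contour form is inconvenient for this kind of edit.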
However, the related art has not considered a technique in which the vectorizing method is changed as appropriate.
Furthermore, the size of characters included in an image is not fixed even in business documents, and characters having different fonts and different sizes may therefore appear in a single document. If the same vectorizing method is applied to all of these characters, the desired character forms may not be obtained and the amount of data may increase considerably.