In recent years, with advances in the computerization of information, systems that generate computerized documents from paper documents and store or transmit them have spread rapidly. In particular, to computerize, store and transmit a full-color document containing a large amount of information, it is desirable to use vector data, obtained by segmenting a paper original into objects such as characters, tables and figures and generating electronic data in a form appropriate to each object. The use of vector data reduces the data amount and increases reusability.
Regarding objects such as characters, tables and lines, the data amount of each object can be reduced by performing outline processing on the contour of the object to generate vector data (outline vectors), thereby converting the object to a form represented with straight lines and curves. Further, a character can be converted to resolution-independent electronic data having high image quality, and a figure element such as a table or a line can be converted to electronic data that is easily handled in element-based editing.
Various methods have been studied for the above-described conversion to straight lines and curves. Most of these methods first divide the point sequence representing a profile line into segments, and then convert each segment to a curve or a straight line.
For example, a known method first linearly approximates a profile line to represent the contour as a set of straight lines, and then performs interpolation with B-spline curves, Bezier curves and the like to replace the straight lines with smooth curves. Segment dividing is generally known as the linear approximation processing. Further, other methods are disclosed in Non-Patent Document 1: “Optimum Segment Approximation for Flat and Curved Surfaces” (SATO, the Institute of Electronics, Information and Communication Engineers Paper '82/9 Vol. J65-D No. 9) and Non-Patent Document 2: “Space-Efficient Outlines from Image Data via Vertex Minimization and Grid Constraints”, Graphical Models and Image Processing Vol. 59 No. 2, pp. 73-88 (1997).
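As an illustration of the linear approximation step, the following is a minimal sketch of recursive segment dividing in the style of the Ramer-Douglas-Peucker algorithm. The choice of this particular algorithm, the function names `point_to_segment_distance` and `divide_segments`, and the `tolerance` parameter are assumptions made for illustration; they are not taken from the cited documents.

```python
import math


def point_to_segment_distance(p, a, b):
    """Return the distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto a-b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def divide_segments(points, tolerance):
    """Approximate a polyline with fewer straight segments.

    The point farthest from the chord joining the end points becomes
    a dividing point whenever its distance exceeds `tolerance`
    (a hypothetical parameter), and both halves are processed
    recursively.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_to_segment_distance(points[i], first, last)
        if d > worst_distance:
            worst_index, worst_distance = i, d
    if worst_distance <= tolerance:
        # One straight line is accurate enough for this run of points.
        return [first, last]
    left = divide_segments(points[: worst_index + 1], tolerance)
    right = divide_segments(points[worst_index:], tolerance)
    # The dividing point ends `left` and starts `right`; keep it once.
    return left[:-1] + right
```

The points returned by such a step would serve as the end points of the straight lines that are subsequently replaced with smooth curves.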
Further, as a method for dividing a profile line, a method of dividing a profile line represented with fine short vectors based on the angle from a point of interest has been disclosed (e.g., Patent Document 1: Japanese Patent Application Laid-Open No. 3-48375). Note that Non-Patent Document 3: Wolfgang BOHM: A Survey of Curve and Surface and Geometric Design 1, 1984, is cited for the interpolation used in this method.
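Angle-based dividing of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Patent Document 1: the turning angle is measured between the two short vectors meeting at each point, and the names `turning_angle`, `divide_by_angle` and the `threshold_deg` parameter are hypothetical.

```python
import math


def turning_angle(prev_pt, pt, next_pt):
    """Return the absolute change of direction, in radians, at pt."""
    incoming = math.atan2(pt[1] - prev_pt[1], pt[0] - prev_pt[0])
    outgoing = math.atan2(next_pt[1] - pt[1], next_pt[0] - pt[0])
    diff = abs(outgoing - incoming)
    # Fold the difference into the range [0, pi].
    return min(diff, 2.0 * math.pi - diff)


def divide_by_angle(points, threshold_deg):
    """Return indices of interior points where the profile line is divided.

    A point becomes a dividing point when the direction of the short
    vectors changes by more than `threshold_deg` degrees at that point.
    """
    threshold = math.radians(threshold_deg)
    return [
        i
        for i in range(1, len(points) - 1)
        if turning_angle(points[i - 1], points[i], points[i + 1]) > threshold
    ]
```

For an L-shaped point sequence such as `[(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]` and a threshold of 45 degrees, only the corner point is selected. No repetitive computation is involved, which is why this kind of dividing is fast, but a single fixed threshold is its only control.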
Further, a simplified method has been proposed that divides a curve according to the angular change in the tangential direction along the point sequence, generates a point sequence dividing the curve, and, in the approximation processing, determines two curves from the ratio of the positions of three consecutive points (e.g., Patent Document 2: Japanese Patent Application Laid-Open No. 8-153191).
Further, as an interpolation method, a method using a meromorphic quadratic Bezier curve has been proposed (e.g., Patent Document 3: Japanese Patent Application Laid-Open No. 6-282658). Note that this document does not disclose a dividing method.
On the other hand, regarding curve approximation processing, a method of approximating a divided point sequence by using nonlinear programming has been proposed (e.g., Patent Document 4: Japanese Patent Application Laid-Open No. 7-85110).
Further, approximation of an arbitrarily divided point sequence with plural functions by using DP (Dynamic Programming) has been proposed (e.g., Non-Patent Document 4: “Paper Document Digitization Using Function Figure Representation” (MORI Kohichi, WADA Kohichi and TORAICHI Kazuo, Information Processing Society of Japan Report Vol. 99, No. 57, pp. 17-23)).
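The general idea of choosing dividing points by dynamic programming can be sketched as follows. This is not the method of Non-Patent Document 4: as a simplifying assumption, each segment is fitted with a straight chord instead of plural functions, and the names `chord_error`, `dp_segmentation` and the `segment_penalty` parameter are hypothetical.

```python
import math


def chord_error(points, i, j):
    """Sum of squared distances from points[i+1..j-1] to the line i-j."""
    ax, ay = points[i]
    bx, by = points[j]
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return sum((x - ax) ** 2 + (y - ay) ** 2 for x, y in points[i + 1 : j])
    return sum(
        ((dx * (y - ay) - dy * (x - ax)) / length) ** 2
        for x, y in points[i + 1 : j]
    )


def dp_segmentation(points, segment_penalty):
    """Return indices of dividing points minimizing error plus penalty.

    best[j] is the lowest cost of approximating points[0..j]; every
    segment adds its fitting error and a fixed `segment_penalty`, so
    the penalty trades accuracy against the number of segments.
    """
    n = len(points)
    best = [math.inf] * n
    back = [0] * n
    best[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            cost = best[i] + chord_error(points, i, j) + segment_penalty
            if cost < best[j]:
                best[j], back[j] = cost, i
    # Trace the chosen dividing points back from the last point.
    breaks = [n - 1]
    while breaks[-1] != 0:
        breaks.append(back[breaks[-1]])
    return breaks[::-1]
```

Every pair of candidate end points is evaluated, so the number of error computations in this sketch grows quadratically with the length of the point sequence.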
An outline can be generated by using the above conventional methods. However, various problems occur when adaptive, high-speed processing is to be performed on all the objects obtained from a raster image. For example, in a case where curve dividing is performed by using the linear approximation processing, the linear approximation optimizes the error between a straight line and the point sequence, so the amount of processing increases due to repetitive computation if high accuracy is required.
Similarly, in a case where a point sequence is approximated with a curve, the error between the curve and the point sequence is optimized, so the amount of processing increases. On the other hand, simplified dividing that uses the angle between point sequences without repetitive computation is too simple to handle the full variety of profile lines, from small to large, without difficulty. Further, to realize optimum dividing without the influence of noise or the like, the processing for obtaining the point sequence (short vectors) forming a curve prior to dividing is important. However, no method for this processing is disclosed.
Further, a method has been proposed in which a characterizing point between anchor points is obtained by the curve approximation processing, and an optimum curve is easily obtained based on the ratio of the distances among three points. However, because the characterizing point is used as a new anchor point and two curves are obtained from three points including the two end anchor points, the numbers of curves and points increase.
Further, in all the conventional methods, the outline is evaluated uniformly regardless of its size. As a result, the number of points increases as the outline becomes larger, and the accuracy of approximation is degraded when the outline becomes extremely small. No countermeasure against these problems is disclosed.
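The effect of uniform evaluation can be illustrated with a simple worked case. The sketch assumes, purely for illustration, a circular outline approximated by equal chords under a fixed absolute tolerance; the function name `chords_needed` and its parameters are hypothetical. A chord subtending an angle θ on a circle of radius r deviates from the arc by at most r(1 − cos(θ/2)), so the number of chords follows from the tolerance.

```python
import math


def chords_needed(radius, tolerance):
    """Return the number of equal chords needed to approximate a circle.

    The maximum deviation of each chord from the arc (the sagitta)
    is kept within `tolerance`, which is the same absolute value
    regardless of the size of the outline.
    """
    if tolerance >= radius:
        # The tolerance is as large as the shape itself: two chords
        # already satisfy it, so the circle collapses to a line.
        return 2
    half_angle = math.acos(1.0 - tolerance / radius)
    return math.ceil(math.pi / half_angle)
```

With a tolerance of 1, a circle of radius 100 requires many more chords than a circle of radius 10, while a circle of radius 1 degenerates to two chords and loses its shape. This corresponds to the increase in points for large outlines and the loss of accuracy for extremely small outlines.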
Accordingly, the conventional methods have the following problems. The first problem is that the approximation processing is heavy. The second problem is that the simplified approximation processing intended to address the first problem cannot ensure accuracy or cannot reduce the data amount.
The third problem is that, in all the conventional methods, since the outline is evaluated uniformly, accuracy cannot be ensured across the sizes of objects subjected to outline processing, and the processing cannot adapt flexibly to changes in size.