With the recent growth of interest in environmental issues, the move to paperless offices has been rapidly promoted. For this purpose, a document management system is conventionally known which reads paper documents accumulated in binders by using a scanner, converts the read images into portable document format (to be abbreviated as “PDF” hereinafter) data, and accumulates the data in an image storage device as a database.
An image processing system has also been developed, which reads a paper document by using a scanner, extracts objects such as characters, tables, and illustrations on the document by executing image processing such as OCR (Optical Character Recognition) and edge extraction, and generates reusable vector data (e.g., Japanese Patent Application Laid-Open No. 5-342408).
In the above-described conventional image processing system that generates vector data, however, batch processing of a plurality of images has not been examined. For example, when document sheets each bearing a company's logotype are read, the same object appears repeatedly in the plurality of images. In storing such images in a memory as vector data, individually vectorizing and storing every occurrence of an object such as a logotype that appears many times is not efficient from the viewpoint of utilization of the memory as a limited hardware resource. In addition, to reuse the data stored in the memory, even similar objects must be edited individually, resulting in cumbersome operation. Furthermore, objects that are actually identical may be reconstructed as different objects due to conversion errors.
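The memory cost described above can be illustrated with a minimal sketch. It is not taken from the cited system. The dictionary representation of a vector object and the names `object_key`, `store_individually`, and `store_shared` are hypothetical and introduced only for illustration. The sketch also assumes that repeated objects vectorize to exactly identical data. As noted above, conversion errors may make them differ, in which case some form of similarity matching would be required in place of the exact hash comparison used here.

```python
import hashlib
import json


def object_key(vector_object):
    """Return a stable content hash for a vector object (a plain dict here)."""
    encoded = json.dumps(vector_object, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def store_individually(pages):
    """Conventional case: every occurrence of every object is stored."""
    return [list(page) for page in pages]


def store_shared(pages):
    """Hypothetical alternative: each distinct object is stored once,
    and each page holds only the keys of the objects it contains."""
    shared = {}
    page_refs = []
    for page in pages:
        refs = []
        for obj in page:
            key = object_key(obj)
            shared.setdefault(key, obj)  # keep the first copy only
            refs.append(key)
        page_refs.append(refs)
    return shared, page_refs


# Three scanned sheets, each bearing the same logotype (illustrative data).
logo = {"type": "logo", "paths": [[0, 0, 10, 0, 10, 10]]}
pages = [
    [logo, {"type": "text", "content": "Page 1"}],
    [logo, {"type": "text", "content": "Page 2"}],
    [logo, {"type": "text", "content": "Page 3"}],
]

individual = store_individually(pages)
shared, page_refs = store_shared(pages)

stored_individually = sum(len(page) for page in individual)
stored_shared = len(shared)
print(stored_individually, stored_shared)  # prints "6 4"
```

In this illustration the logotype is stored three times in the conventional case and once in the shared case, and an edit to the single shared copy would be reflected on every page that refers to it.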