In recent years, amid increased attention to environmental issues, the move toward paperless offices has been rapidly promoted. For this reason, document management systems have conventionally been built which scan paper documents accumulated and stored in binders and the like with a scanner or similar device, convert them into portable document format (to be abbreviated as “PDF” hereinafter) files, and store them in an image storage device (database).
A document management system is also known which recognizes character information included in image data obtained by scanning a document and associates it with font data (see, e.g., patent reference 1: Japanese Patent Laid-Open No. 5-12402). This arrangement facilitates re-use and re-editing of paper documents and the like.
However, in patent reference 1, character information obtained by character recognition is associated with font data prepared in advance, and the result is therefore not completely faithful to the character information as it appears in the document.
To cope with this, one conceivable technique is to extract the outline of character information included in scanned image data and obtain outline data. However, outline conversion requires a complicated process, and the processing time and load may increase when outline conversion is executed for a large number of characters.