In recent years, amid growing concern over environmental issues, the move toward paperless offices has been promoted, and various techniques for handling digital documents have been proposed.
For example, patent reference 1 (Japanese Patent Laid-Open No. 2001-358863) describes a technique for scanning a paper document with a scanner, converting the scanned data into a digital document format (e.g., JPEG or PDF), and storing the converted data in image storage means.
Patent reference 2 (Japanese Patent Laid-Open No. 8-147445) discloses a document management system that detects the regions of respective properties contained in a document image, and manages the document as contents of the respective regions.
Furthermore, patent reference 3 (Japanese Patent Laid-Open No. 10-285378) discloses the following technique. That is, a digital multi-function peripheral (MFP) (having a copy function, scan function, print function, and the like) checks whether a scanned image includes a graphic code indicating a page ID, and if the graphic code is found, a database is searched for the corresponding page ID. If the page ID is found in the database, the currently scanned image is discarded, print data associated with that page ID is read out, and a print image is generated and printed on a paper sheet by a print operation. On the other hand, if no corresponding page ID is found in the database, the scanned image is directly copied onto a paper sheet in a copy mode, or, in a facsimile or filing mode, a PDL command is appended to the scanned image to convert it into a PDL format, and the converted data is transmitted.
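The processing flow described for patent reference 3 can be illustrated by the following minimal sketch. It is not taken from that reference: the names `ScannedImage`, `handle_scan`, and `wrap_in_pdl`, the dictionary used as the database, and the placeholder PDL header are all hypothetical. The handling of a scan with no graphic code at all is also an assumption; here it is treated the same as a page ID that is not found.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ScannedImage:
    pixels: bytes                  # raw scanned image data
    page_id: Optional[str] = None  # page ID decoded from the graphic code, if any


def wrap_in_pdl(pixels: bytes) -> bytes:
    # Hypothetical placeholder: prepend a made-up PDL header to the raster data.
    # The raster itself is kept as is, so the result is no smaller than the scan.
    return b"%HYPOTHETICAL-PDL\n" + pixels


def handle_scan(image: ScannedImage,
                database: Dict[str, bytes],
                mode: str = "copy") -> Tuple[str, bytes]:
    """Return (action, payload) for one scanned page."""
    if image.page_id is not None and image.page_id in database:
        # Page ID found: discard the scan and print from the stored print data.
        return ("print_stored", database[image.page_id])
    if mode == "copy":
        # No matching page ID: copy the scanned image directly.
        return ("copy", image.pixels)
    # Facsimile or filing mode: convert to a PDL format and transmit.
    return ("transmit_pdl", wrap_in_pdl(image.pixels))
```

In this sketch, the stored print data is used only when the decoded page ID is registered; in every other case the raster scan itself is carried forward, either to the copy path or into the PDL wrapper.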
With the technique of patent reference 1, an image scanned by the scanner can be saved as a JPEG file or PDF file with a compact data size. However, this technique cannot search for a saved file on the basis of the printed document. Hence, a printed document must be scanned again whenever it is to be reused, and when print and scan processes are repeated in this way, the saved digital document image gradually deteriorates.
The technique of patent reference 2 divides an image into a plurality of regions and allows these regions to be reused as respective contents. However, the contents are searched for on the basis of a user's instruction, and the contents to be used are determined from among the found contents. Hence, when generating a document using the stored contents, the user must determine which contents to use, which is troublesome.
With the technique of patent reference 3, if no original digital document corresponding to a paper document is found, a PDL command is appended to the scanned image to convert that image into a PDL format. However, when the PDL command is merely appended to the scanned image in this way, the resulting file is relatively large.