1. Field of the Invention
The present invention relates to a technique for embedding information in a document.
2. Description of the Related Art
Techniques for embedding information in documents and then extracting the embedded information (referred to generally as “watermark information”) are useful for enhancing the security of documents.
For example, watermark information is embedded in a document image by adjusting the positions at which character images are placed, and the resulting document image is output as a print document. The print document is then captured as a document image by a scanner or the like, and the watermark information is extracted from the captured document image [see the specification of Japanese Patent Application Laid-Open No. 2005-253004 (Document 1)]. This method is highly resistant to copying because the positions at which the character images are placed are difficult to change by copying.
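The general idea of embedding by adjusting character positions can be illustrated with a minimal sketch. It is not the method of Document 1, whose details are not given here. It assumes a single text line with evenly spaced characters, and the names `embed_bits`, `extract_bits` and `delta` are hypothetical. Each bit is encoded by shifting one character so that the gap on its left becomes larger or smaller than the gap on its right.

```python
from typing import List


def embed_bits(positions: List[float], bits: List[int], delta: float = 0.5) -> List[float]:
    """Return new x-positions in which each bit is encoded by shifting one character.

    Assumption: the input characters are evenly spaced, so the two gaps
    around every shifted character start out equal.
    """
    if len(positions) < 2 * len(bits) + 1:
        raise ValueError("need at least 2 * len(bits) + 1 characters")
    shifted = list(positions)
    for i, bit in enumerate(bits):
        middle = 2 * i + 1  # character that sits between the two compared gaps
        # Bit 1: move right (left gap grows). Bit 0: move left (right gap grows).
        shifted[middle] += delta if bit else -delta
    return shifted


def extract_bits(positions: List[float], n_bits: int) -> List[int]:
    """Recover bits by comparing the left and right gap around each shifted character."""
    bits = []
    for i in range(n_bits):
        left_gap = positions[2 * i + 1] - positions[2 * i]
        right_gap = positions[2 * i + 2] - positions[2 * i + 1]
        bits.append(1 if left_gap > right_gap else 0)
    return bits


if __name__ == "__main__":
    original = [0.0, 10.0, 20.0, 30.0, 40.0]  # hypothetical character x-positions
    marked = embed_bits(original, [1, 0])
    print(marked)
    print(extract_bits(marked, 2))
```

Because the sketch relies only on the relative size of neighboring gaps, extraction needs no copy of the original layout, but it also fails whenever a later processing step equalizes or perturbs those gaps by more than `delta`.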
On the other hand, there is a technique in which watermark information is embedded by changing the layout information of electronic document data that includes text described in a page description format, after which the watermark information is extracted from the electronic document data [see the specification of Japanese Patent Application Laid-Open No. 2000-99501 (Document 2)]. A page description format is a language for specifying output to a printer, and it allows characters, figures and the like to be printed at the optimum character quality and image quality for each printer. Such formats are used in ordinary laser printers. Since the method of Document 1 embeds watermark information in a document image, printing at optimum character quality and image quality cannot be achieved. Accordingly, embedding watermark information in electronic document data described in a page description format is believed to be necessary.
However, because the technique described in Document 1 embeds watermark information in a document image, the document image must be converted to electronic document data in order to produce an output as electronic document data. This conversion involves block selection and optical character recognition (referred to as “OCR” below). Character spacing may vary during the conversion owing to OCR error or to the effects of hinting processing, which improves character quality when outline fonts are rendered. As a consequence, because the conversion to electronic document data takes place after the watermark information has been embedded in the document image, there are instances where the embedded watermark information cannot be extracted correctly. In addition, the conversion involves a large amount of processing.
On the other hand, the technique described in Document 2 embeds watermark information in electronic document data. When the electronic document data is output as a print document, therefore, there are many cases where watermark information identical with that embedded in the electronic document data cannot be extracted from the document image of the print document.