1. Field of the Invention
The present invention relates to a document processing apparatus and a document processing method, and in particular relates to a document processing apparatus that determines whether or not a document image has watermark information embedded therein by the use of line spacing, and a document processing method therefor.
2. Description of the Related Art
In order to include information such as copyright notices or copy restrictions invisibly in a document image, methods for embedding information by slightly changing line spacing are well known (e.g., Kineo Matsui, “Fundamentals of Digital Watermarking-New Technology for Protection of Multimedia Contents,” Morikita Publishing Co., Ltd., pp. 198-199). Hereinafter, information that has been embedded by the use of line spacing is referred to as a line-spacing watermark.
In one example of an information embedding rule using line spacing, adjacent line spaces are sequentially grouped into pairs, and each pair is defined as either “0” or “1” according to the relative sizes of its two line spaces. The resulting data string of “1”s and “0”s represents a line-spacing watermark.
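The pairing rule described above can be sketched as follows. This is only an illustrative sketch: the function name `decode_pairs`, the convention that a larger first space means “1”, and the handling of equal spaces and of an unpaired trailing space are all assumptions, since the rule is described here only in general terms.

```python
from typing import List


def decode_pairs(line_spaces: List[int]) -> List[int]:
    """Map consecutive pairs of line spaces to bits.

    Assumed convention (not specified in the text): a pair whose first
    space is larger than its second encodes 1; otherwise it encodes 0.
    Equal spaces are treated as 0 here purely for simplicity.
    """
    bits = []
    # Step through the spaces two at a time; an unpaired trailing
    # space is ignored.
    for i in range(0, len(line_spaces) - 1, 2):
        first, second = line_spaces[i], line_spaces[i + 1]
        bits.append(1 if first > second else 0)
    return bits


if __name__ == "__main__":
    # Six line spaces (in pixels) form three pairs.
    print(decode_pairs([12, 10, 10, 12, 11, 9]))
```

Under the assumed convention, six measured line spaces yield three bits, so the capacity of such a watermark is roughly half the number of line spaces on the page.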
In one example of a method for extracting such a line-spacing watermark embedded in a document image, the line spaces are extracted first: the entire document image is scanned so as to obtain a histogram in the sub-scanning direction, and the line spaces are calculated based on this histogram. Thereafter, the calculated line spacing values are compared within each pair so as to define whether the information is “0” or “1” according to the rules used at the time of embedding. The presence or absence of watermark information is then determined based on the resulting data string, and when it is determined that watermark information is present, the information is extracted.
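The histogram step of the extraction described above can be sketched as follows. The sketch assumes a binarized image given as a list of rows (1 for a dark pixel, 0 for background), with the row index running along the sub-scanning direction, and it assumes that a line space is the run of blank rows between two text lines. The names `row_histogram`, `line_spaces`, and `threshold` are hypothetical and chosen only for illustration.

```python
from typing import List


def row_histogram(image: List[List[int]]) -> List[int]:
    """Count dark pixels in each row of a binarized image."""
    return [sum(row) for row in image]


def line_spaces(hist: List[int], threshold: int = 0) -> List[int]:
    """Return the lengths of blank runs lying between text lines.

    A row counts as text when its dark-pixel count exceeds `threshold`.
    Blank rows before the first text line and after the last one are
    margins, not line spaces, so they are not reported.
    """
    spaces = []
    gap = 0
    seen_text = False
    for count in hist:
        if count > threshold:
            # A text row closes any blank run that followed earlier text.
            if seen_text and gap > 0:
                spaces.append(gap)
            gap = 0
            seen_text = True
        elif seen_text:
            gap += 1
    return spaces


if __name__ == "__main__":
    image = [
        [0, 0, 0],
        [1, 1, 0],  # text line
        [0, 0, 0],
        [0, 0, 0],
        [0, 1, 1],  # text line
        [0, 0, 0],
        [1, 0, 0],  # text line
    ]
    print(line_spaces(row_histogram(image)))
```

Note that the histogram covers every row of the image, so the whole page must be scanned before any line space, and hence any bit, becomes available.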
Such a line-spacing watermark is used in a copying machine, for example. Specifically, line-spacing watermark information embedded in a document to be copied is extracted, and copying is permitted or prohibited according to the contents of the information.
However, in the above-described conventional line-spacing watermark embedding method, since line spacing is measured at the time of its extraction, the presence or absence of watermark information can be determined only after obtaining a histogram via a scan of an entire document image and then extracting a data string of “0”s and “1”s. For this reason, it takes considerable processing time to determine the presence or absence of watermark information.
For example, in the case where a copying machine controls copy permission based on watermark information, the presence or absence of watermark information must be determined in as short a time as possible in order to prevent a delay in the series of copy operations.