1. Field of the Invention
The present invention relates to a document processing apparatus and a document processing method, and in particular relates to a document processing apparatus that determines whether or not a document image has watermark information embedded therein by the use of character spacing, and a document processing method therefor.
2. Description of the Related Art
To include information such as copyright notices or copy restrictions invisibly in a document image, methods that embed information by slightly changing character spacing, that is, the width of the blank between adjacent characters, are well known (e.g., Japanese Patent Laid-Open No. 2002-232679). Hereinafter, information embedded by the use of character spacing is referred to as a character-spacing watermark.
As an example of an information embedding rule using character spacing, a rectangle circumscribing each character is first extracted, and pairs of distances between adjacent circumscribed rectangles, that is, character spacing values P and S, are then extracted sequentially. For each pair, either “0” or “1” is defined according to the relative sizes of the character spacing values P and S. The resulting data string of “1”s and “0”s represents a character-spacing watermark.
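By way of illustration only, the embedding rule described above may be sketched as follows. The sketch assumes that a pair with P > S encodes “1” and a pair with P < S encodes “0”, that consecutive pairs do not overlap, and that the sum P + S of each pair is preserved; none of these details is fixed by the description above. The function names and the `delta` parameter are hypothetical.

```python
from typing import List, Tuple


def embed_bit(p: float, s: float, bit: int, delta: float = 1.0) -> Tuple[float, float]:
    """Return an adjusted (P, S) pair whose ordering encodes `bit`.

    Assumed rule (not specified by the source): P > S means 1, P < S means 0.
    The sum P + S is kept unchanged so the surrounding layout does not shift.
    """
    mean = (p + s) / 2.0
    if delta <= 0 or delta >= mean:
        raise ValueError("delta must be positive and smaller than the mean spacing")
    if bit == 1:
        return mean + delta, mean - delta
    return mean - delta, mean + delta


def embed_bits(spacings: List[float], bits: List[int], delta: float = 1.0) -> List[float]:
    """Encode `bits` into consecutive, non-overlapping pairs of spacing values."""
    out = list(spacings)
    for i, bit in enumerate(bits):
        j = 2 * i
        if j + 1 >= len(out):
            break  # not enough spacing values left to carry further bits
        out[j], out[j + 1] = embed_bit(out[j], out[j + 1], bit, delta)
    return out


# Usage: encode the bits 1 and 0 into four spacing values.
adjusted = embed_bits([4.0, 4.0, 6.0, 2.0], [1, 0], delta=1.0)
```

Preserving P + S in each pair is one possible design choice; it keeps the overall line length constant so that only the character between the two spaces moves.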
Such a character-spacing watermark embedded in a document image is extracted as follows. First, a rectangle circumscribing each character is extracted, together with pairs of distances between adjacent circumscribed rectangles, that is, character spacing values P and S. Next, the relative magnitudes of the character spacing values P and S are compared for each pair, and each pair is decoded as “0” or “1” according to the rule used at the time of embedding. Thereafter, the data string formed by concatenating the decoded “0”s and “1”s is verified to determine the presence or absence of watermark information, and when watermark information is determined to be present, the information is extracted.
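By way of illustration only, the decoding portion of the extraction procedure described above may be sketched as follows. The sketch assumes that each circumscribed rectangle is reduced to its left and right x-coordinates on a single text line, that P > S decodes to “1” and P < S decodes to “0”, that consecutive pairs do not overlap, and that a pair with P = S carries no information and is skipped. These details and all names are hypothetical, and the subsequent verification of the data string is not shown because the description above does not specify it.

```python
from typing import List, Tuple

# (left, right) x-coordinates of the rectangle circumscribing one character.
Box = Tuple[float, float]


def spacings_from_boxes(boxes: List[Box]) -> List[float]:
    """Return the blank widths between horizontally adjacent character boxes."""
    ordered = sorted(boxes)  # left-to-right reading order on one text line
    return [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]


def decode_bits(spacings: List[float]) -> List[int]:
    """Decode consecutive, non-overlapping (P, S) pairs into bits.

    Assumed rule (not specified by the source): P > S means 1, P < S means 0.
    """
    bits = []
    for i in range(0, len(spacings) - 1, 2):
        p, s = spacings[i], spacings[i + 1]
        if p > s:
            bits.append(1)
        elif p < s:
            bits.append(0)
        # Assumption: a pair with equal spacing carries no bit and is skipped.
    return bits


# Usage: five character boxes give four spacing values, hence two pairs.
boxes = [(0, 10), (15, 25), (28, 38), (41, 51), (56, 66)]
gaps = spacings_from_boxes(boxes)
bits = decode_bits(gaps)
```

Note that every pair on the page has to be decoded before the resulting data string can be verified, which is the source of the processing time discussed below.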
However, with the above-described conventional character-spacing watermark method, the presence or absence of watermark information can be determined only after the process for extracting the data string of “0”s and “1”s has been completed. Determining the presence or absence of watermark information therefore takes considerable processing time.
For example, in the case where a copying machine controls copy permission based on watermark information, the presence or absence of watermark information must be determined in as short a time as possible in order to prevent a delay from occurring in the series of copy operations.