1. Field of the Invention
The present invention relates to an image processing technique of embedding watermark information in a document image based on the line spacing between the character strings in the document image.
2. Description of the Related Art
As a technique for adding information such as copyright or copy control information to a document image, a method of embedding information using line spacing (hereinafter referred to as a line spacing watermark) is known, as described in Kineo Matsui, “Basics of a digital watermark”, Morikita Publishing Co., Ltd., pp. 198-199, 1998 (ISBN:4-627-82551-X). FIG. 3 is a view showing the concept of a line spacing watermark. To embed, e.g., binary information “0”, line spacings U and D are set to satisfy U>D, as shown in FIG. 3. Conversely, to embed binary information “1”, the line spacings U and D are set to satisfy U<D. To extract embedded information from a document image using a line spacing watermark, the line spacing between the character strings in the document image is obtained first. Generally, the document image is fully scanned to obtain a histogram, and the line spacing is derived from the histogram. The information is then extracted in accordance with the derived line spacing and the rule used for embedding.
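The conventional extraction procedure described above can be illustrated by the following minimal sketch. It is not taken from the cited reference. It assumes a binarized document image given as a list of rows in which 1 denotes a black pixel, a histogram that counts black pixels per row, and line spacings grouped into consecutive non-overlapping pairs (U, D). The function names and the helper `make_page` are hypothetical and are introduced only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: rows of 0 (white) / 1 (black)


def row_histogram(image: Image) -> List[int]:
    """Fully scan the image and count the black pixels in every row."""
    return [sum(row) for row in image]


def text_line_bands(hist: List[int]) -> List[Tuple[int, int]]:
    """Return (top, bottom) row indices, inclusive, of each run of non-empty rows."""
    bands = []
    start = None
    for y, count in enumerate(hist):
        if count > 0 and start is None:
            start = y                      # a character string begins
        elif count == 0 and start is not None:
            bands.append((start, y - 1))   # the character string ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(hist) - 1))
    return bands


def line_spacings(bands: List[Tuple[int, int]]) -> List[int]:
    """Number of blank rows between each pair of adjacent character strings."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(bands, bands[1:])]


def extract_bits(image: Image) -> List[int]:
    """Apply the rule U > D -> 0, U < D -> 1 to consecutive pairs of spacings."""
    spacings = line_spacings(text_line_bands(row_histogram(image)))
    bits = []
    for i in range(0, len(spacings) - 1, 2):  # assumed pairing: (s0, s1), (s2, s3), ...
        u, d = spacings[i], spacings[i + 1]
        if u > d:
            bits.append(0)
        elif u < d:
            bits.append(1)
        # u == d encodes nothing under this rule and is skipped
    return bits


def make_page(gaps: List[int], line_height: int = 2, width: int = 4) -> Image:
    """Hypothetical helper: build a synthetic page whose line spacings equal `gaps`."""
    rows: Image = []
    for gap in gaps:
        rows += [[1] * width for _ in range(line_height)]
        rows += [[0] * width for _ in range(gap)]
    rows += [[1] * width for _ in range(line_height)]
    return rows
```

For a synthetic page built with `make_page([5, 3, 2, 6])`, the first pair satisfies U>D and the second satisfies U<D, so `extract_bits` yields the bits 0 and 1 in that order. Note that `row_histogram` visits every pixel of the image, which is the full scan referred to above.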
However, the above-described method of extracting information embedded in a document image using a line spacing watermark has the following problems. To measure the line spacing, it is necessary to fully scan the document image and obtain a histogram. Hence, the information extraction process is time-consuming. In particular, when copy control information is embedded, a copying machine must first extract the copy control information, then determine based on the extracted information whether copying is permitted, and only then perform the copy process. As a result, the series of processes for copying a single document takes a long time.
Additionally, the prior art does not describe a method of embedding information in a mixed document image, that is, a document image containing character strings, photos, and graphics. Nor does it describe a method of extracting information from such a mixed document image.