The present invention relates to authenticating digital data, especially digital images, and, in particular, relates to authenticating such data by means of embedded code that functions to authenticate the data in a manner similar to that of a watermark in a piece of paper.
Digital images, digital video, soundtracks, and any other documents or files in an electronic format can be easily copied. Even though such copying may violate copyright laws, it is widespread. The ease with which electronic files may be copied without loss of content significantly contributes to illegal copying.
Furthermore, a recipient of even a lawfully transmitted digital document may need to authenticate it. That is, the recipient must determine that the document came from the person who is alleged to have sent it and not from someone trying to masquerade as that person. (Such falsification of the sender's identity is known as spoofing.) One can authenticate a communication by classical cryptographic means, such as combining a public-key system with a hash function.
With digital watermarking one can distribute a document without encryption. Thus digital watermarking is especially useful for protecting property rights in images, video streams, and audio tracks. The watermark is embedded in the image and not just appended to the document, as a cryptographic signature is. The watermark stays in the document as long as the document is recognizable. Therefore one can prove that an image, whether still or moving, or a phonorecord is an unauthorized copy or derivative work of a copyrighted work.
These problems of preserving the rights of authors and of assuring recipients that they have not been spoofed raise the question of authenticating digital data. Can a method be found to sign or mark the electronic document to indicate its origin unambiguously? Can such a sign or mark be protected against tampering?
An electronic document can be signed or marked either by public-key encryption or digital watermarks. The present invention is directed to methods of authentication based on digital watermarks. Such a watermark must be embedded in the document in an invisible form. That is, it should not interfere with the content of the original, and adding a watermark to a document should not cause any visible artifacts to appear.
The most important requirement for digital watermarks is that they be robust with respect to common procedures of image processing. Such procedures include kernel filtering, lossy compression, conversion between digital and analog format, resampling, cropping, affine transforming, removing or adding features, adding noise, copying, and scanning.
The second important requirement for digital watermarks is that they be non-removable even with a complete knowledge of the watermarking algorithm. This requirement can be met if the algorithm for embedding the watermark is made cryptographically strong by means of a secret key.
The purpose of a digital watermark is to prove the ownership of digital data. A pirate may crop some small portion of the image, adjust colors and brightness, change the resolution, and apply various filters. Such modifications will of course disturb the watermark. Nevertheless, so long as the original portion of the image is recognizable, the watermark should be detectable by using sophisticated algorithms. The detection itself is an algorithmic process that depends on a secret password chosen by the document's author.
To prove that a watermark is present in an image, one must show that a certain relationship among pixels (the reconstructed watermark) can occur by chance only with an extremely low probability. It should be computationally infeasible to remove the watermark, even with complete knowledge of the watermarking algorithm, unless one has the secret password. In other words, breaking the watermark should be nearly impossible without the password even if one knows the algorithm. This principle, named after Kerckhoffs, is commonly accepted in the field of cryptography.
A robust watermarking algorithm can settle a dispute about ownership of a digital document. Let A create a digital image and watermark it with his key. If B gets hold of the watermarked image to steal it (i.e., claim ownership), the best B can do is to embed his watermark into the image and claim that he can prove ownership. If the watermark is robust, A can prove his ownership to a judge or other authority because A can detect his watermark in the image marked by B, while B cannot do that for the image that A has. Since we trust that the watermark cannot be completely removed unless one modifies the image beyond recognition, A proves he owns the image. The watermark's robustness to change plays a crucial role in this dispute, because, if B can filter out the original watermark, the watermark is useless.
The best watermarking algorithms currently available are based on spread-spectrum techniques (R. C. Dixon, Spread Spectrum Systems with Commercial Applications (New York, Wiley, 1994)). Hartung and Girod ("Digital Watermarking of Raw and Compressed Video", Proc. European EOS/SPIE Symposium on Advanced Imaging and Network Technologies, Berlin, Germany, October 1996) describe, in a series of papers, methods for marking MPEG-2 digital video. Their method uses direct-sequence encoding in the spatial domain. It is well known that any watermarking scheme can be interpreted as overlaying a specific pattern on the image. In Hartung and Girod's method, the pattern is a linear combination of basis functions that are orthogonal to typical images. While this method enables extraction of a watermark without the original image, it is less robust with respect to image modifications. Also, their technique is vulnerable to a collusion attack, that is, averaging several watermarked copies of the same document in the hope of "averaging out" the watermark.
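The averaging form of the collusion attack can be illustrated with a minimal sketch. It is not Hartung and Girod's scheme: it assumes a generic additive watermark of hypothetical strength `a` made of pseudo-random +1/-1 values, and, for simplicity, it measures the surviving mark against the original host, which a blind detector would not have. All names and parameters are illustrative.

```python
import random

def average_copies(copies):
    """Average several equally sized watermarked copies, sample by sample."""
    m = len(copies)
    return [sum(column) / m for column in zip(*copies)]

def mark_strength(candidate, host, mark):
    """Mean correlation of the residual (candidate - host) with a +1/-1 mark."""
    n = len(host)
    return sum((c - h) * s for c, h, s in zip(candidate, host, mark)) / n

rng = random.Random(3)
n, m, a = 4096, 4, 2.0          # samples, colluding copies, mark strength (assumed)
host = [rng.uniform(0, 255) for _ in range(n)]
marks = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]

# Each recipient gets the same host carrying a different additive mark.
copies = [[h + a * s for h, s in zip(host, mark)] for mark in marks]
colluded = average_copies(copies)

single = mark_strength(copies[0], host, marks[0])   # strength in one copy
after = mark_strength(colluded, host, marks[0])     # strength after averaging
print(single, after)
```

In this sketch the strength of each individual mark in the averaged copy falls to roughly `a / m`, which is why averaging enough copies can push a weak mark below a detection threshold.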
W. Bender, D. Gruhl, and N. Morimoto ("Techniques for Data Hiding," 2420 Proc. SPIE 40 (1990)) describe techniques for hiding data in digital images and audio streams. Their patchwork method is a "stochastic" spread-spectrum technique in the spatial domain. No image escrow is needed. Their method appears to be vulnerable to a collusion attack. Another of their methods is called texture block coding. A small, random-looking area is copied into a different random-looking area of the image. This copying creates a correlation that will not be disturbed by any image-processing operation except cropping. If those areas contain an 8×8 square, the method will also be robust with respect to JPEG compression of any quality. No image escrow is needed. However, the watermark is easy to detect and remove. Another disadvantage is that it is image dependent, and not all images will have the required random-looking areas.
Pitas and Kaskalis ("Signature Casting on Digital Images", Proc. IEEE Workshop on Nonlinear Signal and Image Processing, Neos Marmaras, Halkidiki, Greece, Jun. 20-22, 1995) have described a method for applying signatures to digital images. The pixels of a digital image are divided into two disjoint sets of the same size. Pixels in one set are offset by an amount k while the pixels in the other set are left untouched. Although the authors report no robustness study, the similarity of this technique to the patchwork of Bender et al. suggests that the method will have similar robustness.
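The two-set construction described above can be sketched as follows. The key-driven split and the offset `k` follow the description. The detection statistic (the difference between the mean values of the two sets) is an assumption made for illustration, since the text does not give the authors' detection rule. The function names are hypothetical, and clipping to the valid pixel range is ignored.

```python
import random

def split_indices(n_pixels, key):
    """Use the secret key to split pixel indices into two disjoint, equal sets."""
    rng = random.Random(key)
    idx = list(range(n_pixels))
    rng.shuffle(idx)
    half = n_pixels // 2
    return idx[:half], idx[half:2 * half]

def cast_signature(pixels, key, k):
    """Offset the pixels of the first set by k; leave the second set untouched."""
    set_a, _ = split_indices(len(pixels), key)
    marked = list(pixels)
    for i in set_a:
        marked[i] += k
    return marked

def mean_difference(pixels, key):
    """Assumed detection statistic: mean of set A minus mean of set B."""
    set_a, set_b = split_indices(len(pixels), key)
    mean_a = sum(pixels[i] for i in set_a) / len(set_a)
    mean_b = sum(pixels[i] for i in set_b) / len(set_b)
    return mean_a - mean_b

rng = random.Random(0)
pixels = [rng.randrange(0, 248) for _ in range(10000)]  # hypothetical gray levels
key, k = 1234, 8
marked = cast_signature(pixels, key, k)

# With the right key the statistic moves by k; with a wrong key it barely moves.
shift_right = mean_difference(marked, key) - mean_difference(pixels, key)
shift_wrong = mean_difference(marked, 9999) - mean_difference(pixels, 9999)
print(shift_right, shift_wrong)
```

A detector without the original image would have to compare the statistic of the marked image against its expected spread for unmarked images, which is where the robustness question raised above comes in.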
Zhao and Koch ("Embedding Robust Labels Into Images For Copyright Protection," Proc. Int. Congr. on IPR for Specialized Information, Knowledge and New Technologies (Vienna, Austria, Aug. 21-25, 1995)) propose a watermark based on JPEG compression. One bit of information is embedded into middle frequencies of pseudo-randomly chosen 8×8 pixel blocks. In each block, a triple of frequencies obtained by discrete cosine transform is chosen out of 18 predetermined frequencies. Their coefficients are modified so that their mutual relationship encodes one bit of information. Since the 18 predetermined frequencies are chosen from the middle range, this method will be less robust and more vulnerable to noise than the method of Cox et al. described below.
Zhao and Koch ("Towards Robust and Hidden Image Copyright Labeling," Proc. IEEE Workshop on Nonlinear Signal and Image Processing (Neos Marmaras, Halkidiki, Greece, Jun. 20-22, 1995)) describe another method, designed for black and white images, where the relative frequencies of black and white pixels in pseudo-randomly selected 8×8 blocks encode one bit of information. One advantage of this scheme is that no image transformation is involved. On the other hand, the method is vulnerable to collusion of several watermarked images. To overcome this problem, Zhao and Koch propose to choose distributed blocks instead of square blocks. However, this choice makes the method much more sensitive to noise.
M. Kutter et al. (Digital signature of color images using amplitude modulation, SPIE-EI97 Proceedings) describe a method for digitally signing color images that uses amplitude modulation. The authors hide the signature in the blue channel of a color image because the human visual system is least sensitive to the blue channel. Their method is clearly more sensitive to noise than are spread-spectrum techniques.
Cox et al. (Secure Spread Spectrum Watermarking for Multimedia, NEC Research Institute, Technical Report 95-10) introduced an extremely robust watermark that is based on the discrete cosine transform and modifies the low frequencies by a small amount. To recover the watermark, one needs the original image. The authors report that one can reliably extract the watermark from images after 5% JPEG compression! They also find their technique to be robust under resampling, dithering, cropping, and other common image manipulations. The watermark is also resistant to a collusion attack (combining multiple watermarked documents to remove the watermark).
The authors make the watermark very robust by inserting it into the low frequencies. That is, they make use of the relative insensitivity of the human visual system to small, gradual changes in intensity. The watermark is a sequence of 1000 samples from a normal distribution with zero mean and unit variance, $\{w_i\}$, encoded into the 1000 lowest-frequency coefficients $\{v_i\}$ of the discrete cosine transform using the formula $v'_i = v_i (1 + \alpha w_i)$, $i = 1, \ldots, 1000$.
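The embedding rule above can be sketched in a few lines. This is a simplified illustration and not Cox et al.'s implementation: it assumes a one-dimensional signal of 64 samples in place of an image, a hand-written orthonormal DCT, and a watermark of 16 values in place of 1000. All function names are hypothetical.

```python
import math
import random

def dct(x):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                for t in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(math.sqrt(2 / n) * c[k]
                 * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                 for k in range(1, n))
        out.append(s)
    return out

def embed(signal, w, alpha=0.1):
    """Scale the len(w) lowest-frequency coefficients by (1 + alpha * w_i)."""
    v = dct(signal)
    for i, wi in enumerate(w):
        v[i] = v[i] * (1 + alpha * wi)
    return idct(v)

rng = random.Random(1)
signal = [rng.uniform(0, 255) for _ in range(64)]  # stand-in for image data
w = [rng.gauss(0, 1) for _ in range(16)]           # zero mean, unit variance
marked = embed(signal, w)
```

Because the change to each coefficient is proportional to the coefficient itself, large coefficients carry proportionally larger modifications, and the constant alpha bounds the relative change.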
The constant $\alpha$ adjusts the magnitude of the modifications. In Cox et al.'s experiments, $\alpha$ was set to 0.1. To recover the watermark from a modified image, the modified image is first transformed by a discrete cosine transform to obtain modified coefficients $\{v_i^*\}$. The watermark $w^*$ extracted from the modified image is compared to the original watermark $w$ with a similarity classification function ##EQU1## A conclusion whether or not the modified image contains the watermark $w$ is made based on the value of sim. The authors describe several improvements that make the watermark extraction process more accurate by using robust statistics and by preprocessing $w^*$ before calculating sim. The watermark is remarkably robust with respect to analog-digital conversions, requantizing, copying and subsequent scanning, dithering, etc. In a more general scheme, instead of embedding the watermark in the lowest N = 1000 frequencies, the N frequencies are chosen from the M lowest frequencies, where $M > N$. To wipe out the watermark, one would have to randomize the amplitudes of all low frequencies by the maximum amount allowed by the algorithm. The result, however, would be visible deterioration of the image. The authors study the robustness with respect to collusion by averaging 5 watermarked images and testing the presence of each watermark in the image.
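The extraction and comparison step can be sketched as follows. Two points are assumptions, because the text above does not spell them out: the extraction inverts the embedding rule using the original coefficients, and the similarity function takes the normalized-correlation form, the dot product of $w^*$ and $w$ divided by the norm of $w^*$, that is commonly attributed to Cox et al. The coefficient values, the noise level, and the threshold of 6 are illustrative only.

```python
import math
import random

def extract(v_orig, v_mod, alpha=0.1):
    """Invert v' = v (1 + alpha w) coefficient by coefficient (needs the original)."""
    return [(vm / vo - 1) / alpha for vo, vm in zip(v_orig, v_mod)]

def sim(w, w_star):
    """Assumed similarity: (w* . w) / sqrt(w* . w*)."""
    dot = sum(a * b for a, b in zip(w_star, w))
    norm = math.sqrt(sum(a * a for a in w_star))
    return dot / norm

rng = random.Random(7)
n = 1000
v = [rng.uniform(10, 100) for _ in range(n)]    # hypothetical nonzero coefficients
w = [rng.gauss(0, 1) for _ in range(n)]         # the owner's watermark
other = [rng.gauss(0, 1) for _ in range(n)]     # an unrelated watermark

v_marked = [vi * (1 + 0.1 * wi) for vi, wi in zip(v, w)]
v_attacked = [vi + rng.gauss(0, 2.0) for vi in v_marked]  # simulated distortion

w_star = extract(v, v_attacked)
threshold = 6.0                                  # illustrative decision threshold
print(sim(w, w_star) > threshold, sim(other, w_star) > threshold)
```

For an unrelated watermark the similarity behaves like a sample from a standard normal distribution, so a large value is strong evidence that the tested watermark is present.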
The watermark is equivalent to overlaying a pattern spanned by N discrete cosines over the image. The watermark values are used directly as coefficients of that linear combination. The watermark may become visible (or at least detectable) in those areas of the carrier image which were originally uniform or had a uniform brightness gradient. (Quite a large percentage of images do contain such areas.)
The watermark cannot be readily removed because the discrete cosines do not generally form a set of linearly independent functions on proper subsets of the image. Nevertheless, one can mount an attack on the watermark. Let us assume that a square area containing K pixels reveals some approximation to the watermark. Since we know that the watermark is spanned by the lowest 1000 coefficients, we can write K equations and thereby narrow down the possibilities for the watermark sequence by a large margin. In the equation below, $f_r = (f_{1r}, f_{2r})$ denotes the rth two-dimensional frequency, and $A_r$ denotes the rth unknown coefficient. ##EQU2##
This information may be utilized to remove the watermark beyond detection.
Because none of these prior-art methods is foolproof, there exists a need for a digital watermark that is completely resistant to attack.