1. Field of the Invention
The present invention relates generally to watermarking of digital images and, more particularly, to methods for inserting a watermark into, and detecting a watermark from, digital data, which methods are resilient to rotation, scale and/or translation of the images.
2. Prior Art
There has been much emphasis on the robustness of watermarks to common signal processing operations such as compression and signal filtering. However, it has recently become clear that even very small geometric distortions can prevent the detection of a watermark. This problem is most pronounced when the original unwatermarked image is unavailable to the detector. Conversely, if the original image is available to the detector, then the watermarked image can often be registered to the original and the geometric distortion thereby inverted. A public watermark requires that detection of the watermark be performed without access to the original unwatermarked image. As such, it is not possible to invert the geometric distortion based on registration of the watermarked and original images.
Before proceeding further, it is important to define what is meant by the geometric distortions of rotation, scale and translation. Specifically, we are interested in the situation in which a watermarked image undergoes an unknown rotation, scale and/or translation prior to the detection of the watermark. The detector should detect the watermark if it is present. This definition is somewhat obvious, so it may be more useful to describe what we are not interested in. In particular, some watermark algorithms claim robustness to scale changes by first embedding a watermark at a canonical scale, then changing the size of the image and finally, at the detector, scaling the image back to the canonical size prior to correlation. In our opinion, the detector in such a test does not see a scale change. Rather, the process is more closely approximated by the low-pass filtering operation that occurs when the image is reduced in size. Similarly, tests that rotate an image by some number of degrees and subsequently rotate the image by the same amount in the opposite direction are not adequate tests of robustness to rotation. The same is true for translation. The common situation we are concerned with occurs when a watermarked image is printed and then cropped or padded and scanned back into the digital domain. In these circumstances, the image dimensions have changed because of cropping and possibly also scaling. There is also likely to be an associated translational shift. In this example, scaling to a canonical size does not undo the scaling. Rather, if the cropping is not symmetric in both the rows and columns, then scaling to a canonical size will result in a change in the aspect ratio of the image. Changes in aspect ratio are not addressed herein.
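The effect of rescaling an asymmetrically cropped image to a canonical size can be checked with simple arithmetic. The following sketch uses hypothetical dimensions (a 600 x 400 canonical size and a crop that removes 100 columns and no rows) and a hypothetical helper, `canonical_rescale_factors`; none of these values come from the description above.

```python
def canonical_rescale_factors(size, canonical):
    """Per-axis scale factors needed to resize `size` (width, height) to `canonical`."""
    return canonical[0] / size[0], canonical[1] / size[1]

canonical = (600, 400)   # hypothetical canonical (width, height)
cropped = (500, 400)     # 100 columns cropped away, no rows cropped

# Horizontal and vertical factors applied when resizing back to canonical size.
sx, sy = canonical_rescale_factors(cropped, canonical)

# A ratio different from 1 means the aspect ratio of the content has changed.
aspect_change = sx / sy
```

Because the two factors differ, the rescaled content is stretched horizontally relative to the original, which is the change in aspect ratio noted above.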
One strategy known in the art for detecting watermarks after geometric distortion is to try to identify what the distortions were, and invert them before applying the watermark detector. Methods have been developed in the prior art to accomplish this by embedding a registration pattern along with the watermark.
One problem with this solution is that it requires the insertion and detection of two watermarks, one for registration and one to carry the data payload. Thus, this approach is more likely to reduce the image fidelity. A second problem arises because all images watermarked with this method will share a common registration watermark. This fact may facilitate collusion attempts to discern the registration pattern and, once found, the registration pattern could be removed from all watermarked images, thus preventing the inversion of any geometric distortions.
Another way to implement the above strategy is to give the watermark a recognizable structure. For example, the watermark might be encoded with a small, rectangular pattern, and embedded several times in the image in a tiled grid. Then, regardless of the watermark pattern, the grid structure can be recognized by looking at the autocorrelation function of the image, which would contain a corresponding grid of peaks. These peaks can be analyzed to identify any affine distortions.
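The grid of autocorrelation peaks produced by a tiled pattern can be illustrated numerically. The following is a minimal sketch, not the method of any particular prior-art system: it assumes a random +/-1 tile of 16 x 16 samples repeated 8 x 8 times, omits the host image entirely, and computes a circular autocorrelation through the FFT. The function names, sizes and the 0.9 threshold are all hypothetical.

```python
import numpy as np

def circular_autocorrelation(image):
    """Circular autocorrelation via the power spectrum (Wiener-Khinchin relation)."""
    spectrum = np.fft.fft2(image - image.mean())
    return np.real(np.fft.ifft2(np.abs(spectrum) ** 2))

def tiled_watermark(tile_size=16, tiles=8, seed=0):
    """Repeat one random +/-1 tile on a regular grid (hypothetical pattern)."""
    rng = np.random.default_rng(seed)
    tile = rng.choice([-1.0, 1.0], size=(tile_size, tile_size))
    return np.tile(tile, (tiles, tiles))

def grid_peak_offsets(autocorr, threshold=0.9):
    """Return (row, col) lags whose value is close to the zero-lag value."""
    peaks = np.argwhere(autocorr >= threshold * autocorr[0, 0])
    return [tuple(int(v) for v in p) for p in peaks]

wm = tiled_watermark()
ac = circular_autocorrelation(wm)
offsets = grid_peak_offsets(ac)   # lags at multiples of the tile size
```

In this idealized setting the peaks fall at lags that are multiples of the tile size in each direction. In a real image the host content also contributes to the autocorrelation, so the peaks would be weaker, and an affine distortion of the image would distort the grid of peaks correspondingly.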
The methods of the present invention apply an alternative strategy, which is based on developing a watermark that is invariant to geometric distortions, so that there is no need to identify and invert them. In particular, the methods of the present invention are concerned with distortions due to rotation, scale, and/or translation (RST). While these geometric distortions have recently become of interest to the watermarking community, they have long been of interest to the pattern recognition community. The pattern recognition literature is well known in the art, and as such a comprehensive discussion of it is not provided herein. That literature describes the use of moment invariants for visual pattern recognition of planar geometric figures. It has been shown that these classic moment invariants are equivalent to the radial moments of circular-harmonic functions (CHFs) that arise from a Mellin transform of the log-polar representation of an image when the complex Mellin radial frequency, s, is a real integer s ≥ 1.
The Fourier-Mellin transform is closely related to the algorithm described in these pattern recognition methods of the prior art. There are a variety of related ideas from pattern recognition. For instance, the signal-to-noise ratio of the correlation peak between two images decreases from 30 dB to 3 dB with either a 2% scale change or a 3.5° rotation. To address this sensitivity, some have proposed what is essentially a hybrid opto-electronic implementation of the Fourier-Mellin transform. Others have discussed implementation issues related to the discrete Fourier-Mellin transform. These include interpolation, aliasing, and spectral border effects, which are discussed in detail below.
Still others have described a conformal-log mapping that is very closely related to the Fourier-Mellin transform. And still yet, others have discussed the use of the Fourier-Mellin and other transforms for pattern recognition.
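The Fourier-Mellin transform is commonly described as taking the magnitude of the Fourier transform, which discards translation, and then resampling that magnitude on a log-polar grid, where scaling and rotation become shifts. The following minimal sketch checks these two properties numerically under simplifying assumptions: translation is modeled as a circular shift, and the log-polar property is checked for a single point rather than for a resampled image, so the interpolation and aliasing issues mentioned above do not arise. The function names and test values are hypothetical.

```python
import numpy as np

def magnitude_spectrum(image):
    """Magnitude of the 2-D discrete Fourier transform."""
    return np.abs(np.fft.fft2(image))

def log_polar_coords(u, v):
    """Map a plane point (u, v) to (log-radius, angle in radians)."""
    return np.log(np.hypot(u, v)), np.arctan2(v, u)

# 1) A circular translation changes only the phase of the DFT, not its magnitude.
rng = np.random.default_rng(1)
image = rng.random((32, 32))
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))
translation_invariant = np.allclose(magnitude_spectrum(image),
                                    magnitude_spectrum(shifted))

# 2) Scaling by sigma and rotating by alpha adds constants in log-polar coordinates.
sigma, alpha = 1.5, np.deg2rad(20.0)
u, v = 3.0, 4.0
u2 = sigma * (u * np.cos(alpha) - v * np.sin(alpha))
v2 = sigma * (u * np.sin(alpha) + v * np.cos(alpha))
rho1, theta1 = log_polar_coords(u, v)
rho2, theta2 = log_polar_coords(u2, v2)
log_radius_shift = rho2 - rho1   # equals log(sigma)
angle_shift = theta2 - theta1    # equals alpha (no wrap-around for these values)
```

A shift-invariant operation applied along both log-polar axes would then remove the dependence on scale and rotation as well; which operation is used, and how the discrete resampling is carried out, are implementation choices this sketch does not address.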
These methods discuss a number of absolute or strong invariants based on the phase of the Fourier or Fourier-Mellin spectra. The terms "absolute" and "strong" refer to the fact that all information about an image except that of position, orientation or scale is preserved. This may be important for recognition tasks, especially if the library of objects is large. While some of those in the art discuss this issue in more detail, we do not believe that strong invariants are necessary for watermarking applications.
It is important to realize that watermark detection is different from the general problem of recognizing an object. First, an N-bit watermark only requires recognition of N independent patterns. Since N is typically between 32 and 64, this is considerably smaller than a practical object recognition database. Second, the watermark is not a naturally occurring object but is artificially inserted into an image. As such, the watermark can be designed to be easily represented. In particular, it is often advantageous to represent the watermark as a one-dimensional projection of the image space. If properly designed this has the benefit of reducing a two-dimensional search to one dimension, thereby significantly reducing the computational cost. Finally, since the set of watermarks is small (compared with the number of naturally occurring objects in a scene) and artificially created, it is not necessary that the image transform be strongly invariant as it is not as important to be able to reconstruct the image modulo rotation, scale, and/or translation from the parameterization.
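The computational benefit of a one-dimensional projection can be illustrated with a small sketch. It assumes, purely for illustration, a two-dimensional array indexed by log-radius and angle in which a rotation appears as a cyclic shift along the angle axis; summing over the log-radius axis then leaves a one-dimensional signal, and the unknown rotation can be found by searching over one-dimensional cyclic shifts only. The array sizes, the random data and the function names are hypothetical and are not taken from the description above.

```python
import numpy as np

def project_to_angle(log_polar):
    """Sum a (log-radius x angle) array over the log-radius axis."""
    return log_polar.sum(axis=0)

def best_cyclic_shift(signal, reference):
    """Exhaustive 1-D search for the cyclic shift of `reference` that best matches `signal`."""
    scores = [float(np.dot(signal, np.roll(reference, s)))
              for s in range(len(reference))]
    return int(np.argmax(scores)), scores

rng = np.random.default_rng(2)
log_polar = rng.standard_normal((24, 90))   # hypothetical 24 radii x 90 angle bins
rotated = np.roll(log_polar, 17, axis=1)    # rotation modeled as a shift of 17 angle bins

reference = project_to_angle(log_polar)
shift, scores = best_cyclic_shift(project_to_angle(rotated), reference)
```

Here the search evaluates 90 candidate shifts, one per angle bin, rather than the 24 x 90 candidates a search over both axes would require.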
Others in the art first suggested a watermarking method based on the Fourier-Mellin transform. However, they note very severe implementation difficulties, which have likely hampered further work in this area. They choose to use a transformation that is strongly invariant, claiming that it is more convenient to use strong invariants because the last stage of embedding a mark involves inverting the invariant representation to obtain the watermarked image.
We believe that invertibility is not essential. Following the formulation of those who first suggested that watermarking be viewed as communications with side information at the transmitter, suppose we have a non-invertible extraction function, $X(C)$, that maps a piece of media, $C$, into an extracted signal. Such a function would be used as part of a detection strategy. Given a watermark vector, $w$, and an original piece of media, $C_0$, we can often define an embedding function, $Y(w, C)$, which finds a new piece of media, $C_w = Y(w, C_0)$, such that

$$X(C_w) = X(Y(w, C_0)) \approx w \tag{1}$$

and $C_w$ is perceptually similar to $C_0$. In other words, the watermarked image looks like the original and the vector extracted during detection looks like the watermark vector. This function is sufficient for use in a watermark embedder.
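To make the roles of the extraction function X and the embedding function Y concrete, the following minimal sketch uses a stand-in extractor that is linear and many-to-one, and therefore not invertible. This linear extractor, the matrix `P`, the `strength` parameter and all sizes are assumptions chosen for illustration; they are not the extraction function of the present invention, and perceptual similarity is not modeled.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_features = 64, 8
P = rng.standard_normal((n_features, n_samples))   # hypothetical extraction matrix
P_pinv = np.linalg.pinv(P)                         # Moore-Penrose pseudo-inverse

def extract(media):
    """X(C): maps 64 samples to 8 values, so many media share one extracted vector."""
    return P @ media

def embed(watermark, media, strength=1.0):
    """Y(w, C0): least-norm change to the media that moves X(C) toward w."""
    residual = watermark - extract(media)
    return media + strength * (P_pinv @ residual)

c0 = rng.standard_normal(n_samples)            # stand-in for the original media
w = rng.choice([-1.0, 1.0], size=n_features)   # stand-in watermark vector
cw = embed(w, c0)
extracted = extract(cw)                        # matches w when strength is 1.0

# Non-invertibility: adding any null-space component of P leaves X unchanged.
null_part = (np.eye(n_samples) - P_pinv @ P) @ rng.standard_normal(n_samples)
other = cw + null_part
```

The embedder never inverts the extractor; it only needs to find some media near the original whose extracted vector is close to the watermark. Choosing a `strength` below 1.0 would trade a weaker match against a smaller change to the media.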
There have been a number of other recent watermarking algorithms designed to deal with geometric distortions. Of particular note is the recent work that describes an algorithm based on the detection of salient features in an image and the insertion of signals relative to these salient features. Experimental results indicate that the method is robust to mirror reflection and rotation but fails to survive other distortions. A somewhat related set of methods has been described by others. These methods are based on geometrically warping local regions of an image onto a set of random lines. At present, however, these methods are not themselves robust to geometric distortions; rather, they allow for a rapid but exhaustive search through the set of possible geometric distortions.