A watermark is a special-purpose, low-level identifier signal added to another content signal, typically for purposes such as digital rights management, authentication, identification and tagging, source location, steganographic data or command hiding, broadcast tracking, and tamper detection. Thus, watermarking is the embedding of identifier information in a content signal. The “carrier” signal or content signal may be audio, an image, video, enriched text data, and the like. Usually, the watermark is designed to be essentially imperceptible, yet reliably detectable by signal processing.
A typical method to detect a watermark is through use of a matched filter or correlation detector. Samples are taken of the watermarked signal, and each sample is multiplied by the corresponding sample of the watermark pattern. The products are then summed. If the sum of products is large, for example above a chosen threshold, the watermark pattern is detected.
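The sum-of-products detection described above can be sketched as follows. This is a minimal illustration only: the +/-1 pattern, the Gaussian stand-in for content, the embedding strength, and the threshold are all assumptions chosen for the example and are not taken from any particular watermarking system.

```python
import numpy as np

def correlation_detect(signal, pattern, threshold):
    """Return (score, detected) for a sum-of-products correlation detector."""
    signal = np.asarray(signal, dtype=float)
    pattern = np.asarray(pattern, dtype=float)
    score = float(np.sum(signal * pattern))  # multiply sample by sample, then sum
    return score, score > threshold

rng = np.random.default_rng(0)
pattern = rng.choice([-1.0, 1.0], size=4096)  # hypothetical +/-1 watermark pattern
content = rng.normal(0.0, 5.0, size=4096)     # stand-in for image or audio samples
marked = content + 1.0 * pattern              # additive embedding at low strength

# Assumed threshold: half the score (4096) that the pattern alone contributes.
threshold = 2048.0
score_marked, found_marked = correlation_detect(marked, pattern, threshold)
score_clean, found_clean = correlation_detect(content, pattern, threshold)
```

Embedding the pattern raises the score by exactly the pattern's energy (4096 here), while the content alone contributes only a comparatively small random term, which is why a threshold can separate the two cases.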
Previous attempts have been made to create robust watermarks for copy protection of audio-visual data that are resistant to efforts to overcome such watermarks and copy the accompanying data. Some of these attempts involve the use of log-polar coordinates and Fourier and Mellin transforms for image processing and registration. Data can be converted from normal Cartesian coordinates to log-polar coordinates through a known algorithm, which typically transforms the Cartesian coordinates (x, y) to polar coordinates (R, θ) and then to log-polar coordinates (L, θ) by taking the log of the radius R.
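The two-step conversion described above can be sketched for a single point as follows. The function name is hypothetical, and the natural logarithm is an assumption, since the text does not specify a base.

```python
import math

def cartesian_to_log_polar(x, y):
    """Map Cartesian (x, y) to log-polar (L, theta), where L = log of the radius."""
    r = math.hypot(x, y)              # step 1: polar radius R
    if r == 0.0:
        raise ValueError("log-polar coordinates are undefined at the origin")
    theta = math.atan2(y, x)          # step 1: polar angle in radians
    return math.log(r), theta         # step 2: take the log of the radius

# Scaling a point by a factor s shifts L by log(s) and leaves theta unchanged.
l_a, theta_a = cartesian_to_log_polar(3.0, 4.0)
l_b, theta_b = cartesian_to_log_polar(6.0, 8.0)   # same point scaled by 2
```

In these coordinates a resize becomes a shift along L and a rotation becomes a shift along θ, which is what makes the mapping useful for registration.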
Unfortunately, these existing watermarking systems have various drawbacks. First, they require that the watermark be embedded in a particular transform domain for it to be resistant to geometric transformations. This limits the flexibility in the design of the watermark, and so these techniques cannot be incorporated into a previously designed watermarking system to improve it.
Second, these watermarking systems have been criticized for only being robust to geometric transformations, but not to other attacks such as noise addition. The technical reason is that these systems obtain geometric robustness by embedding the watermark in the magnitude of a Fourier transform. This transform magnitude is invariant to spatial shifts in the input to the transform, but it is easy to modify and attack. On the other hand, it is well known in image processing that the phase of the Fourier transform of an image contains most of the information in the image. It is possible to completely change the magnitude of the transform, inverse transform the magnitude and phase, and still see much of the content of the original image.
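The shift invariance of the Fourier magnitude can be checked numerically with a short sketch. It assumes a discrete Fourier transform and a circular shift, for which the invariance is exact; the random array is only a stand-in for an image.

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.random((32, 32))                           # stand-in for image pixels
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))   # circular spatial shift

spectrum_a = np.fft.fft2(image)
spectrum_b = np.fft.fft2(shifted)

# The magnitudes agree, but the complex spectra (and hence the phases) do not.
magnitude_unchanged = bool(np.allclose(np.abs(spectrum_a), np.abs(spectrum_b)))
spectrum_unchanged = bool(np.allclose(spectrum_a, spectrum_b))
```

The shift appears only in the phase of the transform, which is why a watermark carried by the magnitude survives translation.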
Other approaches embed simple patterns, or complete watermarks, at known positions in images or video frames, and then detect these patterns or watermarks and their positions to compute and account for any rotation, resizing, or other geometric alteration. A system that uses these approaches may not require any frequency transforms or log-polar mappings. Thus, the general idea appears attractive. Also, helper patterns or watermarks can be added to a pre-existing watermark, as long as they do not interfere. In this way, geometric robustness can be added to a pre-existing watermark system.
Unfortunately, such approaches also have drawbacks. For example, there is a tradeoff between robustness and ease of detection of the helper patterns or watermarks. A simple helper pattern may be easy to detect even after it has been geometrically altered—this makes geometric robustness for the main watermark easy to attain, but also makes the helper pattern easy to find and attack. Then, the robustness of the main watermark to geometric manipulations is defeated.
On the other hand, the helper patterns may actually be watermarks, which are harder to remove, but also much harder to detect. If the helper watermarks are arbitrary, exhaustive searches may be required over some geometric parameters (such as the angle of rotation). In a sense, such systems merely shift the problem of robustness from the parent watermark to the helper patterns or watermarks.
If any geometric modification of image or video frames occurs after watermark embedding, an unassisted watermark detector or decoder will not correctly discern the watermark. The most common such modifications are resizing, spatial translation or shift, rotation, shear, and cropping. Combinations of the first four types produce all so-called affine geometric transformations. Cropping can be used to trim an image to a particular size after an affine transformation, and to avoid blank areas near the edges of the image frame. These modifications can occur as part of normal processing, such as video format conversion, or may be applied maliciously by a hacker wishing to defeat the watermark.
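As an illustration of how resizing, rotation, shear, and translation combine into a single affine transformation, the sketch below composes them as 3x3 matrices in homogeneous coordinates. The helper names, the parameter values, and the order of composition are arbitrary choices made for the example.

```python
import numpy as np

def scale(sx, sy):
    return np.array([[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]])

def rotate(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def shear(k):
    return np.array([[1.0, k, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

def translate(tx, ty):
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

# Applied right to left: resize, then rotate, then shear, then shift.
affine = translate(4.0, -2.0) @ shear(0.2) @ rotate(np.pi / 6) @ scale(1.5, 1.5)

point = np.array([1.0, 0.0, 1.0])   # pixel position (x, y) = (1, 0)
moved = affine @ point              # position of that pixel after the transformation
```

However many such steps are chained, the product keeps the bottom row (0, 0, 1), so the combined modification is fully described by six parameters.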
In particular, in classical matched filter detectors, if an input frame is geometrically altered, then the embedded watermark will not align with the “reference” watermark pattern known to the detector, and the detector may fail.
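The loss of alignment can be seen in a small numerical sketch. It assumes a random reference pattern and, for clarity, omits the content signal, so the scores reflect the watermark alone; a translation of three samples stands in for a geometric alteration.

```python
import numpy as np

rng = np.random.default_rng(2)
reference = rng.normal(size=4096)   # hypothetical reference pattern known to the detector
aligned = reference.copy()          # embedded watermark with no geometric change
shifted = np.roll(reference, 3)     # same watermark after a 3-sample translation

score_aligned = float(np.dot(aligned, reference))   # sum of products, aligned
score_shifted = float(np.dot(shifted, reference))   # sum of products, misaligned
```

The aligned score equals the pattern's energy, whereas the misaligned score collapses toward zero because the shifted samples are essentially uncorrelated with the reference.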
One solution for increasing the efficiency and efficacy of watermark embedding and detection is through the use of a wavelet domain for embedding watermarks, such as described, for example, in U.S. Pat. No. 5,930,369 to Cox, et al., and entitled SECURE SPREAD SPECTRUM WATERMARKING FOR MULTIMEDIA DATA. However, even watermarks embedded in a wavelet domain may have inadequate robustness against geometric alterations to watermarked data. This and other difficulties created by wavelet domain watermark embedding are discussed below.
Generally speaking, a wavelet or sub-band transform divides an image into spatial frequency bands (generally, wavelets use octave bands, whereas sub-band transforms can have almost arbitrary band divisions). The low-frequency bands contain smooth areas and large shapes from the image, whereas the high-frequency bands contain detail such as lines, edges, and small spots. The human eye analyzes scenes in a similar way. Therefore, wavelets and sub-bands provide control when embedding a watermark into images and video, since they provide flexibility in embedding the watermark to maximize robustness or payload, while typically minimizing visibility of the watermark once embedded.
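A one-level band split can be sketched with the Haar wavelet, the simplest case. This is an illustrative, unnormalized averaging form that assumes even image dimensions; the band names follow one common convention and are not taken from the text.

```python
import numpy as np

def haar_level(image):
    """Split an image with even dimensions into four half-size bands."""
    a = np.asarray(image, dtype=float)
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0   # low-pass along each row
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0   # high-pass along each row
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0  # smooth areas and large shapes
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0  # detail that varies down the columns
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0  # detail that varies along the rows
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0  # diagonal detail
    return ll, lh, hl, hh

flat = np.full((8, 8), 7.0)                # perfectly smooth image
edge = np.zeros((8, 8))
edge[:, 3:] = 1.0                          # vertical edge between columns 2 and 3

flat_ll, flat_lh, flat_hl, flat_hh = haar_level(flat)
edge_ll, edge_lh, edge_hl, edge_hh = haar_level(edge)
```

The smooth image lands entirely in the low-frequency band, while the vertical edge produces a response in one high-frequency band only, which is the separation an embedder can exploit when choosing where to place a watermark.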
However, embedding a watermark in a frequency band is similar to frequency-shifting the watermark to the center frequency of a band. This can make the original pattern highly oscillatory, and, unfortunately, this makes geometric robustness more difficult to attain.
In the inventor's prior U.S. patent application Ser. No. 09/802,244, entitled METHOD TO DETECT WATERMARK RESISTANT TO RESIZING AND TRANSLATION, filed on Mar. 8, 2001, assigned to the assignee of the present application and incorporated by reference in its entirety in the present application, one solution to the above-described problem is provided. In particular, the method of that application provides that the watermark patterns are added directly to the image or video pixels. As a result, geometric robustness depends on the simple structure of the patterns.
In addition, in the inventor's prior U.S. patent application Ser. No. 10/162,838, entitled METHOD AND APPARATUS TO DETECT WATERMARK THAT ARE RESISTANT TO RESIZING, ROTATION AND TRANSLATION, filed on Jun. 5, 2002, assigned to the assignee of the present application and incorporated by reference in its entirety in the present application, a further robustness solution is described. Specifically, that application provides for robustness of a watermark embedded in data that was subject to varying combinations of resizing, shift, and rotation. However, the solution of that application is incomplete with respect to broader geometric transformations such as general affine geometric transformations, including, for example, shear.
Following these methods, in order to detect embedded watermarks, filtering of watermarked data is employed. Then, Fourier transforms are used to extract a repetitive “grid” of blocks in which the watermark to be extracted is embedded. The size, position, and orientation of the grid are computed from the locations and phases of the significant peaks in the summed output of the Fourier transforms. From the coordinates of the grid recovered from the transformed watermark data, the original watermark signal can thus be obtained.
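The role of peak locations and phases can be illustrated with a simplified one-dimensional sketch. It is not the method of the cited applications: it assumes a noise-free signal with no content, a block that repeats exactly, and, for the shift estimate only, a reference copy of the unshifted signal. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
block = rng.normal(size=16)             # one hypothetical watermark block
reference = np.tile(block, 8)           # repetitive "grid": 8 blocks, 128 samples
received = np.roll(reference, 5)        # same grid after a 5-sample translation
n = received.size

# Peak locations: a block that repeats every P samples puts energy only at
# multiples of n / P, so the lowest significant bin gives the block size.
magnitude = np.abs(np.fft.fft(received))
significant = np.flatnonzero(magnitude[1:n // 2] > 1e-6 * magnitude.max()) + 1
fundamental = int(significant[0])
period = n // fundamental

# Peak phases: a shift by d samples rotates the phase of bin k by -2*pi*k*d/n.
cross = np.fft.fft(received)[fundamental] * np.conj(np.fft.fft(reference)[fundamental])
phase = float(np.angle(cross))
shift = int(round(-phase * n / (2.0 * np.pi * fundamental))) % period
```

In this toy case the peak location recovers the block size of 16 and the peak phase recovers the 5-sample shift. A two-dimensional version would additionally read the grid orientation from the direction of the peaks.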
A desirable feature of these prior methods is that the geometry estimation for the repetitive “grid” of blocks is independent of the payload embedded in the watermark itself. Otherwise, a more difficult problem might result, wherein the watermark pattern and the geometric parameters might have to be estimated jointly.