A watermark is an imperceptible, or at least difficult to perceive, signal embedded into multimedia content such as audio, video, text, still images, computer graphics, or software. The watermark conveys some useful information without disturbing or degrading the presentation of the content in a way that is noticeable or objectionable. Watermarking techniques play an important role in protecting copyright ownership of digital content, including images, audio, and video. Watermarks may be used to identify the original owner of the content, to trace where pirate copies of the content come from (fingerprinting), and to determine royalty payments by monitoring the number of times content has been used. Watermarks may also be used to authenticate original content and to locate changes in a corrupted or altered copy of the content. In order to encourage copyright owners to use watermarking schemes, four basic and conflicting requirements should be met. Firstly, the distortion introduced by embedding the watermarks into content should be imperceptible to regular users. Secondly, the watermarks should be secure, so that they are hard for pirates to modify or remove. Thirdly, the watermarks should be robust against intentional attacks, ranging from simple content manipulation, such as cropping, to common image processing techniques, such as filtering and compression. Lastly, the overall cost of using watermarking should not be high.
Watermarking schemes may be categorized as non-oblivious or oblivious, depending on whether the original content is available or not. Oblivious watermarking may be defined as a watermarking scheme in which the original image is not available during watermark decoding. Non-oblivious image watermarking schemes may in general be more robust due to the accessibility of the original image, because image distortions caused by image processing, transmission, or intentional attacks may be compensated for using the original image. Also, the interference between the original image and the watermarks during watermark decoding may be removed by using the difference of the watermarked and original images. However, for many applications, such as copy and playback control, and copyright protection, the requirement of accessing the original image is simply not practical. This may make oblivious watermarking the only choice. Watermarks may be embedded in the pixel or the transform domains. Two papers which discuss and compare different methodologies and watermarking schemes include “A fair benchmark for image watermarking systems”, by M. Kutter and F. A. P. Petitcolas, (SPIE Electronic Imaging'99: Security and Watermarking of Multimedia Contents, vol. 3657, January 1999), and “Comparing robustness of watermarking techniques” by J. Fridrich and M. Goljan (SPIE Electronic Imaging'99: Security and Watermarking of Multimedia Contents, vol. 3657, January 1999). Proposed transforms include DCT, DFT, LOT, wavelets, Hadamard transform and key-dependent transforms. The watermark signal in a transform domain may usually be related to that in the pixel domain by a linear transformation, if the transform itself is linear. However, the analysis may be applied to pixel-based approaches as well. Human visual models have been used to adjust watermark strength so that embedded watermarks may be invisible. Spread-spectrum techniques are used by most oblivious watermarking approaches.
When extracting the watermark message, these methods may rely on the watermark information embedded in the middle frequencies, although the noise-like watermark signal may also be embedded in the low and the high frequencies. The watermark information in the high frequencies may be easily removed using low-pass filtering and JPEG compression, and humans may be able to tolerate high distortion there. For low frequencies, watermark signals may have a high interference with the image itself. Note that the energy of a typical image may be concentrated in the lower frequencies.
For non-oblivious watermarking, adding watermarks in the low frequencies has been shown to have some advantages in a paper entitled “A review of watermarking and the importance of perceptual modeling”, by I. Cox and M. Miller, Proc. of the SPIE Human Vision and Electronic Imaging, vol. 3016, pp. 92-99, February 1997. More watermark messages may be sent while the noise level of the image does not increase. Watermarks in the low frequencies may in general be more robust than those in the middle frequencies with respect to image distortions that have low-pass characteristics, such as filtering. Examples of nonlinear filtering may include median filtering, lossy compression filtering, and adaptive Wiener filtering. Watermarks in the low frequencies may also be less sensitive to small geometric distortions (e.g., rotation, shifting, and scaling). Therefore, oblivious watermarking schemes that utilize the low frequencies, and distortion compensation techniques that do not require the original image, have become two active research topics.
Several watermark attack and counterattack methods have been proposed. To overcome a geometric attack, small blocks of a corrupted image may be registered with an original pseudo noise signal using correlation matching. Watermarks may also be removed by capturing watermark information pixel by pixel with a sensitivity attack if a pirate has access to a device that can detect whether the content contains a watermark or not.
To handle distortions without the original image, a calibration pattern may be embedded into the Fourier transform in log-polar coordinates, so that the shift, scaling, and rotation of the image may be compensated.
Some oblivious watermarking approaches using the low frequency bands have been proposed including embedding watermark information by swapping selected transform coefficients of 8×8 DCT blocks. The robustness of this type of approach may not be high and visible distortions may be introduced.
Another approach includes embedding watermark message bits into disjoint triplets of wavelet coefficients, which may be chosen according to a key-dependent random sequence. The middle coefficient may be quantized by a quantization step, which is equal to the difference of the largest and the smallest values of the triplet, divided by a fixed scale factor. This approach may not be applied to DCT coefficients, since the standard deviation of the DCT coefficients in low frequencies may typically be very high. This requires a large fixed scale factor, or equivalently a small quantization step, in order for the watermark to remain invisible. Therefore, the robustness has to be compromised. Similar quantization techniques have been proposed to embed a cartoon or map image into a host image.
Quantization with frequency and spatial masking to embed watermarks into DCT coefficients of 8×8 blocks has also been proposed. Watermarks using a small block size may not survive the distortions introduced by filtering with a large kernel. The suggested frequency masking model also becomes inaccurate for blocks larger than 16×16.
Yet another proposed method includes using quantization index modulation to embed a watermark message into a host image. Message bits are used to select among pre-defined quantizers. Theoretical results for some channel models have been discussed. However, no experiments on real distortions have been reported.
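By way of illustration only, the selection of a quantizer by a message bit may be sketched in Python as follows. The sketch assumes a scalar host sample and two uniform quantizers of step delta that are offset from each other by half a step. This is one common form of quantization index modulation and is not necessarily the configuration used in the method referred to above. The names qim_embed, qim_decode, and delta are hypothetical.

```python
def qim_embed(x, bit, delta):
    """Quantize host sample x with the quantizer selected by the message bit.

    Assumption for illustration: bit 0 uses the lattice {k * delta} and
    bit 1 uses the same lattice shifted by delta / 2.
    """
    offset = bit * delta / 2.0
    return round((x - offset) / delta) * delta + offset


def qim_decode(y, delta):
    """Return the bit whose quantizer has a reconstruction point closest to y."""
    d0 = abs(y - qim_embed(y, 0, delta))
    d1 = abs(y - qim_embed(y, 1, delta))
    return 1 if d1 < d0 else 0
```

Under these assumptions the embedding error is at most delta / 2 per sample, and a bit may be recovered as long as the added noise stays below delta / 4 in magnitude.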
Watermarks inserted in the middle and high frequencies may typically be very robust with respect to added noise, nonlinear deformations of the gray scale (e.g., contrast/brightness adjustment, gamma correction, histogram equalization), and cropping. Since these advantages are complementary to those of low-frequency techniques, and watermarks of low and middle frequencies are embedded into disjoint portions of the spectrum, Fridrich proposed to embed both low and high frequency watermarks into the image. To decode the hidden message in the low frequencies without the original image, binary mapping may be used. A (watermark) mapping function (also called an index function) may relate the watermarked transform coefficient to the watermark itself. Although it has been shown that the watermarks may be very robust to different types of distortion, there may be a serious security problem. The watermarks may be easily removed by clustering the DCT coefficients using a histogram attack, which may search for the parameters of the mapping function. If the intensities of some of the original pixels are guessed and if the basis function is known, then the watermarks may be estimated and the general watermark system fails. To overcome the histogram and the watermark-estimation attacks, it has been proposed that some secret key-dependent basis functions could be used. Although this scheme seems to be able to achieve better robustness to different distortions and high security against attacks, it may require very heavy computation and a relatively large amount of storage to generate the basis functions and to find the corresponding transform coefficients.
We will now discuss briefly an oblivious watermark approach, described in a paper by J. Fridrich, entitled “Combining low-frequency and spread spectrum watermarking”, Proc. SPIE Int. Symp. on Optical Science, Engineering Instrumentation, San Diego, July 1998, which uses a binary mapping function. A security problem will be disclosed using a histogram attack.
The oblivious low-frequency watermarking of Fridrich is described as follows. Let $f_o(p_j)$ be the intensity of an image at the j-th pixel $p_j = [x_j, y_j]^T$, $j \in \tilde{J}$, where $\tilde{J} = \{ j \mid j = 0, 1, \ldots, n_p - 1 \}$ consists of the indices of all $n_p$ pixels in a raster scan order. FIG. 9A shows some raster scan orders that may be used. The present invention may be practiced with any scan order, several of which are shown in FIGS. 9A, 9B, 9C, and 9D. Let $m(f_o)$ and $\sigma^2(f_o)$ be the sample mean and variance of $f_o$. The image may be normalized by the following transform so that its sample mean becomes zero and its discrete cosine transform (DCT) coefficients may fall into a pre-specified range:
$$ f(p_j) = \frac{1024}{\sqrt{n_p}} \, \frac{f_o(p_j) - m(f_o)}{\sigma(f_o)} \qquad (1) $$
Denote the original and watermarked DCT coefficients of $f$ by $v_i$ and $v_i'$, $i \in \tilde{I}$, where $\tilde{I} = \{ i \mid i = 0, 1, \ldots, n_w - 1 \}$ consists of the indices of the DCT coefficients in a zig-zag order. Then a binary watermark sequence $w_i$, $i \in \tilde{I}$, $w_i \in \{-1, 1\}$, may be embedded into $f$ by adjusting the amplitude of $v_i'$, so that the distortion between $v_i$ and $v_i'$ is minimized and
$$ w_i = M_0(|v_i'|) \qquad (2) $$
where the mapping function is
$$ M_0(v') = (-1)^{l} \ \text{ if } \ v' \in I_{1l} = [a^{l}, a^{l+1}), \qquad a = \frac{1+\alpha}{1-\alpha} > 1, \qquad \alpha > 0 \qquad (3) $$
If $v_i < 1$, $v_i' = v_i$. The above mapping function is called an index function. It can be shown that the maximum difference between $v_i$ and $v_i'$ is less than $\alpha |v_i|$. In order to maximize the robustness with respect to image distortions, $v_i'$ is chosen to be the midpoint of the interval $I_{1l}$. To survive some common lossy compression and low-pass filtering, the watermarks may be embedded in the perceptually significant frequency bands with high energy, and the amount of change of each transform coefficient may be proportional to the amplitude of the coefficient itself. The watermark encoding and decoding may be simplified if they are performed in a log-magnitude domain. Let $u_i = \ln |v_i|$, $u_i' = \ln |v_i'|$ and $\beta = \ln a$. The $l$-th interval in the log domain may be denoted by $I_{2l} = [l\beta, (l+1)\beta)$. The index of the interval in which $u$ is located may be determined by a locating function
$$ l(u) = \left\lfloor \frac{u}{\beta} \right\rfloor $$
The watermark may be generated by the following mapping function
$$ w_i = M_1(u_i') = (-1)^{l(u_i')} \qquad (4) $$
and by assigning $u_i' = q(u_i')$, where $q(u) = (l(u) + 0.5)\beta$ is the quantization function. More specifically, if $(-1)^{l(u_i)} = w_i$, then $u_i' = q(u_i)$. Otherwise, $u_i'$ may be equal to either $q(u_i) + \beta$ or $q(u_i) - \beta$, depending on which is closer to $u_i$.
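By way of illustration only, the log-domain embedding and decoding described above may be sketched in Python for a single coefficient. The function names are hypothetical. The sketch assumes that coefficients with magnitude below 1 are left unchanged and that the sign of the coefficient is preserved, since the mapping function operates on the magnitude only.

```python
import math


def log_step(alpha):
    """Interval width in the log domain: beta = ln a, a = (1 + alpha) / (1 - alpha)."""
    return math.log((1.0 + alpha) / (1.0 - alpha))


def parity_sign(l):
    """(-1)**l as an integer, valid for negative l as well."""
    return 1 if l % 2 == 0 else -1


def embed_coefficient(v, w, alpha):
    """Embed watermark value w (-1 or +1) into DCT coefficient v."""
    if abs(v) < 1.0:
        return v  # Assumption: magnitudes below 1 are left unchanged.
    beta = log_step(alpha)
    u = math.log(abs(v))
    l = math.floor(u / beta)   # locating function l(u)
    q = (l + 0.5) * beta       # quantization function q(u): interval midpoint
    if parity_sign(l) != w:
        # Wrong parity: move to whichever neighbouring midpoint is closer to u.
        q = q + beta if (q + beta) - u <= u - (q - beta) else q - beta
    # Assumption: the sign of the original coefficient is kept.
    return math.copysign(math.exp(q), v)


def decode_coefficient(v, alpha):
    """Estimate the watermark value as M1(u) = (-1)**l(u), with u = ln|v|."""
    return parity_sign(math.floor(math.log(abs(v)) / log_step(alpha)))
```

Because each watermarked log-magnitude sits at the midpoint of its interval, the decoded value in this sketch is unchanged by any multiplicative disturbance of the coefficient smaller than a factor of exp(beta / 2).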
During watermark decoding, the watermark may be estimated from the received DCT coefficient $u_i''$ as $\hat{w}_i = M_1(u_i'')$, where
$$ u_i'' = u_i' + n_i \qquad (5) $$
and $n_i$ is the noise. Then the watermark sequence may be determined by using the following correlation function
$$ \mathrm{corr} = \max_{s \in (1 - \Delta_s,\ 1 + \Delta_s)} \frac{\sum_{i \in \tilde{I}} |v_i'|^{\gamma} \, \hat{w}_i(s\,|v_i'|) \, w_i}{\sum_{i \in \tilde{I}} |v_i'|^{\gamma}} \qquad (6) $$
where the scale factor $s$ may be used to compensate for the change of variance due to image distortions, and the weighting factor $\gamma$ may be used to reduce the effect of small coefficients. The values $\Delta_s = 1/4$ and $\gamma = 1$ will be used in the disclosure of the present invention.
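By way of illustration only, the correlation function of equation (6) may be evaluated as in the following Python sketch. The maximization over the scale factor is approximated here by a uniform grid of candidate values, which is an assumption of the sketch and not a requirement of the method. The function name, the grid size, and the skipping of zero-valued coefficients are likewise hypothetical.

```python
import math


def detect_correlation(received, watermark, alpha, delta_s=0.25, gamma=1.0, steps=51):
    """Approximate Eq. (6) on a grid of scale factors s in (1 - delta_s, 1 + delta_s)."""
    beta = math.log((1.0 + alpha) / (1.0 - alpha))  # log-domain interval width
    best = -1.0
    for k in range(1, steps + 1):
        # Interior grid points only, so the interval endpoints are excluded.
        s = 1.0 - delta_s + 2.0 * delta_s * k / (steps + 1)
        num = 0.0
        den = 0.0
        for v, w in zip(received, watermark):
            mag = abs(v)
            if mag == 0.0:
                continue  # Assumption: zero coefficients carry no information.
            l = math.floor(math.log(s * mag) / beta)
            w_hat = 1 if l % 2 == 0 else -1   # estimate from the rescaled magnitude
            weight = mag ** gamma
            num += weight * w_hat * w
            den += weight
        if den > 0.0:
            best = max(best, num / den)
    return best
```

With the default grid of 51 points and delta_s = 0.25, the value s = 1 is included, so an undistorted watermarked sequence yields a correlation of 1 in this sketch.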
Combined with mid-frequency watermarking using the spread spectrum technique, the above binary watermarking has been shown by Fridrich to be robust against many attacks. However, a serious security problem arises. Since the watermarked coefficients $u_i'$ are always located in the middle of quantization intervals of a fixed size, a pirate may search for the correct quantization step using a histogram attack. Once the quantization step is found, the watermarks may be modified or removed. The histogram may be formed from the quantized DCT coefficients with a guessed quantization step size. For the correct step size, a peak will be present in the middle of the quantization interval.
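By way of illustration only, such a histogram attack may be sketched in Python as follows. The sketch assumes that the pirate works on the log-magnitudes of the coefficients, tries a list of candidate step sizes, and scores each candidate by the fraction of coefficients that fall into the central bin of their guessed interval. The function names, the number of bins, and the scoring rule are hypothetical.

```python
import math


def midpoint_peak_score(coeffs, step, bins=21):
    """Fraction of log-magnitudes that fall in the central bin of their guessed interval."""
    hist = [0] * bins
    for v in coeffs:
        if abs(v) < 1.0:
            continue  # Assumption: magnitudes below 1 were never watermarked.
        frac = (math.log(abs(v)) / step) % 1.0   # position inside the guessed interval
        hist[min(int(frac * bins), bins - 1)] += 1
    total = sum(hist)
    return hist[bins // 2] / total if total else 0.0


def histogram_attack(coeffs, candidate_steps, bins=21):
    """Return the candidate log-domain step with the strongest midpoint peak."""
    return max(candidate_steps, key=lambda step: midpoint_peak_score(coeffs, step, bins))
```

An odd number of bins is used so that one bin is centered on the interval midpoint. The candidate range matters in this sketch, because a step equal to the true step divided by an odd integer also places every coefficient at an interval midpoint.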
Fridrich has observed this problem. He also discussed the security problem faced under the watermark-estimation attack. If the original intensity of some pixels of a watermarked image can be guessed, then the watermarks may be estimated and removed by solving a system of linear equations. To address both security problems, Fridrich proposed the use of key-dependent basis functions. He also demonstrated that his approach was quite robust to common distortions. However, his approach requires heavy computation to generate the transform functions and to perform the forward or inverse transforms. To provide an alternative, the present invention will disclose a new class of mapping functions, which may require only simple operations. These mapping functions may be controlled by a secret key. To combat the watermark-estimation attacks, some counter-attacks will also be disclosed.
What is needed is a simple and effective scheme to enhance the security and robustness of a low-frequency watermarking scheme that protects the watermarks by using a secret (watermark) mapping function instead of a secret transform basis function. The scheme should also reduce the interference between the watermarks and the image itself by using a key-dependent quantization function. The scheme should also be general enough that it may be applied to pixel-domain watermarking schemes. To combat the watermark-estimation attack, a simple counterattack is also needed, so that the use of key-dependent basis functions is not required.