The present invention relates to digital watermarking of data including image, video and multimedia data. Specifically, the invention relates to insertion and detection or extraction of embedded signals for purposes of watermarking, in which the insertion and detection procedures are applied to sums of subregions of the data. When these subregions correspond to the 8×8 pixel blocks used for MPEG and JPEG compression and decompression, the watermarking procedure can be tightly coupled with these compression algorithms to achieve very significant savings in computation. The invention also relates to the insertion and detection of embedded signals for the purposes of watermarking, in which the watermarked data might have undergone distortion between the times of insertion and detection of the watermark.
The proliferation of digitized media such as image, video and multimedia is creating a need for a security system that facilitates the identification of the source of the material.
Content providers, i.e. owners of works in digital data form, have a need to embed signals into video/image/multimedia data, which can subsequently be detected by software, and/or hardware devices for purposes of authentication of copyright ownership, and copy control and management.
For example, a coded signal might be inserted in data to indicate that the data should not be copied. The embedded signal should preserve the image fidelity, be robust to common signal transformations and resistant to tampering. In addition, consideration must be given to the data rate that can be provided by the system, though current requirements are relatively low, a few bits per frame.
In U.S. patent application Ser. No. 08/534,894, filed Sep. 28, 1995, entitled "Secure Spread Spectrum Watermarking for Multimedia Data", which is incorporated herein by reference, there was proposed a spread spectrum watermarking method which embedded a watermark signal into perceptually significant regions of an image for the purposes of identifying the content owner and/or possessor. A strength of this approach is that the watermark is very difficult to remove. In fact, this method only allows the watermark to be read if the original image or data is available for comparison. This is because the original spectrum of the watermark is shaped to that of the image through a non-linear multiplicative procedure, and this spectral shaping must be removed prior to detection by matched filtering. In addition, the watermark is usually inserted into the N largest spectral coefficients, the ranking of which is not preserved after watermarking. This method does not allow software and hardware devices to directly read embedded signals without access to the original unwatermarked material.
In an article by Cox et al., entitled "Secured Spectrum Watermarking for Multimedia", available at http://www.neci.nj.nec.com/tr/index.html (Technical Report No. 95-10), spread spectrum watermarking is described which embeds a pseudo-random noise sequence into the digital data for watermarking purposes.
The above prior art watermark extraction methodology requires that the original image spectrum be subtracted from the watermarked image spectrum. This restricts the use of the method when there is no original image or original image spectrum available to the decoder. One application where this presents a significant difficulty is for third party device providers desiring to read embedded information in order to permit or deny operation of such a device.
In U.S. Pat. No. 5,319,735 by R. D. Preuss et al., entitled "Embedded Signaling", digital information is encoded to produce a sequence of code symbols. The sequence of code symbols is embedded in an audio signal by generating a corresponding sequence of spread spectrum code signals representing the sequence of code symbols. The frequency components of the code signal are essentially confined to a preselected signaling band lying within the bandwidth of the audio signal, and successive segments of the code signal correspond to successive code symbols in the sequence. The audio signal is continuously frequency analyzed over a frequency band encompassing the signaling band, and the code signal is dynamically filtered as a function of the analysis to provide a modified code signal with frequency component levels which are, at each time instant, essentially a preselected proportion of the levels of the audio signal frequency components in corresponding frequency ranges. The modified code signal and the audio signal are combined to provide a composite audio signal in which the digital information is embedded. This composite audio signal is then recorded on a recording medium or is otherwise subjected to a transmission channel. Two key elements of this process are the spectral shaping and spectral equalization that occur at the insertion and extraction stages, respectively, thereby allowing the embedded signal to be extracted without access to the unwatermarked original data.
In U.S. patent application Ser. No. 08/708,331, filed Sep. 4, 1996, entitled "A Spread Spectrum Watermark for Embedded Signaling" by Cox, and incorporated herein by reference, there is described a method for extracting a watermark of embedded data from watermarked images or video without using an original or unwatermarked version of the data.
This method of watermarking an image or image data for embedded signaling requires that the DCT (discrete cosine transform) and its inverse of the entire image be computed. There are fast algorithms for computing the DCT in N log N time, where N is the number of pixels in the image. However, for N=512×512, the computational requirement is still high, particularly if the encoding and extracting processes must occur at video rates, i.e. 30 frames per second. This method requires approximately 30 times the computation needed for MPEG-II decompression.
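The scale of the N log N figure can be illustrated with a rough operation count. The following is only a back-of-the-envelope sketch: it assumes a base-2 logarithm, ignores the constant factor of any particular fast DCT algorithm, and uses a hypothetical helper name `dct_ops_per_second`.

```python
import math

def dct_ops_per_second(width, height, fps):
    """Rough N*log2(N) operation count per second for one full-frame DCT.

    Assumption: base-2 logarithm and a constant factor of 1; a real fast
    DCT implementation will differ by some constant multiple.
    """
    n = width * height                     # N = number of pixels per frame
    return round(n * math.log2(n) * fps)   # operations per second

ops = dct_ops_per_second(512, 512, 30)
print(ops)  # 141557760
```

Because insertion requires both the DCT and its inverse for each frame, the count for the insertion path is roughly double this figure under the same assumptions.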
One possible way to achieve real-time video watermarking is to only watermark every Nth frame. However, content owners wish to protect each and every video frame. Moreover, if it is known which frames contain embedded signals, it is simple to remove those frames with no noticeable degradation in the video signal.
An alternative option is to insert the watermark into n×n blocks of the image (subimages) where n << N. If the block size is chosen to be 8×8, i.e. the same size as that used for MPEG image compression, then it is possible to tightly couple the watermark insertion and extraction procedures to those of the MPEG compression and decompression algorithms. Considerable computational saving can then be achieved since the most expensive computations relate to the calculation of the DCT and its inverse and these steps are already computed as part of the compression and decompression algorithm. The incremental cost of watermarking is then very small, typically less than five percent of the computational requirements associated with MPEG.
U.S. patent application Ser. No. 08/715,953, filed Sep. 19, 1996, entitled "Watermarking of Image Data Using MPEG/JPEG Coefficients", which is incorporated herein by reference, advances this work by using MPEG/JPEG coefficients to encode the image data.
U.S. patent application Ser. No. 08/746,022, filed Nov. 5, 1996, entitled "Digital Watermarking", which is incorporated herein by reference, describes storing watermark information into subimages and extracting watermark information from subimages.
A review of watermarking is found in an article by Cox et al., entitled "A review of watermarking and the importance of perceptual modeling" in Proc. of EI'97, vol. 30-16, Feb. 9-14, 1997.
There have been several proposals to watermark MPEG video or JPEG compressed still images. In all cases, each 8×8 DCT block is modified to contain the watermark or a portion thereof. Consequently, decoding of the watermark requires that each 8×8 block be individually analyzed to extract the watermark signal contained therein. The individual extracted signals may then be combined to form a composite watermark, which is then compared with known watermarks. Because each block must be analyzed individually, an uncompressed image must be converted back to the block-based DCT representation, which is computationally expensive. Thus, while the decoder may be computationally efficient in the DCT domain, extracting a watermark from the spatial domain is much more expensive.
To allow for computationally efficient detection of the watermark in both the spatial and DCT domains, a watermark may be inserted in the sum of all the 8×8 blocks in the DCT domain, or the sum of a subset of all the 8×8 blocks in the DCT domain. A major advantage of this approach is that if the image is only available in the spatial domain, then the summation can also be performed in the spatial domain to compute a small set of summed 8×8 blocks and only those blocks must then be transformed into the DCT domain. This is because the DCT is a linear transform, so the sum of the DCT blocks is equal to the DCT of the sum of the intensities. Thus, the computational cost of decoding in the DCT and spatial domains is approximately the same.
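The equality between the sum of the DCT blocks and the DCT of the summed intensities can be checked numerically. The sketch below is illustrative only: it uses a direct (slow) orthonormal 8×8 DCT-II written in plain Python rather than any fast algorithm, random pixel values in place of real image data, and hypothetical helper names (`dct2`, `add_blocks`, `max_abs_diff`).

```python
import math
import random

N = 8  # block size used by MPEG and JPEG

def dct2(block):
    """Direct orthonormal 2-D DCT-II of an N x N block (list of lists)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def add_blocks(blocks):
    """Element-wise sum of a list of N x N blocks."""
    total = [[0.0] * N for _ in range(N)]
    for b in blocks:
        for i in range(N):
            for j in range(N):
                total[i][j] += b[i][j]
    return total

def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two blocks."""
    return max(abs(a[i][j] - b[i][j]) for i in range(N) for j in range(N))

# Assumption: 16 random 8x8 blocks stand in for the blocks of an image.
rng = random.Random(0)
blocks = [[[rng.randint(0, 255) for _ in range(N)] for _ in range(N)]
          for _ in range(16)]

sum_of_dcts = add_blocks([dct2(b) for b in blocks])  # 16 transforms
dct_of_sum = dct2(add_blocks(blocks))                # 1 transform
diff = max_abs_diff(sum_of_dcts, dct_of_sum)
print(diff < 1e-6)  # the two results agree up to floating-point rounding
```

The second path performs a single transform regardless of how many blocks are summed, which is the source of the computational symmetry between spatial-domain and DCT-domain detection.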
A second advantage of watermarking the sum of the DCT blocks is that there are an unlimited number of equivalent methods to apportion the watermark throughout the image. For example, if the watermark requires a change of Δi to the i'th coefficient of the summed DCT block, then, if there are M blocks in the image, Δi/M can be added to each individual block, or block 1 can have Δi added to it and the remaining M−1 blocks left unaltered, ignoring for the moment issues of image fidelity. Because of this one-to-many mapping, it is possible to alter the insertion algorithm without changing the decoder. This is a very important characteristic, since in some watermarking applications, there may be many hardware decoders that are deployed, such that changing the decoder is impractical. However, improvements to the insertion algorithm can still result in improved detection using the approach described herein.
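The two apportionment strategies mentioned above can be compared in a few lines. This is a minimal sketch with made-up coefficient values and hypothetical function names; it tracks only the i'th DCT coefficient of each of M blocks and, like the text, ignores image fidelity.

```python
def spread_evenly(coeffs, delta):
    """Add delta / M to the i'th coefficient of every block."""
    m = len(coeffs)
    return [c + delta / m for c in coeffs]

def put_in_first(coeffs, delta):
    """Add all of delta to block 1 and leave the other M - 1 blocks alone."""
    return [coeffs[0] + delta] + list(coeffs[1:])

# Assumption: illustrative values of coefficient i in M = 4 blocks.
coeffs = [12.0, -3.5, 7.25, 0.5]
delta = 2.0  # required change to the i'th coefficient of the summed block

a = spread_evenly(coeffs, delta)
b = put_in_first(coeffs, delta)

# A detector that only looks at the summed block sees the same value.
print(sum(coeffs) + delta, sum(a), sum(b))  # 18.25 18.25 18.25
```

Since the detector depends only on the sum, any other apportionment whose per-block changes add up to Δi would serve equally well, which is what leaves room to improve the inserter without touching deployed detectors.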
A third advantage of watermarking the sum of the DCT blocks is that watermark signals extracted from these sums have small variances, compared with the amount that they may be changed without causing fidelity problems. This means that, in many cases, it is possible to change an image so that the summed DCT blocks perfectly match the required watermark signal, even though the resulting image appears identical to the original.
Finally, it is well known that some problems, such as modeling the human visual system, are best handled in the frequency domain, whereas other problems, such as geometric transformations, are more conveniently dealt with in the spatial domain. Since the computational cost of decoding the watermark is now symmetric, it is possible to switch between the spatial and frequency domains at will in order to correct for various signal transformations that may corrupt the watermark.
The present invention concerns a novel insertion method that employs a specific model of the human visual system and thereby provides much better control over image fidelity. Tests have shown that it is possible to obtain large signals (more than 15 standard deviations from 0 correlation) with images that are indistinguishable from their respective original images.
The method handles robustness against various types of attacks in ways that are easy to relate to the specific type of attack.
The method is adaptable so that the model of the human visual system and the techniques used for handling attacks can be changed later without having to change the detector. The result is that it is possible to continue improving watermarking, particularly DVD (digital video disk) watermarking, even after many detectors have been installed. This is analogous to the situation with MPEG video for which encoder technology can be improved without having to change existing decoders.
Use of the present insertion method allows a simple detection algorithm in either MPEG or decompressed domains.
The invention also concerns a novel detection method which is easy to implement, easy to analyze and has a low computational cost, whether the incoming video is MPEG compressed or uncompressed.
The present invention also concerns a novel insertion method that hides multiple patterns in the data. These patterns fall into two categories: 1) registration patterns used during detection to compensate for translational shifts, and 2) watermark patterns that encode the information content of the watermark.
A principal object of the present invention is the provision of a digital watermark insertion method which allows detection of watermarks after the watermarked data is subjected to predefined scale changes, without modification to the watermark detector.
Another object of the invention is the provision of a watermark detection method that is computationally inexpensive in either the MPEG or decompressed domains.
Still another object of the invention is the provision of a digital watermarking method that withstands attacks without having to change a detector.