The present invention relates to methods and systems for “digital watermarking” of multimedia signals, that is, methods and systems for encoding information in multimedia signals that may be used to verify authenticity or otherwise add information to the signals. In particular, the present invention is directed to a digital watermarking method and system that utilizes an inverse difference pyramid decomposition.
The art includes a variety of approaches to digital watermarking for multimedia signals, including audio and video signals and still images. U.S. Pat. Nos. 5,404,377 and 5,473,631 to Moses disclose various systems for imperceptibly embedding data into audio signals, particularly focusing on neural network implementations and perceptual coding details. U.S. Pat. No. 5,574,962 to Fardeau et al. teaches a method for identifying a program including a sound signal, where the method is based on adding inaudible encoded digital data at predefined frequencies. U.S. Pat. No. 5,450,490 to Jensen et al. teaches an apparatus and method for encoding and decoding audio signals, where the code is included in at least one frequency component of the processed audio signal. The frequency is selected using the human auditory system (HAS) psycho-acoustic model. U.S. Pat. No. 5,905,800 to Moskowitz et al. teaches a method for applying a digital watermark to a content signal using a watermarking key. The watermarking key includes a binary sequence and information describing the application of that binary sequence to the content signal. The digital watermark is then encoded within the content signal at one or more locations determined by the watermarking key. European Patent No. EP0581317 discloses a system for redundantly marking images with multi-bit identification codes. Each bit of the code is manifested as a slight increase or decrease in pixel values around a plurality of spaced-apart “signature points.” Decoding proceeds by computing a difference between a suspect image and the original image, and checking for pixel perturbations around the signature points.
There are various consortium research efforts underway in Europe on copyright marking of video and multimedia. A survey of techniques is found in “Access Control and Copyright Protection for Images (ACCOPI), WorkPackage 8: Watermarking,” June, 1995, which is incorporated herein by reference. A new project, termed TALISMAN, appears to extend certain of the ACCOPI work. Zhao and Koch, researchers active in these projects, provide a Web-based electronic media marking service known as Syscop. In addition, Highwater FBI, Ltd., of Great Britain, has introduced a software product that is believed to imperceptibly embed identifying information into photographs and other graphical images. This technology is the subject of PCT publication WO 95/20291.
U.S. Patent Application Publication No. 20040022444 is directed to a method and apparatus for identifying an object by encoding physical attributes of the object where the encoded information is utilized as at least one element for composing a digital watermark for the object. In a disclosed embodiment, the physical attributes of the object are utilized as a key for accessing information included in a digital watermark for the object.
U.S. Pat. No. 6,078,664 to Moskowitz et al. teaches that Z-transform calculations may be used to encode and decode carrier signal independent data (e.g., digital watermarks) to a digital sample stream. Deterministic and non-deterministic components of a digital sample stream signal may be analyzed for the purposes of watermark encoding. The watermark may be encoded in a manner such that it is concentrated primarily in the non-deterministic signal components of the carrier signal. The signal components can include a discrete series of digital samples and/or a discrete series of carrier frequency sub-bands of the carrier signal. Z-transform calculations may be used to measure the desirability of particular locations in a sample stream in which to encode the watermarks.
U.S. Pat. No. 6,205,249 to Moskowitz teaches multiple transform utilization and applications for secure digital watermarking. Digital blocks in digital information to be protected are transformed into the frequency domain using a fast Fourier transform. A plurality of frequencies and associated amplitudes are identified for each of the transformed digital blocks and a subset of the identified amplitudes is selected for each of the digital blocks using a primary mask from a key. Message information is selected from a message using a transformation table generated with a convolution mask. The chosen message information is encoded into each of the transformed digital blocks by altering the selected amplitudes based on the selected message information.
U.S. Pat. No. 5,889,868 to Moskowitz et al. teaches that digital watermarks may be optimally suited to particular transmission, distribution and storage media. Watermark application parameters can also be adapted to the individual characteristics of a given digital sample stream. Watermark information can be carried either in individual samples or in relationships between multiple samples, for example, using the waveform shape. The highest quality of a given content signal may be maintained as it is mastered, with the watermark suitably hidden, taking into account usage of digital filters and error correction. The quality of the underlying content signals may be used to identify and highlight advantageous locations for the insertion of digital watermarks. The watermark is integrated as closely as possible with the content signal, at a maximum level, so as to force degradation of the content signal when attempts are made to remove the watermarks.
U.S. Pat. No. 5,687,236 to Moskowitz et al. teaches an apparatus and method for encoding and decoding additional information into a stream of digitized samples in an integral manner, using spatial keys. The information is contained in the samples, not appended to the sample stream. The method does not cause a significant degradation to the sample stream. The method is used to establish ownership of copyrighted digital multimedia content and to provide a disincentive to piracy of such material.
U.S. Patent Application Publication No. 20030200439 to Moskowitz teaches a method and system for transmitting streams of data. The method comprises the steps of receiving a stream of data; organizing the stream of data into a plurality of packets; generating a packet watermark associated with the stream of data; combining the packet watermark with each of the plurality of packets to form watermarked packets; and transmitting at least one of the watermarked packets across a network. The system may utilize computer code to generate a bandwidth rights certificate that may include at least one cryptographic credential; routing information for the transmission; and, optionally, a digital signature of a certificate owner; a unique identification code of a certificate owner; a certificate validity period; and pricing information for use of bandwidth.
U.S. Pat. No. 6,674,876 to Hannigan et al. teaches a method and system for time-frequency domain watermarking of media signals, such as audio and video signals. An encoding method divides the media signal into segments, transforms each segment into a time-frequency representation, and computes a time-frequency domain watermark signal based on the time frequency representation. The method then combines the time-frequency domain watermark signal with the media signal to produce a watermarked media signal. To embed a message using this method, one may use peak modulation, pseudorandom noise modulation, statistical feature modulation, and the like.
A review of the literature reveals that the various known digital watermarking techniques may be categorized as either spatial domain techniques or frequency domain techniques. The spatial domain techniques include least significant bit (LSB) substitution and a correlation-based approach. There are many variants of LSB substitution. This technique, however, essentially involves embedding the watermark by replacing the least significant bit of the image data with a bit of the watermark data. Variations of this technique may also involve other approaches such as converting the watermark sequence into a pseudo-random noise (PN) sequence, which is then embedded into the image, or repeated embedding of the watermark when the watermark is much smaller than the host image. Detection can be performed visually or using correlation methods. In the correlation-based approach, the watermark is converted into a long PN sequence, which is then weighted and added to the host image with some gain factor.
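By way of illustration only, the basic LSB substitution described above may be sketched as follows. The sketch assumes the host image has been flattened into a list of 8-bit pixel values and that the watermark is a list of bits no longer than the host; the function names `embed_lsb` and `extract_lsb` are hypothetical and do not come from any cited reference.

```python
def embed_lsb(pixels, bits):
    """Replace the least significant bit of the leading pixels with watermark bits."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the host signal")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the pixel, then write the watermark bit into it.
        marked[i] = (marked[i] & ~1) | (bit & 1)
    return marked


def extract_lsb(pixels, n_bits):
    """Read the watermark back from the least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]


host = [200, 13, 77, 154, 255, 0, 98, 31]   # illustrative pixel values
watermark = [1, 0, 1, 1, 0, 1]              # illustrative watermark bits
marked = embed_lsb(host, watermark)
recovered = extract_lsb(marked, len(watermark))
```

Because only the lowest bit of each pixel is altered, no pixel value changes by more than one gray level, which is why the mark is imperceptible but also why it is easily destroyed by any processing that disturbs the low-order bits.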
One of the frequency-domain techniques is the Discrete Cosine Transform (DCT) approach. The DCT is a real-domain transform, which represents the entire image as coefficients of different cosine frequencies (which are the basis vectors for this transform). The DCT of the image is calculated by taking eight-by-eight blocks of the image, which are then transformed individually. The two-dimensional DCT of an image gives a result matrix in which the top left corner represents the lowest frequency coefficient while the bottom right corner represents the highest frequency coefficient. The DCT technique forms the basis of the Joint Photographic Experts Group (JPEG) image compression algorithm, which is one of the most widely used image data storage formats. The DCT approaches are able to withstand some forms of attack very well, such as low-pass/high-pass filtering and median filtering. Mid-band coefficient exchange is a simple DCT variant in which two mid-band coefficients of a data block that have identical quantization levels in the standard JPEG quantization table are exchanged as needed, so that one coefficient, say (4,1), is greater than the other coefficient, say (3,2), if the bit is “1,” and less if the bit is “0.” Another DCT variant, even-odd quantization, quantizes the obtained DCT results and changes them all to even numbers if the bit to be encoded is “0,” and to odd numbers if it is “1.” An advantage of this approach is that there is negligible visual change in the image. Still another DCT variant is Differential Energy Watermarking (DEW), which involves altering the energy levels of two DCT block groups A and B so that E_A < E_B if the bit is “1.” Before the energy levels are altered, the DCT blocks are randomly shuffled and the pairs of A-B blocks are then randomly selected in the image, which adds to the security of the data.
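The mid-band coefficient exchange described above may be illustrated with the following minimal sketch for a single eight-by-eight block. Several details are assumptions made for the example rather than features of any cited method: the DCT is evaluated directly from its definition for clarity rather than speed, the names `embed_bit` and `extract_bit` are hypothetical, and a `margin` parameter is introduced so that the two coefficients remain separated after the pixel values are rounded.

```python
import math

N = 8  # block size


def _alpha(k):
    # Normalization that makes the transform orthonormal.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Orthonormal two-dimensional DCT of an 8x8 block (direct evaluation)."""
    return [[_alpha(u) * _alpha(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def embed_bit(block, bit, a=(4, 1), b=(3, 2), margin=10.0):
    """Order coefficients a and b so that a > b encodes 1 and a < b encodes 0."""
    c = dct2(block)
    hi = max(c[a[0]][a[1]], c[b[0]][b[1]])
    lo = min(c[a[0]][a[1]], c[b[0]][b[1]])
    if hi - lo < margin:
        # Assumed robustness step: widen the gap symmetrically about the midpoint.
        mid = (hi + lo) / 2.0
        hi, lo = mid + margin / 2.0, mid - margin / 2.0
    c[a[0]][a[1]], c[b[0]][b[1]] = (hi, lo) if bit else (lo, hi)
    return idct2(c)


def extract_bit(block, a=(4, 1), b=(3, 2)):
    """Recover the bit by comparing the same two coefficients."""
    c = dct2(block)
    return 1 if c[a[0]][a[1]] > c[b[0]][b[1]] else 0


# Illustrative smooth test block with mid-range gray values.
block = [[100 + 3 * x + 2 * y + (x * y) % 5 for y in range(N)] for x in range(N)]
marked_one = embed_bit(block, 1)
marked_zero = embed_bit(block, 0)
```

In this sketch the detector needs only the marked block and the two coefficient positions, not the original image. The choice of a margin of 10 is arbitrary; a larger value gives more robustness at the cost of a more visible change.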
Yet another DCT approach is code division multiple access (CDMA) spread-spectrum embedding, in which the information is carried by a data sequence that is longer than would be needed to send the information optimally. The technique involves generating a pseudo-random sequence based upon a key and embedding that sequence according to the watermark message.
Another set of frequency domain approaches are the wavelet-based techniques. These techniques involve the embedding of information in the LH (low-high) blocks of the wavelet transform of the image. Changes to these regions are not noticed by observers due to characteristics of the Human Visual System (HVS). Wavelet-based techniques are also utilized for fragile watermarking, which is a significant tool for content authentication.
Still another frequency domain approach is the FFT-based technique. In this technique, the watermark is added to the image as a band-limited signal in a circular pattern around the center (DC) frequency, which makes the approach resilient to rotation. This approach is also called the Circular Symmetric Watermarking Technique.
Yet another frequency domain approach is the Fourier-Mellin transform technique. This relatively new technique has arisen out of the need for watermarking techniques that are Rotation, Scale and Translation invariant (RST-invariant). This approach involves creating a Log Polar map of the FFT of the image and embedding information in the FFT of the Log Polar Map. This method is said to be extremely RST invariant and uses an RST invariant watermark.
Finally, another frequency domain approach is phase modulation. In this technique, the phases of pre-selected complex-conjugated coefficients of the orthogonal DFT (Discrete Fourier Transform) or UCHT (Unified Complex Hadamard Transform) are modulated with the watermark information. Since the phase modulation approach is more resilient against noise and fraud attempts than the amplitude approach, it has certain advantages over approaches based on changing the amplitudes of selected spectrum coefficients. The DFT has higher computational complexity than the UCHT and lower resistance to changes in the phases of the spectrum coefficients caused by noise in the communication channel. Also, half of the coefficients in every row of the UCHT matrix are complex. As a result, all of the coefficients of the discrete spectrum are complex as well. For this reason, the computational complexity of the UCHT is higher than that of, for example, the Hadamard transform based on a real matrix consisting of elements with values +1 and −1 only. A watermarking system has been developed based on the two-dimensional UCHT, in which the watermark elements are embedded in the phases of randomly selected spectrum coefficients of transformed blocks of eight-by-eight pixels, positioned in the LL (low-low) frequency band of the classic (non-inverse) image pyramid.
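The principle of phase modulation may be illustrated with the following simplified sketch, which uses a one-dimensional DFT of a short real-valued signal rather than the two-dimensional block transforms discussed above. The mapping of a “1” bit to a phase of +π/2 and a “0” bit to a phase of −π/2, the minimum magnitude `min_mag`, and the function names are all assumptions made for the example. Each selected coefficient is modified together with its complex-conjugate partner so that the marked signal remains real.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform evaluated directly from its definition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def embed_phase(signal, bits, first_bin=1, min_mag=1.0):
    """Write one bit into the phase of each selected coefficient."""
    n = len(signal)
    spec = dft(signal)
    for i, bit in enumerate(bits):
        k = first_bin + i
        if not 0 < k < n / 2:
            raise ValueError("coefficient index must lie strictly between 0 and n/2")
        # Keep the amplitude, but enforce an assumed floor so the phase is well defined.
        mag = max(abs(spec[k]), min_mag)
        spec[k] = cmath.rect(mag, math.pi / 2 if bit else -math.pi / 2)
        # Update the conjugate partner so that the inverse transform is real.
        spec[n - k] = spec[k].conjugate()
    return [v.real for v in idft(spec)]


def extract_phase(signal, n_bits, first_bin=1):
    """Recover the bits from the sign of the phase of the same coefficients."""
    spec = dft(signal)
    return [1 if cmath.phase(spec[first_bin + i]) > 0 else 0 for i in range(n_bits)]


samples = [10, 12, 9, 14, 20, 18, 11, 7, 13, 15, 16, 8, 9, 12, 14, 10]  # illustrative
message = [1, 0, 1, 1]
marked_samples = embed_phase(samples, message)
decoded = extract_phase(marked_samples, len(message))
```

Because the bit is decided by the sign of the phase alone, a moderate change in the amplitude of a marked coefficient does not alter the decoded bit in this sketch, which reflects the resilience argument made above.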
A new type of decomposition, the Inverse Difference Pyramid, has been applied to digital image encoding and compression. Published International Application No. WO 01/10130, incorporated herein by reference, describes this technique. The image is approximated with a polynomial function whose coefficients are obtained either with regression analysis or by retaining only a few low-frequency coefficients of an orthogonal transform of the input image and applying the inverse transform. These coefficients represent the “zero” (top) level of the pyramid. The next pyramid level is obtained by subtracting the approximation of the image, defined by the coefficients from the “zero” level, from the input image. The resulting difference image is divided into four sub-images of the same size and form, and each sub-image is processed in a similar fashion to obtain its approximated image. A recursive image decomposition algorithm is employed, which does not require interpolation.
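The recursive structure of this decomposition may be illustrated with the following greatly simplified sketch. It is not the method of WO 01/10130: as an assumption made purely for brevity, each block is approximated by its mean value alone, which corresponds to retaining only the lowest-frequency coefficient. The image is assumed to be square with a side length that is a power of two, and the names `idp_decompose` and `idp_reconstruct` are hypothetical.

```python
def idp_decompose(image, levels):
    """Return the per-level coefficient grids and the final difference image."""
    n = len(image)
    if n & (n - 1) or levels < 1 or n >> (levels - 1) < 1:
        raise ValueError("side must be a power of two and levels must fit the image")
    residual = [[float(v) for v in row] for row in image]
    pyramid = []
    size = n  # level "zero" treats the whole image as a single block
    for _ in range(levels):
        grid = []
        for r0 in range(0, n, size):
            row = []
            for c0 in range(0, n, size):
                cells = [(r, c) for r in range(r0, r0 + size)
                         for c in range(c0, c0 + size)]
                # Approximate the block by its mean (assumed one-coefficient model).
                mean = sum(residual[r][c] for r, c in cells) / len(cells)
                row.append(mean)
                # Subtract the approximation to form the difference image.
                for r, c in cells:
                    residual[r][c] -= mean
            grid.append(row)
        pyramid.append(grid)
        size //= 2  # each block is split into four sub-images for the next level
    return pyramid, residual


def idp_reconstruct(pyramid, n):
    """Sum the block approximations of all levels; no interpolation is involved."""
    out = [[0.0] * n for _ in range(n)]
    size = n
    for grid in pyramid:
        for r in range(n):
            for c in range(n):
                out[r][c] += grid[r // size][c // size]
        size //= 2
    return out


image = [[52, 55, 61, 66],
         [70, 61, 64, 73],
         [63, 59, 55, 90],
         [67, 61, 68, 104]]  # illustrative 4x4 gray-level image
pyramid, residual = idp_decompose(image, 3)
restored = idp_reconstruct(pyramid, 4)
```

In this sketch the pyramid grows from one coefficient at the top level to four and then sixteen at the lower levels, and when the decomposition is carried down to single-pixel blocks the sum of all levels restores the image.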