The present invention relates to watermarking of digital data and in particular to the technique of embedding watermark data into digital image data and recapturing the embedded watermark data from a noisy version of the watermarked digital image without the use of the original digital image data.
The fast development of digital information technology, combined with the simplicity of duplication and distribution of digital data across communication networks like the Internet (using, e.g., publicly accessible web sites), has exposed content providers to a real challenge of how to protect their electronic data (e.g., images, video, sound, text). This has stimulated many research efforts towards the design and study of sophisticated watermarking and information hiding methodologies for copyright protection.
Watermarking techniques are used to embed secret information into, for instance, an image in such a way that this information cannot be removed or deciphered without access to a secret key. On the other hand, to maintain quality, the watermarked image needs to be perceptually identical to the original image once the watermark is embedded.
The main application of watermarking is for proving ownership of the data and for protection against forgers. It is therefore crucial to ensure that no malicious attacker is able to remove the watermark without damaging the image to the degree that it becomes useless. In addition watermarking techniques need to be robust to standard operations performed on images, like printing, scanning, lossy compression (e.g., JPEG), filtering, and so on.
Hence, the tradeoffs to be considered in the design of a watermarking technique rely on three principal parameters: robustness to attacks and image processing operations, high quality, and the relative amount of secret information to be embedded, i.e., the coding rate.
Prior art image watermarking techniques can be divided into two groups according to the type of feature set the watermark is embedded in. Specifically, the watermark is embedded into either the intensity value of the luminance in the spatial domain representation of the image or the transform coefficients in the transform domain representation (e.g. DCT, DWT) of the image. The algorithms that are used to detect the watermark signal can also be placed under two categories based on whether or not they use the original image during the watermark detection process. In general, watermarking techniques that embed the watermark in the coefficients of a transform domain and detect it without resorting to the original image enjoy many advantages in their robustness to standard image processing operations and to hostile attacks.
The basic technique for embedding a watermark message or data into an image I by encoding the watermark data within the transform domain begins by transforming the image I to obtain a representation of the image in a transform domain (e.g., DCT, DFT, etc.). Next, a sub-set of the transform coefficients is selected and the watermark message is encoded by slightly changing the values of this sub-set of transform coefficients, thus producing a watermarked transform coefficient set. The watermarked coefficients are then combined with the non-watermarked coefficients to generate a watermark embedded transform representation of the image. The watermarked transform representation is then inverse-transformed to produce a watermarked image Ĩ. The coefficient modification is subtle enough that Ĩ is perceptually indistinguishable from the original image I.
In order to recapture the watermark from the watermarked image Ĩ a decoder receives the distorted or noisy version of Ĩ, that may have suffered certain image processing operations or hostile attacks. It transforms the received version to the appropriate transform domain representation and selects the same sub-set of coefficients in which the watermark signal has been encoded. The decoder then extracts the watermark message from the sub-set of coefficients using a decoding procedure.
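The transform, select, modify, and inverse-transform steps described above can be sketched as follows. This is a minimal illustration only, assuming a one-dimensional orthonormal DCT applied to a single row of pixel values; the function names (`dct`, `idct`, `embed_in_transform_domain`, `selected_coefficients`) and the additive modification are hypothetical choices made for the example and are not taken from the text.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II of a list of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(x)))
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        out.append(sum(
            math.sqrt((1.0 if k == 0 else 2.0) / n) * v
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k, v in enumerate(c)))
    return out

def embed_in_transform_domain(pixels, indices, changes):
    """Transform, slightly change the selected coefficients, inverse-transform."""
    coeffs = dct(pixels)
    for idx, change in zip(indices, changes):
        coeffs[idx] += change  # assumed additive modification, for illustration
    return idct(coeffs)

def selected_coefficients(pixels, indices):
    """Decoder side: transform and pick the same sub-set of coefficients."""
    coeffs = dct(pixels)
    return [coeffs[i] for i in indices]
```

Because the assumed transform is orthonormal, the decoder sees each selected coefficient shifted by exactly the change applied at the encoder (plus whatever noise the image picked up in between).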
Recently, two methods for watermarking in a transform domain have been suggested for decoding without resorting to the original image: coefficient perturbation and dithering modulation.
In one prior art method using coefficient perturbation, the watermark data is added to the original image by perturbing the values of significant DCT (i.e., transform) coefficients. For example, if s is a DCT coefficient, a "zero" bit of the watermark data is encoded by changing the coefficient to s+ε and a "one" bit of the watermark data is encoded by changing the coefficient to s−ε, where ε is a small constant. Recapturing the watermark message from the watermarked image Ĩ is accomplished by correlating the appropriate DCT coefficients with the watermark message. Variations of this watermarking technique have been suggested, such as the use of different transform domains (e.g., DFT or DWT) and/or the use of different perturbation schemes (e.g., s(1±ε) or s±ε|s|).
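The additive perturbation rule and the correlation step can be illustrated with a short sketch. The names `perturb_embed` and `correlate` are hypothetical, and the mapping of a "zero" bit to +1 and a "one" bit to −1 inside the correlator, as well as the normalization by the number of coefficients, are assumptions made for the example.

```python
def perturb_embed(coeffs, bits, eps):
    """Encode bit 0 as s + eps and bit 1 as s - eps, one bit per coefficient."""
    return [s + eps if b == 0 else s - eps for s, b in zip(coeffs, bits)]

def correlate(coeffs, bits):
    """Normalized correlation of coefficients with a candidate message.

    Assumption: bit 0 is mapped to +1 and bit 1 to -1 before correlating.
    """
    signs = [1.0 if b == 0 else -1.0 for b in bits]
    return sum(c * s for c, s in zip(coeffs, signs)) / len(coeffs)
```

Under these assumptions, embedding a message raises the correlation with that message by exactly eps. The original coefficients still contribute to the correlation, which acts as interference when the original image is not available to the decoder.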
Dithering modulation is based on quantizing the transform domain coefficients. In this case, embedding a watermark message in a selected sub-set of coefficients is based on replacing these coefficients with their quantized values in a way that depends on the watermark message. For example, a "zero" bit is encoded by quantizing the coefficients with a quantizer q0 and a "one" bit is encoded by quantizing the coefficients with a different quantizer q1.
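A minimal sketch of this idea follows. The text does not specify q0 and q1, so the example assumes two uniform quantizers with the same step size, with q1 shifted by half a step relative to q0, and a decoder that picks whichever quantizer has the nearer reconstruction point. The names `quantize`, `dm_embed`, and `dm_decode` are hypothetical.

```python
def quantize(x, step, offset):
    """Uniform quantizer with reconstruction points at offset + k * step."""
    return step * round((x - offset) / step) + offset

def dm_embed(coeffs, bits, step):
    """Replace each coefficient by q0(s) for bit 0 or q1(s) for bit 1.

    Assumption: q0 has offset 0 and q1 has offset step / 2.
    """
    return [quantize(s, step, 0.0 if b == 0 else step / 2.0)
            for s, b in zip(coeffs, bits)]

def dm_decode(coeffs, step):
    """Decide each bit by the quantizer whose nearest point is closer."""
    bits = []
    for y in coeffs:
        d0 = abs(y - quantize(y, step, 0.0))
        d1 = abs(y - quantize(y, step, step / 2.0))
        bits.append(0 if d0 <= d1 else 1)
    return bits
```

With these assumed quantizers, the decoder recovers each bit as long as the noise added to a coefficient stays below a quarter of the step size; a larger step buys more robustness at the cost of more distortion.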
In a variation of the dither modulation watermarking scheme, "zero" bit and "one" bit values are encoded using two "self-noise suppression" mappings f0 and f1, which are based on a small modification of the quantizer mappings q0 and q1 used in the dither modulation scheme.
To date, the prior art watermarking schemes including coefficient perturbation and dither modulation are not explicitly designed to adapt to a predefined distortion level or noise criteria. As a result, they provide a relatively low rate of information embedding or suffer from high error rate under changing noise conditions. More importantly, since attempts by malicious attackers take the form of unknown noise conditions on the watermark embedded image data, these prior art schemes are less able to protect against attacks upon the security of the data, which is one of the main purposes of watermarking.
The present invention presents a watermarking technique which is based on a new encoding scheme referred to as scaled bin encoding, which encodes watermark data into image data by modifying image values in a way that preserves high image quality (i.e., low distortion levels) and adapts to an expected (i.e., worst-case) noise level. Recapturing of the watermark data from the watermark embedded image after it has been exposed to unintentional and/or intentional noise is performed via a decoding method using a probability based procedure (e.g., maximum likelihood decoding), based on estimated statistics of the original image values and an expected statistical model of the noise introduced to the image by image processing operations or attack noise, thereby providing a robust and high quality watermarking system and method.
The present invention provides a method and system of embedding digital watermark data into digital image data and recapturing the embedded watermark from the watermarked digital data without the use of the original digital image data and despite noise introduced into the watermarked digital image data. The system and method of the present invention are performed by two processes: the process of embedding the watermark into the original digital data to obtain watermarked digital data, and the process of decoding the watermarked digital data to recover the original watermark data. Watermark data is embedded into the digital image data using an encoding method referred to as scaled bin encoding. The embedded watermark data is recaptured from the watermarked data using a probability based decoding scheme. In one embodiment, the probability based decoding scheme is the Maximum Likelihood (ML) decoding scheme.
In one embodiment, embedding of the watermark data is achieved by transforming the original digital image data into first transform coefficients; encoding the watermark data using an error correcting code; embedding the encoded watermark data into a first sub-set of the first transform coefficients to generate a sub-set of embedded transform coefficients using scaled bin encoding; and then inversely transforming the watermark embedded first sub-set of coefficients along with the remaining non-watermark embedded first transform coefficients to generate watermark embedded digital image data which includes the original digital image data and the watermark data.
According to this embodiment, scaled bin encoding is performed by scaling each coefficient of the sub-set of transform coefficients with a predetermined scaling parameter which is representative of an expected noise level and an allowed distortion model; mapping each scaled coefficient to one of a pair of skewed discrete mappings dependent on the logic state of the corresponding encoded watermark data bit to be embedded into each scaled coefficient; obtaining a difference between each scaled coefficient and its corresponding mapped and scaled coefficient; and adding the difference to its corresponding original (i.e., unscaled and unmapped) coefficient to obtain each watermark embedded transform coefficient.
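The four steps above (scale, map, take the difference, add it to the original coefficient) can be sketched as follows. This is an illustrative sketch under several assumptions that are not stated in this section: the pair of skewed discrete mappings is modeled as two uniform quantizers offset from each other by half a step, the difference is taken as the mapped value minus the scaled value, and the names `skewed_map`, `scaled_bin_embed`, `alpha`, and `step` are hypothetical.

```python
def skewed_map(x, step, bit):
    """One of two assumed discrete mappings, shifted by step / 2 for bit 1."""
    offset = 0.0 if bit == 0 else step / 2.0
    return step * round((x - offset) / step) + offset

def scaled_bin_embed(coeffs, bits, alpha, step):
    """Embed one bit per coefficient following the four described steps.

    alpha stands in for the predetermined scaling parameter; how it is
    derived from the noise level and distortion model is not shown here.
    """
    marked = []
    for s, b in zip(coeffs, bits):
        scaled = alpha * s                   # 1. scale the coefficient
        mapped = skewed_map(scaled, step, b) # 2. map according to the bit
        diff = mapped - scaled               # 3. difference (assumed sign)
        marked.append(s + diff)              # 4. add to the original value
    return marked
```

Under these assumptions the change applied to each coefficient never exceeds half a step, so the step size bounds the embedding distortion directly, while the scaling parameter controls how much of the original coefficient survives in the watermarked value. No decoder is sketched here, since recovery relies on the probability based decoding described in the following paragraphs rather than on simple re-quantization.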
In accordance with another embodiment of the system and method of the present invention, the watermark data is recaptured from the watermark embedded digital image data by transforming the watermark embedded image into second transform coefficients using the same transformation as used during watermark encoding; selecting a second sub-set of watermarked transform coefficients which correspond to the first sub-set of transform coefficients; estimating statistical parameters of the first sub-set of transform coefficients using image statistics of the second watermarked sub-set of coefficients; extracting embedded watermark data from the second sub-set of coefficients with a probability based decoding scheme which uses the predetermined scaling parameter, known aspects of the scaling and mapping steps, the expected noise and allowed distortion model, and the estimated statistical parameters; and decoding the extracted watermark data using an error correcting decoder based on the error correcting code used during watermark encoding to obtain an estimate of the original watermark. In one embodiment, the ML decoding scheme is used as the probability based decoding scheme.
For embedding the watermark into the digital image data, one embodiment of the system of watermarking includes an image transformer for transforming the digital image data into a transform domain representation; an encoder for encoding watermark data using an error correcting code to generate encoded watermark data; a watermark encoder for embedding the encoded watermark data into a first sub-set of the first transform coefficients to generate a sub-set of embedded transform coefficients by using scaled bin encoding; an inverse image transformer for inversely transforming the watermark embedded first sub-set of coefficients along with the remaining non-watermark embedded first transform coefficients to generate watermarked digital image data which includes the digital image data and the watermark data.
For recapturing the watermark data from the watermarked image, one embodiment of the system of watermarking includes an image transformer for transforming the watermarked image data into the transform domain representation; a means for determining the statistical parameters of the second sub-set of coefficients; a probability based watermark decoder for extracting embedded watermark data from the second sub-set of coefficients with a probability based decoding scheme which uses the predetermined scaling parameter, known aspects of the scaling and mapping steps, the expected noise and allowed distortion model, and the estimated statistical parameters; a decoder for decoding the extracted watermark data using an error correcting decoder based on the original error correcting code to generate the watermark data.