A black-white or grey scale image of size X×Y can be described by its brightness in a plane, i.e., by a function f(x,y) where f is the brightness at a point with coordinates (x,y). A digital grey scale image comprises pixels arranged in a raster. Such an image can be described by f(i,j) where the brightness f takes on a value from a discrete set of values called quantized values, and (i,j) is a pair of integers. An example is a digital image of 512×512 pixels and each pixel takes on an integer value between 0 and 255. In this specification, an image shall mean a digital image unless otherwise noted.
A common way to represent the pixels of a digital color image is to use 3 numbers denoting the red component, green component, and blue component. In this way, the image is represented in the R-G-B color coordinate system. Another way to describe a color pixel is to use the luminance and 2 chrominance components, where the luminance corresponds to the brightness. There are many other color coordinate systems.
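The relationship between the R-G-B representation and a luminance/chrominance representation can be illustrated with a short sketch. The text does not name a particular luminance/chrominance system; the example below assumes the full-range Y-Cb-Cr conversion used in JFIF (JPEG) files, and the function name `rgb_to_ycbcr` is chosen here only for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit R-G-B pixel to (Y, Cb, Cr).

    Assumption: JFIF full-range coefficients (ITU-R BT.601 weights).
    The text itself does not prescribe any specific color system.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # luminance (brightness)
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b    # blue-difference chrominance
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b    # red-difference chrominance
    return y, cb, cr


# A neutral grey pixel carries no color information, so both chrominance
# components sit at the mid-point value of 128 (up to rounding error).
y, cb, cr = rgb_to_ycbcr(128, 128, 128)
```

Under this convention the luminance component alone is a grey scale version of the color image, which is why luminance is said to correspond to brightness.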
The process of representing an image by a stream of `ones` and `zeros`, i.e. using bits, is commonly referred to as image coding. That is, an image is converted to a binary stream by image coding. Decoding refers to the process of obtaining the image from the binary stream. Image compression refers to the process of reducing the number of bits to represent a given image. In many image coding methods, it is desirable to use the fewest number of bits to represent the image. For this reason, image coding and image compression are often used synonymously.
Image coding or compression can be lossless or lossy. A lossless coding method produces a binary stream from which the original image can be obtained exactly. A lossy coding method produces a binary stream but the decoded image, called the compressed image, is not exactly the same as the original image. In lossy coding, the compressed image can look indistinguishable from the original, or it can look different from the original, in which case the difference shows up as artifacts.
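The distinction between lossless and lossy coding can be made concrete with a toy sketch. It is not a practical image coder: the lossless path simply uses the general-purpose `zlib` compressor, and the lossy path uses coarse uniform quantization with a step size chosen arbitrarily for illustration. The function names and the `step` parameter are assumptions introduced here.

```python
import zlib


def lossless_roundtrip(pixels):
    """Code 8-bit pixels to a byte stream and decode them again, exactly."""
    stream = zlib.compress(bytes(pixels))      # coding
    return list(zlib.decompress(stream))       # decoding


def lossy_roundtrip(pixels, step=16):
    """Keep only a coarse level per pixel, then reconstruct at the mid-point."""
    codes = [p // step for p in pixels]                      # fewer levels, hence fewer bits
    return [min(255, c * step + step // 2) for c in codes]   # approximate reconstruction


pixels = [0, 17, 100, 153, 255]
exact = lossless_roundtrip(pixels)     # identical to the input
approx = lossy_roundtrip(pixels)       # close to the input, but not identical
```

With `step=16` each pixel needs only 4 bits instead of 8, at the cost of a reconstruction error of at most 8 grey levels per pixel.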
Instead of being described by its pixel values, a digital image can also be described in the `frequency domain` or more generally the `transform domain`. The N×M values of f are transformed to a set of numbers called transform coefficients, usually also N×M in number. Commonly used transformations include the Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT), and Wavelet Transform.
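As an illustration of a transform-domain description, the following is a minimal sketch of a two-dimensional DCT written with the standard library only. It assumes the orthonormal DCT-II variant, applied first along rows and then along columns; the function names are hypothetical and the code favors clarity over speed.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """2-D DCT of an N x M block: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]   # zip(*rows) iterates over columns
    return [list(r) for r in zip(*cols)]           # transpose back to N x M


# A perfectly flat 4 x 4 block has all of its energy in the first (DC)
# coefficient; every other coefficient is zero up to rounding error.
coeffs = dct_2d([[10] * 4 for _ in range(4)])
```

The output has the same N×M shape as the input, matching the statement that the transform coefficients are usually also N×M in number.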
Current computer and information technology allow easy editing and perfect reproduction of digital images, which in turn can lead to problems with copyright protection, ownership verification and authentication. Such problems are addressed by digital watermarking, which concerns processes that embed or insert data into a multimedia data object. The inserted data are often called digital watermarks. Depending on the application, digital watermarking may be applied to different types of data, including digital still images, digital audio and digital video. For images, a visible watermark is one that is intentionally made to be noticeable to a human observer whereas an invisible watermark is one that is not perceptible to a human but may be extracted by a computer or other electronic means. Whether visible or invisible watermarking is employed depends upon the particular application. The following references may be consulted for further background on digital watermarking:
[1] Mintzer, et al., "Effective and Ineffective Digital Watermarks," IEEE-ICIP, 1997.
[2] Friedman, "The Trustworthy Digital Camera: Restoring Credibility to the Photographic Image," IEEE Trans. on Consumer Electronics, November 1993.
[3] Schneider, et al., "A Robust Content Based Digital Signature for Image Authentication," IEEE-ICIP, 1996.
[4] Storck, "A New Approach to Integrity of Digital Images," IFIP Conf. on Mobile Communication, 1996.
[5] Yeung, et al., "An Invisible Watermarking Technique for Image Verification," IEEE-ICIP, 1997.
[6] Swanson, et al., "Robust Data Hiding for Images," IEEE DSP Workshop, 1996.
[7] Koch, et al., "Towards Robust and Hidden Image Copyright Labeling," IEEE Workshop on Nonlinear Signal and Image Processing, 1995.
[8] Zeng, et al., "On Resolving Rightful Ownership of Digital Images by Invisible Watermarks," IEEE-ICIP, 1997.
One application of digital watermarking is in the field of digital photography, in which images are captured with a digital camera or photographs are digitized. In these cases, it would be advantageous to embed an invisible watermark in the image at the time of capture or digitizing. This watermark could be used later (e.g., in a court of law) to verify that the image is authentic, i.e., has not been altered.
It is important that a method of watermarking for authentication can:
(1) permit the user to determine whether an image has been altered or not;
(2) identify where in the image such alteration occurred; and
(3) allow the watermarked image to be stored in a lossy-compression format (such as JPEG).
In addition, it is highly desirable:
(4) to integrate the watermark with the host image rather than as a separate data file; and
(5) to have the watermark invisible under normal viewing conditions.
Previously known methods for image authentication do not have all of the above capabilities. The digital signature methods (e.g., above cited references [2] [3] [4]) do not have capabilities (2) and (3); the pixel-domain watermarking methods (e.g., above cited reference [5]) do not have capability (3); the frequency-domain data hiding schemes (e.g., above cited references [6] [7]) cannot always localize alterations and may introduce excessive distortion. Since the present invention and the pixel-domain method of [5] share certain similarities, that method will be briefly reviewed and the differences pointed out.
The method presented in [5] embeds a watermark, which is a binary pattern, in the pixel domain, using a look-up table (LUT). The method is illustrated in FIG. 1, where the (unmarked) grey scale image consists of a block of 8×8 pixels whose values are shown and the pattern to be embedded is the letter "I", also formed with a block of 8×8 pixels. Suppose the black pixels of the pattern correspond to "0" and the white pixels correspond to "1". These "1"s and "0"s are called marking values. As shown in FIG. 1, the top row of the LUT contains the luminance values of the unmarked image and the bottom row contains the corresponding binary values, i.e., "1" or "0". The 4th number in the first row of the image has a value of 153 as shown; the binary pattern corresponding to this pixel is black and therefore has the value "0". From the LUT, 153 corresponds to a "0", agreeing with the pixel value of the pattern. So the number 153 is unchanged in the marked image, i.e., the 4th pixel in the first row of the marked image has the value 153. The first number in the second row of the image has the value 144. The corresponding binary pattern is white and therefore has the value "1". But from the LUT, 144 corresponds to a "0". So the number 144 is changed to 143, for which the corresponding value in the table is "1". Thus, the first pixel in the second row of the marked image has the value 143. All pixels in the original image are processed in this manner. That is, if the luminance of a pixel in the original image does not map to the value in the corresponding binary pattern by the LUT, the luminance value is changed to a new value which is close to the original value and which corresponds to a binary value that agrees with the binary pattern.
The marked image is made up of an 8×8 block of pixels from which the watermark is easily extracted by referring to the LUT. The fourth pixel in the first row of the marked image is 153, for which the corresponding binary value from the LUT is "0". So the fourth pixel in the first row of the extracted pattern is black. Similarly, the first pixel in the second row of the marked image is 143, for which the corresponding binary value from the LUT is "1". So the first pixel in the second row of the extracted pattern is white. When all the pixels of the marked image have been processed in this manner, a pattern of "I" will have been extracted.
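The embedding and extraction steps of the pixel-domain method can be sketched as follows. This is an illustration under stated assumptions, not the code of reference [5]: the full LUT of FIG. 1 is not reproduced in the text, so the sketch fills a hypothetical 256-entry table pseudo-randomly and then fixes the three entries the text does state (153 maps to "0", 144 to "0", 143 to "1"). The function names and the tie-breaking rule (try the lower neighbor first) are likewise assumptions.

```python
import random


def make_lut(seed=0):
    """Hypothetical 256-entry binary LUT; only three entries come from the text."""
    rng = random.Random(seed)
    lut = [rng.randint(0, 1) for _ in range(256)]
    lut[153], lut[144], lut[143] = 0, 0, 1     # values stated in the description
    return lut


def nearest_value(value, bit, lut):
    """Return the luminance closest to `value` that the LUT maps to `bit`."""
    for d in range(256):
        for cand in (value - d, value + d):    # lower neighbor is tried first
            if 0 <= cand <= 255 and lut[cand] == bit:
                return cand
    raise ValueError("LUT contains no entry mapping to the requested bit")


def embed(pixels, pattern, lut):
    """Change each pixel as little as possible so that lut[pixel] == pattern bit."""
    return [[nearest_value(v, b, lut) for v, b in zip(prow, brow)]
            for prow, brow in zip(pixels, pattern)]


def extract(marked, lut):
    """Read the watermark back by looking every marked pixel up in the LUT."""
    return [[lut[v] for v in row] for row in marked]


lut = make_lut()
marked = embed([[153, 144]], [[0, 1]], lut)    # the two pixels discussed in the text
recovered = extract(marked, lut)
```

For the two pixels discussed above, 153 is left unchanged and 144 becomes 143, and extraction returns the embedded bits `[[0, 1]]`.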
If a pixel in the marked image is changed, the changed value will be mapped to either "1" or "0", each with a probability of 0.5. So there is a 0.5 probability that the extracted watermark for that pixel will be different from the corresponding pixel in the original pattern. Such a possible change in a single pixel may or may not be observable by a viewer. However, if a group of neighboring pixels is changed, the probability that the corresponding part of the extracted watermark will look different is significantly increased. Images are often lossily compressed to save transmission time and storage. If an image is watermarked using the method just described, and if the marked image is then lossily compressed, the watermark inserted in the image will be changed due to the compression process. Therefore, the watermarked image from such an approach cannot be lossily compressed without adversely affecting the watermark. Another disadvantage of the above method is that a human viewer can easily notice the changes in pixel values that the embedding process introduces in the smooth regions of the image, making it difficult to insert the watermark in those regions.
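The probability argument at the start of the preceding paragraph can be quantified with a small sketch. It relies on an added assumption that the text does not state: each altered pixel flips its extracted bit independently of the others. Under that assumption, the chance that at least one of n altered pixels shows up in the extracted watermark is 1 - 0.5^n. The function name is hypothetical.

```python
def detection_probability(n_changed, p_flip=0.5):
    """Probability that at least one of `n_changed` altered pixels yields a
    wrong extracted bit, assuming independent flips with probability p_flip."""
    if n_changed < 0:
        raise ValueError("n_changed must be non-negative")
    return 1.0 - (1.0 - p_flip) ** n_changed


single = detection_probability(1)   # 0.5, the single-pixel case from the text
group = detection_probability(8)    # 0.99609375 for eight altered pixels
```

Even a small altered region is therefore very likely to disturb the extracted pattern, which is the sense in which the probability is significantly increased for a group of neighboring pixels.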
As described in greater detail below, the present invention inserts a watermark in the quantized transform domain coefficients using a lookup table. This allows the watermarked image to be stored in compressed form. The present invention can also insert the pattern in the smooth regions of an image and can embed content based features in addition to a pattern, so the reliability of detecting alterations of the image is significantly increased. In addition, a shuffling scheme of the present invention can also be applied to embedding a watermark in the pixel domain of smooth regions.
The result achieved by the present invention is illustrated with FIGS. 2A, 2B and 2C. FIG. 2A is a JPEG compressed image, into which we embed a watermark of the pattern "PUEE" (shown in FIG. 2B) using the method of the present invention. The watermarked image, shown in FIG. 2B, is indistinguishable from the unmarked original, FIG. 2A. Two modifications are then made to the image. "Princeton University" in the top right corner is changed to "Alexander Hall" and "Copyright 1997" in the lower left corner is changed to "January 1998", as illustrated in FIG. 2C. FIG. 2C also shows the watermark extracted from this modified image, which clearly indicates where the modifications have taken place.