1. Field of the Invention
The invention relates to the watermarking of digital representations generally and more specifically to the watermarking of digital representations that have been made using lossy compression techniques.
2. Description of Related Art
Nowadays, the easiest way to work with pictures or sounds is often to make digital representations of them. Once the digital representation is made, anyone with a computer can copy the digital representation without degradation, can manipulate it, and can use the Internet to send the digital representation virtually instantaneously from anywhere in the world to anywhere in the world.
From the point of view of the owners of the digital representations, there is one problem with all of this: pirates, too, have computers, and they can use them to copy, manipulate, and distribute digital representations as easily as the legitimate owners and users can. If the owners and users of the original digital representations are to be protected against illegal copiers or forgers of the digital representations, the digital representations themselves must be protected from pirates and forgers.
One technique that is widely used to make piracy and forgery more difficult is digital watermarking. A digital watermark is a modification of a digital representation so that it contains additional information. The modification is done in such a fashion that the additional information takes the form of noise with regard to the content of the original digital representation. If the noise is added in a way that makes the noise imperceptible when the digital representation is played, displayed, or printed, the watermark will remain invisible to those who use the digital representation but can be located and read by those who put the additional information into the digital representation. The additional information can be anything the maker of the watermark chooses, but when watermarks are used to make piracy or forgery more difficult, the additional information is typically ownership or copyright information about the digital representation or information that can be used to authenticate the digital representation or the analog representation that results when the digital representation is played, displayed, or printed. For further information about watermarking, see Jian Zhao, “Look, It's Not There”, in: BYTE Magazine, January, 1997. Detailed discussions of particular techniques for digital watermarking may be found in E. Koch and J. Zhao, “Towards Robust and Hidden Image Copyright Labeling”, in: Proc. of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, Jun. 20-22, 1995, in U.S. Pat. No. 5,710,834, Rhoads, Method and Apparatus Responsive to a Code Signal Conveyed through a Graphic Image, issued Jan. 20, 1998, and in U.S. Pat. No. 6,359,985, Koch, et al., Technique for marking binary coded data sets, issued Mar. 19, 2002. For examples of commercial watermarking systems that use the digital watermarking techniques disclosed in the Rhoads patent, see Digimarc Corporation's web site.
For an example of how digital watermarking may be used to authenticate analog representations, see U.S. Pat. No. 6,243,480, Jian Zhao, et al., Digital authentication with analog documents, issued Jun. 5, 2001.
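The idea of a watermark as low-level noise that carries additional information can be illustrated with a minimal sketch. It is not the technique of any of the references cited above; it is a simple least-significant-bit scheme chosen only for clarity, and the function names `embed_lsb` and `extract_lsb` are hypothetical. Each payload bit overwrites the lowest bit of one integer audio sample, so no sample changes by more than one quantization level.

```python
def embed_lsb(samples, bits):
    """Return a copy of `samples` whose leading samples carry `bits` in their lowest bit."""
    if len(bits) > len(samples):
        raise ValueError("payload is longer than the carrier")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_lsb(samples, n_bits):
    """Read back the lowest bit of the first `n_bits` samples."""
    return [s & 1 for s in samples[:n_bits]]


if __name__ == "__main__":
    carrier = [1000, -2001, 345, 12, -7, 8, 99, 100]  # made-up integer PCM samples
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    marked = embed_lsb(carrier, payload)
    print(extract_lsb(marked, len(payload)) == payload)
```

A scheme this simple is fragile by design, which makes it a convenient baseline for seeing what lossy compression does to a watermark.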
One class of digital representations which have posed difficulties for digital watermarking is digital representations made using lossy compression techniques. These compression techniques are termed lossy because they reduce the size of a digital representation of an audio signal or video signal by removing information from the digital representation. The information selected for removal is information that can be removed without unacceptable damage to the analog representation produced from the compressed digital representation. In some lossy compression techniques, models of how humans perceive sound or images are used to select the information to be removed. The effect of lossy compression is thus the reverse of that of watermarking: while watermarking adds information to the digital representation by increasing the amount of imperceptible noise in the digital representation, lossy compression reduces the size of the digital representation by removing information from the digital representation which would be imperceptible or nearly so in the analog representation made from the digital representation. Of course, the preferred place to put a watermark in a digital representation is in that part of the digital representation which is imperceptible in the analog representation, and consequently, removal of any digital watermarks that were present in the digital representation prior to compression is often one of the side effects of lossy compression of the digital representation.
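The conflict described above can be seen in a toy experiment. In the sketch below, a coarse uniform quantizer stands in for a lossy codec; this is an assumption made for illustration and is far simpler than any real compression scheme. The watermark is a least-significant-bit payload, again chosen only because it is easy to follow. All names and sample values are hypothetical.

```python
def quantize(samples, step):
    """Coarse uniform quantizer: a crude stand-in for lossy compression."""
    return [step * round(s / step) for s in samples]


carrier = [1000, -2001, 345, 12, -7, 8, 99, 100]  # made-up integer PCM samples
payload = [1, 0, 1, 1, 0, 0, 1, 0]

# Embed the payload in the lowest bit of each sample.
marked = [(s & ~1) | b for s, b in zip(carrier, payload)]

# "Compress" by discarding everything finer than 16 quantization levels.
decoded = quantize(marked, 16)

# Every decoded sample is a multiple of 16, so the lowest bit is always 0
# and the payload can no longer be read.
recovered = [s & 1 for s in decoded]
print(recovered == payload)
```

The quantizer changes each sample by at most half a step, a small distortion, yet it removes exactly the low-level detail in which the watermark was stored.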
FIG. 1 shows how lossy compression is applied to an audio signal in the audio compression scheme used in the MPEG-1 standard for producing compressed digital representations of video transmissions or movies. For details, see K. R. Rao and J. J. Hwang, Techniques and Standards for Image, Video, and Audio Coding, Prentice Hall PTR, Upper Saddle River, N.J. 07458, 1996, pp. 242-265. The input to the compression process is a digitized audio signal 103 in the time domain, i.e., the input is a digitized representation of the audio signal as it varies over time. Audio signal 103 goes to filter bank 105 and also to audio perception model 107. The latter is a model of how a human listener perceives an audio signal. Filter bank 105 windows the time domain samples 103 into groups of short (6) or long (18) sample windows, depending on the spectral and temporal properties of the audio signal, and feeds the grouped samples into Modified Discrete Cosine Transform 111. The output of transform 111 is a set of frequency samples 113, representing one frame of raw audio in the frequency domain. These samples are now ready to be quantized and grouped into subbands for comparison against the 32 signal-to-mask ratios produced by audio perception model 107.
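The transform step can be sketched as follows. This is a direct, unoptimized form of the modified discrete cosine transform with a sine window, not the encoder's actual implementation. It assumes that the 6 and 18 in the description are the numbers of coefficients per short and long block, so that a block spans 12 or 36 windowed input samples; that reading, the choice of window, and the names `sine_window` and `mdct` are assumptions of this sketch.

```python
import math


def sine_window(length):
    """Sine window, a common choice for MDCT analysis (assumed here)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Map 2N windowed time samples to N frequency coefficients."""
    if len(block) % 2:
        raise ValueError("block length must be even")
    n_coeffs = len(block) // 2
    window = sine_window(len(block))
    shift = 0.5 + n_coeffs / 2  # phase offset of the MDCT basis functions
    return [
        sum(
            window[n] * block[n]
            * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
            for n in range(len(block))
        )
        for k in range(n_coeffs)
    ]


if __name__ == "__main__":
    long_block = [math.sin(0.3 * n) for n in range(36)]   # made-up input
    short_block = long_block[:12]
    print(len(mdct(long_block)), len(mdct(short_block)))
```

Consecutive blocks overlap by half in a real encoder, which is what allows the decoder to cancel the aliasing that a single block introduces; the sketch transforms one block only.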
The raw frequency samples 113 are then compressed at 119 by quantizing them for the frame and applying audio perception model 107 to the quantized samples. With the help of audio perception model 107, bit noise allocation and quantization process 119 minimizes the number of bits needed to represent the audio signal contained in the frame while keeping the distortion to a minimum. The frame that results from this process is output at 120 to decision block 121, which determines whether the bit rate of the frame is low enough and its quality high enough to meet the standard for the compression process. If the frame passes, it is encoded and formatted at 127 as required for the MPEG-1 audio bit stream 129; if not, loop 123 returns the frame to allocation and quantization process 119 and the audio perception model is again applied to it.
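The quantize, test, and retry cycle described above can be sketched as a simple loop. The sketch is heavily simplified and is not the MPEG-1 encoder's algorithm: it uses a single quantizer step for the whole frame, a crude bit-cost model, and a single allowed-noise figure in place of the per-subband thresholds that a perception model would supply. Every function name, parameter, and number below is a hypothetical chosen for illustration.

```python
def quantize(coeffs, step):
    """Uniform quantization of frequency coefficients to integer levels."""
    return [round(c / step) for c in coeffs]


def bits_needed(levels):
    """Crude cost model (assumption): 1 bit for a zero level,
    otherwise a sign bit plus the magnitude bits."""
    return sum(1 if q == 0 else 1 + abs(q).bit_length() for q in levels)


def noise_energy(coeffs, levels, step):
    """Squared error introduced by quantization."""
    return sum((c - q * step) ** 2 for c, q in zip(coeffs, levels))


def encode_frame(coeffs, bit_budget, allowed_noise,
                 step=0.5, growth=1.25, max_iters=64):
    """Coarsen the quantizer until the frame fits the bit budget, then
    report whether the resulting noise stays within the allowed level."""
    for _ in range(max_iters):
        levels = quantize(coeffs, step)
        if bits_needed(levels) <= bit_budget:       # bit-rate test
            noise = noise_energy(coeffs, levels, step)
            return levels, step, noise <= allowed_noise  # quality test
        step *= growth                              # loop back and retry
    raise RuntimeError("could not meet the bit budget")


if __name__ == "__main__":
    frame = [12.3, -7.8, 3.1, 0.4, -0.2, 0.05]  # made-up coefficients
    levels, step, quality_ok = encode_frame(frame, bit_budget=12, allowed_noise=1.0)
    print(levels, step, quality_ok)
```

A tighter bit budget forces a larger step, which zeroes out more of the small coefficients. Those small coefficients are the ones in which imperceptible watermark noise would typically reside.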
As is apparent from the above description, MPEG compression 101 will tend to destroy any watermark which has been applied prior to the compression process to digitized time domain audio signal 103 or to a digitized frequency domain audio signal. Moreover, any watermark that is applied during the compression process must take audio perception model 107 into account, since applying model 107 tends to eliminate the imperceptible noise that usually carries the watermark. It is thus an object of the invention to provide a technique for watermarking digital representations during a lossy compression process which is compatible with the use of perception model 107 in the compression process.