1. Field of Invention
This invention generally relates to systems and methods for hiding information in audio and image files.
2. Description of Related Art
With the advent of image digitization, digital image distribution and digital video availability, “hiding” information in digital images for purposes such as digital rights management and copyright protection has become a substantial issue for image publishers and authors. The process of embedding information in a digital image is known as “watermarking”. Such watermarks must be secure, robust to intentional corruption and to data compression processing, not unreasonably complex to embed and extract, and compatible and interoperable with conventional image processing systems. The watermark is generally invisible to a viewer. However, in some applications, it is desirable to produce a visible watermark that can be removed by an authorized image decoder and that cannot be removed by an unauthorized decoder.
Although watermarks are used in most cases with respect to digital images, watermarking techniques can also be applied to audio files. Like conventional image watermarking techniques, conventional audio watermarking techniques can be classified into data-domain methods and frequency-domain methods. Data-domain methods work by modifying the actual audio data, such as modulating the least significant bit of a PCM representation or hiding data in compressed-domain representations. Frequency-domain methods work by modifying the spectral content of a signal, for example, by removing a particular frequency component, or by adding information disguised in low-amplitude noise.
Data-domain watermarking techniques include compressed-domain watermarking, bit dithering, amplitude modulation and echo hiding. In compressed-domain watermarking, only the compressed representation of the data is watermarked, so the watermark is not persistent: when the data is decompressed, the watermark is no longer available. In least-significant-bit (LSB) modulation, information is encoded by modulating the least significant bits of the time-domain or data-compressed representation. While this potentially has a large data rate, it is not robust to data compression or analog transmission and reproduction, and introduces noise into the signal.
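By way of illustration only, LSB modulation of a PCM representation might be sketched as follows. The function names, the sample values and the payload are hypothetical and are not taken from any particular conventional system; the last lines show how a simple requantization erases the hidden bits, which is one reason the technique is not robust.

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the leading samples with payload bits."""
    if len(bits) > len(samples):
        raise ValueError("payload is longer than the carrier")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear bit 0, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | (bit & 1)
    return marked


def extract_lsb(samples, n_bits):
    """Read the payload back from the least significant bits."""
    return [s & 1 for s in samples[:n_bits]]


# Made-up 16-bit PCM samples and payload, for illustration only.
pcm = [1000, -2001, 352, 17, -8, 4095, 0, -32768]
payload = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(pcm, payload)
recovered = extract_lsb(marked, len(payload))

# Dropping one bit of resolution (a crude stand-in for lossy processing)
# forces every least significant bit to zero and so erases the payload.
requantized = [(s >> 1) << 1 for s in marked]
after_requantization = extract_lsb(requantized, len(payload))
```

Each sample changes by at most one quantization step, which is the noise introduced into the signal.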
In amplitude modulation, signal peaks are modified to fall within predetermined amplitude bands. This technique introduces modulation distortion, and is not robust to amplitude compression, which is widely used in analog and digital telephony, broadcasting, sound reinforcement, and noise reduction. In echo hiding, discrete copies of the original signal are mixed in with the original signal. The echo time is short enough and the copy amplitude is low enough to be inaudible, yet the echo can be detected via autocorrelation. This method introduces spectral distortion because of phase cancellation at frequencies whose periods are multiples of the echo delay. Also, this technique may not be robust under data compression, as imperceptible echoes are likely to be discarded by perceptual coding.
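The echo hiding approach can be sketched as below. This is a minimal illustration under stated assumptions, not a description of any specific conventional system: the two delays, the echo gain, the segment length and all function names are hypothetical, the host is synthetic white noise, and the gain is exaggerated well beyond an inaudible level so that the toy autocorrelation detector is reliable.

```python
import random

D0, D1 = 50, 80   # hypothetical echo delays (in samples) for bit 0 and bit 1
ALPHA = 0.4       # echo gain, exaggerated for this toy example
SEG = 4000        # number of samples carrying one bit


def add_echo(segment, delay, alpha=ALPHA):
    """Mix a delayed, attenuated copy of the segment into itself."""
    return [s + (alpha * segment[n - delay] if n >= delay else 0.0)
            for n, s in enumerate(segment)]


def autocorr(x, lag):
    """Unnormalized autocorrelation of x at a single lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))


def embed_echo(signal, bits):
    """Encode each bit as the choice of echo delay in one segment."""
    out = []
    for i, bit in enumerate(bits):
        seg = signal[i * SEG:(i + 1) * SEG]
        out.extend(add_echo(seg, D1 if bit else D0))
    out.extend(signal[len(bits) * SEG:])  # leave any remainder untouched
    return out


def detect_echo(signal, n_bits):
    """Decide each bit by comparing the autocorrelation at the two delays."""
    bits = []
    for i in range(n_bits):
        seg = signal[i * SEG:(i + 1) * SEG]
        bits.append(1 if autocorr(seg, D1) > autocorr(seg, D0) else 0)
    return bits


rng = random.Random(0)
host = [rng.gauss(0.0, 1.0) for _ in range(4 * SEG)]  # synthetic white-noise host
payload = [1, 0, 0, 1]
marked = embed_echo(host, payload)
decoded = detect_echo(marked, len(payload))
```

Real audio is strongly correlated with itself at short lags, so a detector for actual recordings would generally need to be more elaborate than this two-lag comparison.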
Frequency-domain watermarking techniques include phase coding, frequency band modification, and spread spectrum techniques. Phase coding relies on the human auditory system's relative insensitivity to phase. The signal is windowed, as in a spectrogram, and the magnitude and phase of each window are computed. An artificial absolute phase signal, which encodes the watermark, is introduced into the first window. The phase information for subsequent windows is iteratively computed from the phase differences between successive windows and the absolute phase. The resulting phases are combined with the original magnitudes to construct the watermarked signal. This method introduces phase dispersion into the signal, and is probably not robust under data compression.
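The phase coding steps described above can be sketched as follows. This is an illustrative sketch only: the window length, the use of a phase of +π/2 for bit 0 and −π/2 for bit 1, the choice of bins 1 and upward for the payload, and all function names are assumptions made for the example. A direct discrete Fourier transform is used to keep the example self-contained, and the payload must be shorter than half the window length.

```python
import cmath
import math
import random


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def embed_phase(signal, bits, win):
    """Replace the first window's phases with payload phases, keep all magnitudes,
    and rebuild later windows from the original window-to-window phase differences."""
    segs = [signal[i:i + win] for i in range(0, len(signal) - win + 1, win)]
    spectra = [dft(s) for s in segs]
    mags = [[abs(c) for c in S] for S in spectra]
    phases = [[cmath.phase(c) for c in S] for S in spectra]
    deltas = [[phases[i][k] - phases[i - 1][k] for k in range(win)]
              for i in range(1, len(segs))]
    new = list(phases[0])
    for j, bit in enumerate(bits):
        k = j + 1                       # skip the DC bin
        new[k] = -math.pi / 2 if bit else math.pi / 2
        new[win - k] = -new[k]          # mirror bin keeps the output real-valued
    out = []
    for i in range(len(segs)):
        if i > 0:
            new = [new[k] + deltas[i - 1][k] for k in range(win)]
        out.extend(idft([cmath.rect(mags[i][k], new[k]) for k in range(win)]))
    out.extend(signal[len(segs) * win:])  # samples past the last full window
    return out


def extract_phase(signal, n_bits, win):
    """Read the payload from the sign of the first window's phases."""
    S = dft(signal[:win])
    return [1 if cmath.phase(S[k]) < 0 else 0 for k in range(1, n_bits + 1)]


WIN = 32
rng = random.Random(0)
host = [rng.gauss(0.0, 1.0) for _ in range(4 * WIN)]  # synthetic host signal
payload = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_phase(host, payload, WIN)
decoded = extract_phase(marked, len(payload), WIN)
```

The decoder in this sketch must know where the first window begins, so any shift of the signal would defeat it.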
In frequency band modification, information is encoded by removing or enhancing particular spectral bands, by removing a narrow spectral band using a notch filter, or by differences between frequency bands. This method introduces spectral distortion, may not be robust to perceptual encoding, and does not work unless the altered frequency components are well-represented in the source audio.
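As a rough illustration of frequency band modification, the sketch below hides one bit per block by notching out one of two frequency bins and reads it back by comparing the remaining magnitudes. The bin indices, the block length and the function names are hypothetical, and an ideal notch in the discrete Fourier domain stands in for a practical notch filter.

```python
import cmath
import math
import random

BIN_A, BIN_B = 5, 9   # hypothetical frequency bins used to carry one bit


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def embed_band_bit(block, bit):
    """Remove bin A to encode a 1, or bin B to encode a 0."""
    X = dft(block)
    n = len(block)
    k = BIN_A if bit else BIN_B
    X[k] = 0
    X[n - k] = 0      # also remove the mirror bin so the output stays real-valued
    return idft(X)


def detect_band_bit(block):
    """The bin with less energy is taken to be the one that was removed."""
    X = dft(block)
    return 1 if abs(X[BIN_A]) < abs(X[BIN_B]) else 0


rng = random.Random(0)
host = [rng.gauss(0.0, 1.0) for _ in range(64)]   # synthetic broadband block
bit_one = detect_band_bit(embed_band_bit(host, 1))
bit_zero = detect_band_bit(embed_band_bit(host, 0))

# A silent block has no energy in either bin, so an embedded 1 cannot be recovered.
silent = [0.0] * 64
lost_bit = detect_band_bit(embed_band_bit(silent, 1))
```

The silent block shows the limitation noted above: when the altered frequency components are absent from the source, there is nothing for the detector to compare.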
In spread spectrum techniques, a signal carrying the watermark information is modulated into wideband noise by multiplication with a pseudorandom sequence. Because the modulation function is known, or can be regenerated, the watermark signal can be demodulated. This technique adds noise to the watermarked signal, and the low amplitude of the spread spectrum signal means it is likely to be discarded under perceptual coding. In addition, the sampling frequency is commonly used as the modulation carrier frequency to avoid having to synchronize the receiver. In this case, re-sampling or analog transmission is likely to destroy the synchronization, and hence the watermark.
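A minimal sketch of direct-sequence spread spectrum embedding is given below for illustration. The key value, the number of chips per bit, the watermark amplitude and the function names are all assumptions, and the host is synthetic unit-variance noise. The detector regenerates the pseudorandom sequence from the shared key and correlates it against the received signal, one chip per sample, which is why a loss of sample alignment would break detection.

```python
import random

CHIPS = 4000   # hypothetical number of chips (samples) per payload bit
ALPHA = 0.2    # watermark amplitude relative to a unit-variance host


def pn_sequence(key, length):
    """Regenerate the pseudorandom +/-1 chip sequence from a shared key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_ss(host, bits, key):
    """Spread each bit over CHIPS samples and add it to the host at low amplitude."""
    if len(bits) * CHIPS > len(host):
        raise ValueError("host is too short for the payload")
    pn = pn_sequence(key, len(bits) * CHIPS)
    out = list(host)
    for i, bit in enumerate(bits):
        symbol = 1.0 if bit else -1.0
        for n in range(i * CHIPS, (i + 1) * CHIPS):
            out[n] += ALPHA * symbol * pn[n]
    return out


def detect_ss(signal, n_bits, key):
    """Despread by correlating with the regenerated sequence; the sign gives the bit."""
    pn = pn_sequence(key, n_bits * CHIPS)
    bits = []
    for i in range(n_bits):
        corr = sum(signal[n] * pn[n] for n in range(i * CHIPS, (i + 1) * CHIPS))
        bits.append(1 if corr > 0 else 0)
    return bits


rng = random.Random(1)
host = [rng.gauss(0.0, 1.0) for _ in range(4 * CHIPS)]   # synthetic host signal
payload = [1, 0, 1, 1]
KEY = 1234                                               # hypothetical shared key
marked = embed_ss(host, payload, KEY)
decoded = detect_ss(marked, len(payload), KEY)
```

With these values each sample is perturbed by only 0.2 of the host's standard deviation, and the bit is recovered only because the correlation accumulates over many chips.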
Many schemes, particularly modulation and frequency-domain approaches, are not robust to audio data compression. This is especially problematic because the frequency modifications must be inaudible in the watermarked audio data; otherwise, the watermark audibly degrades the audio and is unacceptable. However, such inaudible frequency modifications are precisely the information that is lost or altered when perceptual data compression schemes such as MP3 are used.
There has also been considerable work in watermarking images. Most approaches are quite similar to those described above. For example, spread spectrum techniques can be used for images as well as audio. One relevant conventional approach for watermarking text modulates white space between words and sentences. This method needs to detect word boundaries, and is not applicable to common images other than scanned text. The Glyph technology developed at Xerox PARC encodes information into digital hardcopy using tiny marks that can be modulated to encode information in addition to gray shades. U.S. Pat. No. 5,946,103 to Curry discloses a method that uses glyphs to digitally watermark a printed document. However, glyph technology typically generates images with noticeable structures. This makes this method suitable only for specific applications. The “Patchwork” watermarking system alters the intensity of random pairs of points in the image. A method called texture block coding encodes information by copying areas of random texture. These areas can be found by autocorrelation.
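For illustration, the idea behind altering the intensity of random pairs of points can be sketched as below. The details are assumptions made for the example rather than a description of the actual “Patchwork” system: the key, the number of pairs, the intensity step, the function names and the synthetic low-contrast test image are all hypothetical. The first point of each keyed pair is brightened and the second is darkened, and detection sums the intensity differences over the same pairs.

```python
import random


def pick_pairs(key, n_pixels, n_pairs):
    """Choose distinct pixel positions from the key and group them into pairs."""
    rng = random.Random(key)
    idx = rng.sample(range(n_pixels), 2 * n_pairs)
    return [(idx[2 * i], idx[2 * i + 1]) for i in range(n_pairs)]


def embed_patchwork(pixels, key, n_pairs, delta):
    """Brighten the first pixel of each pair and darken the second by delta."""
    out = list(pixels)
    for a, b in pick_pairs(key, len(pixels), n_pairs):
        out[a] = min(255, out[a] + delta)
        out[b] = max(0, out[b] - delta)
    return out


def patchwork_score(pixels, key, n_pairs):
    """Sum of intensity differences over the keyed pairs.

    The score is near zero for an unmarked image and near 2 * delta * n_pairs
    for an image marked with the same key."""
    return sum(pixels[a] - pixels[b]
               for a, b in pick_pairs(key, len(pixels), n_pairs))


WIDTH, HEIGHT = 64, 64
N_PAIRS, DELTA, KEY = 2000, 4, 42        # hypothetical parameters
rng = random.Random(7)
# Synthetic low-contrast grayscale image, stored row by row as a flat list.
image = [rng.randint(100, 156) for _ in range(WIDTH * HEIGHT)]

marked = embed_patchwork(image, KEY, N_PAIRS, DELTA)
score_plain = patchwork_score(image, KEY, N_PAIRS)
score_marked = patchwork_score(marked, KEY, N_PAIRS)
threshold = DELTA * N_PAIRS              # halfway between the two expected scores
is_marked = score_marked > threshold
```

The detection is statistical: a higher-contrast image or fewer pairs would widen the spread of the unmarked score and make the decision less reliable.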