Digital watermarking is a process for modifying physical or electronic media to embed a machine-readable code into the media. The media may be modified such that the embedded code is imperceptible or nearly imperceptible to the user, yet may be detected through an automated detection process. Most commonly, digital watermarking is applied to media signals such as images, audio signals, and video signals. However, it may also be applied to other types of media objects, including documents (e.g., through line, word or character shifting), software, multi-dimensional graphics models, and surface textures of objects.
Digital watermarking systems typically have two primary components: an encoder that embeds the watermark in a host media signal, and a decoder that detects and reads the embedded watermark from a signal suspected of containing a watermark (a suspect signal). The encoder embeds a watermark by altering the host media signal. The decoder analyzes a suspect signal to detect whether a watermark is present. In applications where the watermark encodes information, the decoder also extracts this information from the detected watermark.
Several particular watermarking techniques have been developed. The reader is presumed to be familiar with the literature in this field. Particular techniques for embedding and detecting imperceptible watermarks in media signals are detailed in the assignee's U.S. Pat. Nos. 6,614,914 and 5,862,260, which are hereby incorporated by reference.
This document describes methods and systems for time-frequency domain watermarking of media signals, such as audio and video signals. One of these methods divides the media signal into segments, transforms each segment into a time-frequency spectrogram, and computes a time-frequency domain watermark signal based on the time-frequency spectrogram. It then combines the time-frequency domain watermark signal with the media signal to produce a watermarked media signal. To embed a message using this method, one may use peak modulation, pseudorandom noise modulation, statistical feature modulation, etc. Watermarking in the time-frequency domain enables the encoder to perceptually model time and frequency attributes of the media signal simultaneously.
Another watermark encoding method divides at least a portion of the media signal into segments and processes each segment as follows. It moves a window along the media signal in the segment and repeatedly applies a frequency transform to the media signal in each window to generate a time-frequency representation. It computes a perceptually adaptive watermark in the time-frequency domain, converts the watermark signal to the time domain using an inverse frequency transform and repeats the process until each segment has been processed. Finally, it adds the watermark signal to the media signal to generate a watermarked media signal.
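By way of illustration only, the windowed encoding steps above can be sketched in Python with NumPy. The sketch treats its input as a single segment and assumes a 256-sample periodic Hann window with 50% overlap. It uses a gain proportional to the local spectrogram magnitude as a crude stand-in for a perceptual model. The names `stft`, `istft`, `embed_segment`, `key` and `alpha` are hypothetical and do not come from the methods described here.

```python
import numpy as np

def stft(x, win=256, hop=128):
    """Slide a window along x and apply a real FFT to each window position."""
    # Periodic Hann window: copies shifted by win/2 sum to one, so plain
    # overlap-add of the inverse transforms restores the interior samples.
    w = np.hanning(win + 1)[:-1]
    n_frames = 1 + (len(x) - win) // hop
    frames = np.stack([x[i * hop:i * hop + win] * w for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)   # rows: time, columns: frequency

def istft(S, win=256, hop=128):
    """Inverse FFT of each row followed by overlap-add into a time signal."""
    frames = np.fft.irfft(S, n=win, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + win)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + win] += frame
    return out

def embed_segment(x, key=7, alpha=0.05):
    """Add a signal-adaptive, keyed +/-1 watermark to one segment (assumed scheme)."""
    S = stft(x)
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=S.shape)
    # Watermark gain follows the local magnitude and keeps the host phase.
    W = alpha * pattern * S
    w_time = istft(W)                    # watermark converted to the time domain
    y = np.array(x, dtype=float)
    y[:len(w_time)] += w_time            # add the watermark to the media signal
    return y
```

A longer signal could be handled by calling `embed_segment` on each segment in turn. A practical encoder would replace the magnitude-proportional gain with a proper masking model.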
A method for decoding the watermark from the media signal transforms the media signal to a time-frequency representation, computes elements of a message signal embedded into the media signal from the time-frequency representation, and decodes a message from the elements. The elements may be message signal elements of an antipodal, pseudorandom noise based watermark, or message signal elements of some other type of watermark signal, such as a statistical feature modulation signal, a peak modulation signal, an echo modulation signal, etc.
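As a hedged illustration of decoding an antipodal, pseudorandom noise based watermark, the following Python sketch pairs a simple encoder with a correlation decoder. It is one possible arrangement under stated assumptions, not the specific decoder of any embodiment. It assumes non-overlapping 256-sample blocks, a shared integer key, blocks already aligned with the encoder, and a multiplicative gain `alpha` that is exaggerated for clarity. The function names and the cell-to-bit assignment are hypothetical.

```python
import numpy as np

BLOCK = 256  # samples per block (assumed)

def spectrogram(x, block=BLOCK):
    """Real FFT of consecutive non-overlapping blocks (rows: time)."""
    n = len(x) // block
    return np.fft.rfft(np.asarray(x[:n * block]).reshape(n, block), axis=1)

def carrier(shape, n_bits, key):
    """Keyed +/-1 chips plus an assignment of every time-frequency cell to a bit."""
    chips = np.random.default_rng(key).choice([-1.0, 1.0], size=shape)
    bit_of_cell = np.arange(chips.size).reshape(shape) % n_bits
    return chips, bit_of_cell

def embed(x, bits, key=1234, alpha=0.2, block=BLOCK):
    """Scale each cell's magnitude by 1 + alpha * (antipodal bit) * chip."""
    S = spectrogram(x, block)
    chips, bit_of_cell = carrier(S.shape, len(bits), key)
    antipodal = 2.0 * np.asarray(bits)[bit_of_cell] - 1.0   # 0/1 -> -1/+1
    Y = S * (1.0 + alpha * antipodal * chips)
    y = np.fft.irfft(Y, n=block, axis=1).ravel()
    return np.concatenate([y, x[len(y):]])                  # keep any tail unchanged

def decode(y, n_bits, key=1234, block=BLOCK):
    """Correlate the log-magnitude spectrogram with the chips, one sum per bit."""
    L = np.log(np.abs(spectrogram(y, block)) + 1e-12)
    chips, bit_of_cell = carrier(L.shape, n_bits, key)
    bits = []
    for k in range(n_bits):
        cells = bit_of_cell == k
        v = L[cells]
        # Remove the mean so the host's average level does not bias the sum.
        bits.append(int(np.sum(chips[cells] * (v - v.mean())) > 0))
    return bits
```

Each bit is spread over many time-frequency cells, so the host signal's contribution to the correlation tends to average out while the watermark's contribution accumulates.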
One embodiment of a watermark decoder includes a detector for determining whether a watermark is present in the media signal and determining an alignment and scale of the watermark. It also includes a reader for decoding an auxiliary message embedded in a time-frequency representation of the media signal.
One aspect of the invention is a method of watermarking an audio signal. The method performs frequency transformations of blocks of audio to produce frequency domain representations of the blocks. The method then forms a two-dimensional representation of the audio from the frequency domain representations. This is sometimes referred to as a time-frequency representation or spectrogram of the audio. The method provides an auxiliary data signal to be embedded in the audio signal. Finally, the method modifies the two-dimensional representation of the audio according to the auxiliary data signal to embed the auxiliary data signal in the audio signal. The modifications can be computed in one domain and then adapted for application to the audio signal in another domain, such as a frequency domain, on a compressed bit stream, or in an uncompressed, time domain version of the audio signal.
Variants of the method embed the auxiliary signal by introducing modifications in the two-dimensional representation that correspond to auxiliary data symbols. To enhance robustness, symbols are encoded redundantly in different frequency bands, sometimes using different embedding functions. In some variants, the modifications are adapted to the signal in the two-dimensional representation. For example, one embodiment modulates peaks, while other embodiments modulate other features or statistics to correspond to embedded data.
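The redundant, signal-adaptive encoding described above can be illustrated with a short Python sketch. It is a sketch under assumptions rather than any particular embodiment. It operates directly on a magnitude spectrogram `mag` (rows are time frames, columns are frequency bins) and carries one bit per frame. Each bit is written into two hypothetical bands using two different energy-ratio features: lower half versus upper half of a low band, and even bins versus odd bins of a higher band. The band limits, the target `ratio` and all function names are illustrative choices.

```python
import numpy as np

# Hypothetical band layout (bin indices); each band is split into sets A and B.
LOW = (np.arange(8, 16), np.arange(16, 24))           # lower half vs. upper half
HIGH = (np.arange(40, 56, 2), np.arange(41, 56, 2))   # even bins vs. odd bins
BANDS = (LOW, HIGH)

def _log_ratio(mag, a, b):
    """Per-frame log of energy(A) / energy(B)."""
    e_a = np.sum(mag[:, a] ** 2, axis=1) + 1e-12
    e_b = np.sum(mag[:, b] ** 2, axis=1) + 1e-12
    return np.log(e_a / e_b)

def embed_bits(mag, bits, ratio=2.0):
    """Force each band's energy ratio to `ratio` (bit 1) or 1/ratio (bit 0)."""
    out = np.array(mag, dtype=float)
    sign = 2.0 * np.asarray(bits) - 1.0
    for a, b in BANDS:
        # The gain adapts to the host: scaling A by g and B by 1/g shifts the
        # log energy ratio by 4*log(g), which moves it onto the target value.
        g = np.exp((sign * np.log(ratio) - _log_ratio(out, a, b)) / 4.0)
        out[:, a] *= g[:, None]
        out[:, b] /= g[:, None]
    return out

def decode_bits(mag):
    """Sum the soft values from every band and take the sign."""
    soft = sum(_log_ratio(mag, a, b) for a, b in BANDS)
    return [int(s > 0) for s in soft]
```

Because the soft values from both bands are summed, a bit can still be recovered in this sketch when one band is flattened or lost, provided the other band survives. The gain may become large when the host is strongly imbalanced between the two bin sets, so a practical encoder would limit it with a perceptual model.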
A watermark detector method decodes the auxiliary data signal from an audio signal. The method performs frequency transformations of blocks of audio to produce frequency domain representations of the blocks, and forms a two-dimensional representation of the audio from the frequency domain representations. The method analyzes the two-dimensional representation of the audio signal to ascertain modifications made to encode the auxiliary data signal, and reads the auxiliary data signal from the modifications.
Further features and advantages will become apparent from the following detailed description and accompanying drawings.