The invention relates to multimedia processing, and more specifically relates to detecting embedded code signals in media such as images, video and audio.
Digital watermarking is a process for modifying media content to embed a machine-readable code into the data content. The data may be modified such that the embedded code is imperceptible or nearly imperceptible to the user, yet may be detected through an automated detection process. Most commonly, digital watermarking is applied to media such as images, audio signals, and video signals. However, it may also be applied to other types of data, including documents (e.g., through line, word or character shifting), software, multi-dimensional graphics models, and surface textures of objects.
Digital watermarking systems have two primary components: an embedding component that embeds the watermark in the media content, and a reading component that detects and reads the embedded watermark. The embedding component embeds a watermark pattern by altering data samples of the media content in the spatial or frequency domains. The reading component analyzes target content to detect whether a watermark pattern is present. In applications where the watermark encodes information, the reader extracts this information from the detected watermark.
One challenge for developers of watermark embedding and reading systems is to ensure that the watermark is detectable even if the watermarked media content is corrupted in some fashion. The watermark may be corrupted intentionally, so as to bypass its copy protection or anti-counterfeiting functions, or unintentionally through various transformations that result from routine manipulation of the content (e.g., digital to analog conversion, geometric distortion, compression, etc.). In the case of watermarked images, such manipulation of the image may distort the watermark pattern embedded in the image. In general, the geometric distortion may result in some linear or non-linear geometric transformation. An affine transformation encompasses various linear transformations, including scale, translation, rotation, differential scale, and shear.
To accurately detect and read the watermark, it is helpful to determine the parameters of this affine transformation. The reader may then use these parameters to adjust the corrupted image to approximate its original state and then proceed to read the information content represented in the watermark.
Watermarks are often difficult to detect and read in corrupted media, particularly if the original un-marked media is not available to assist in the detection and reading process. Thus, there is a need to develop techniques for accurately detecting the presence and orientation of a watermark in corrupted media where the original media is not available.
In some applications, it is useful to determine whether a media signal, such as an audio, image or video signal, has been transformed, and if so, how it has been transformed. Methods capable of determining alteration of a signal are useful in a variety of applications, including forensics and encoding auxiliary messages in media. In some applications, there is a need to be able to restore a media signal to its original state in addition to detecting alteration.
The invention provides a method and system of determining a transformation of a media signal subsequent to the encoding of an embedded code signal into the media signal. It also provides a method and system to determine the orientation of the embedded code signal in a media signal after the media signal has been transformed. The invention applies to various types of media signals, including image, video and audio signals.
One aspect of the invention is a method of determining a transformation of a media signal having an embedded code signal. The method performs a logarithmic sampling of the media signal to create a sampled signal in which scaling of the media signal is converted to translation in the sampled signal. It then computes the translation of the embedded code signal in the sampled signal to determine scaling of the media signal subsequent to the encoding of the embedded signal in the media signal.
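The relationship between scaling and translation under logarithmic sampling can be illustrated with a minimal one-dimensional sketch. It is not the claimed method itself: the test signal `code_signal`, the grid ratio `RATIO`, the grid length, and the brute-force correlation search are all assumptions chosen for illustration, and the scale factor is deliberately picked so that the resulting shift is a whole number of samples.

```python
import math

def code_signal(x):
    # Hypothetical 1-D signal: two narrow bumps centred at x = 4 and x = 10.
    return math.exp(-((x - 4.0) / 0.3) ** 2) + math.exp(-((x - 10.0) / 0.3) ** 2)

def log_sample(f, x_min, ratio, n):
    # Sample f at geometrically spaced positions x_min * ratio**k.
    return [f(x_min * ratio ** k) for k in range(n)]

def best_shift(ref, sig, max_shift):
    # Integer shift d that maximises the correlation sum(ref[k] * sig[k + d]).
    def score(d):
        return sum(ref[k] * sig[k + d]
                   for k in range(len(ref)) if 0 <= k + d < len(sig))
    return max(range(-max_shift, max_shift + 1), key=score)

RATIO = 1.01                 # spacing ratio of the logarithmic grid (assumed)
true_scale = RATIO ** 20     # chosen so the shift is a whole number of samples

def scaled_signal(x):
    # The same signal after its axis has been stretched by true_scale.
    return code_signal(x / true_scale)

ref = log_sample(code_signal, 1.0, RATIO, 400)
sus = log_sample(scaled_signal, 1.0, RATIO, 400)

shift = best_shift(ref, sus, 50)      # translation in the sampled signal
estimated_scale = RATIO ** shift      # translation converted back to scaling
print(shift, estimated_scale)
```

Because the sample positions grow geometrically, stretching the axis by `RATIO ** d` moves every feature exactly `d` samples along the grid, so the scale is read off from the shift.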
The embedded code signal may be implemented in a variety of ways. In one implementation, the embedded code signal comprises a set of impulse functions in a frequency domain. In particular, the impulse functions may be in a Fourier domain, or some other transform domain such as wavelet, Discrete Cosine Transform, etc. For some applications, the impulse functions have random or pseudo-random phase. When the impulse functions have random phase, they tend to make the embedded code signal imperceptible or less perceptible. For instance, the embedded code signal may be an imperceptible or substantially imperceptible digital watermark in an image or audio signal.
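As a rough illustration of such a code signal, the sketch below builds a short one-dimensional signal from a few Fourier-domain impulses with pseudo-random phase and confirms that its spectrum contains impulses only at the chosen frequencies. The length `N`, the bins in `BINS`, the seed, and the naive DFT are assumptions made for this example. It also compares peak amplitude against a zero-phase version of the same impulses, which is one plausible way to see why random phase may help perceptibility.

```python
import cmath
import math
import random

N = 64                        # signal length (illustrative)
BINS = [5, 11, 17, 23]        # hypothetical impulse frequencies

def code_signal(phases):
    # A sum of cosines is proportional to the inverse DFT of
    # conjugate-symmetric impulses at +/-BINS with the given phases.
    return [sum(math.cos(2 * math.pi * f * n / N + p)
                for f, p in zip(BINS, phases)) for n in range(N)]

def dft_magnitude(x):
    # Naive O(N^2) DFT, adequate for a short illustration.
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * n / N)
                    for n, v in enumerate(x))) for k in range(N)]

rng = random.Random(1)        # fixed seed: pseudo-random, reproducible phases
random_phases = [rng.uniform(0, 2 * math.pi) for _ in BINS]

aligned = code_signal([0.0] * len(BINS))     # all impulses with zero phase
randomized = code_signal(random_phases)      # same impulses, random phase

mag = dft_magnitude(randomized)
peaks = sorted(k for k in range(N // 2) if mag[k] > N / 4)
print(peaks)
print(max(map(abs, aligned)), max(map(abs, randomized)))
```

With zero phase every cosine peaks at sample 0, so the amplitudes add up there; with random phase the peak can never exceed that value and is usually lower, while the magnitude spectrum is unchanged.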
Using the embedded code signal's phase attributes, a detection process can determine the position of the embedded code signal or the translation of the media signal in which it is embedded. For example, the detection process may be used to determine a shift, offset, or cropping of the media signal after it has been encoded with the embedded code signal. In particular, the detection process may perform phase matching between the code signal and a media signal suspected of containing an embedded code signal (a suspect signal). One form of phase matching is a matched filtering process between the code signal and the suspect media signal in the spatial or temporal domain. This process may be performed on one-dimensional signals such as audio signals, or on signals of two or more dimensions such as images and video.
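A matched filter of this kind can be sketched for a one-dimensional signal as a circular cross-correlation between the suspect signal and the known code signal. Everything specific here is an assumption for illustration: the pseudo-random +/-1 code, the sinusoidal stand-in for host content, the cyclic (wrap-around) shift, and an embedding gain that is far larger than an imperceptible watermark would use.

```python
import math
import random

N = 256
rng = random.Random(7)
code = [rng.choice((-1.0, 1.0)) for _ in range(N)]   # hypothetical code signal

def embed(host, code, offset, gain):
    # Add the code, cyclically shifted by `offset`, to the host samples.
    return [h + gain * code[(n - offset) % N] for n, h in enumerate(host)]

def matched_filter(suspect, code):
    # Circular cross-correlation of the suspect signal with the known code.
    return [sum(suspect[(n + d) % N] * code[n] for n in range(N))
            for d in range(N)]

host = [math.sin(2 * math.pi * 3 * n / N) for n in range(N)]  # stand-in content
suspect = embed(host, code, offset=37, gain=1.0)

response = matched_filter(suspect, code)
detected = max(range(N), key=response.__getitem__)   # lag of the strongest response
print(detected)
```

The filter response peaks at the lag where the code lines up with its embedded copy, which gives the translation of the suspect signal relative to the original encoding.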
The logarithmic sampling may be performed directly on the media signal or after it has been converted to a transform domain. For example, one implementation performs the sampling on frequency domain data of the media signal. Depending on the nature of the media signal and the application, the sampling may be performed in two or more dimensions. A two-dimensional signal, such as an image, may be logarithmically sampled in each of the two dimensions to determine scaling in each dimension. A three-dimensional signal, such as a video sequence, may be logarithmically sampled in three dimensions. After sampling, matched filtering, or other forms of filtering, may be used to determine the translation of the embedded code signal in the sampled signal in each of the dimensions. The extent of translation in the sampled signal corresponds to scaling in the media signal.
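For the two-dimensional case, a minimal sketch might log-sample each axis independently and search for the shift along each axis, recovering a different scale factor per dimension. The blob pattern, the ratio `RATIO`, the grid size, and the exhaustive correlation search stand in for whatever filtering a real detector would use, and the sampling is done in the spatial domain purely to keep the example short.

```python
import math

def pattern(x, y):
    # Hypothetical 2-D signal: two Gaussian blobs at (3, 5) and (8, 2.5).
    blobs = [(3.0, 5.0), (8.0, 2.5)]
    return sum(math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / 0.5)
               for cx, cy in blobs)

RATIO, N = 1.05, 64
GRID = [RATIO ** k for k in range(N)]     # logarithmic positions on each axis

def log_sample_2d(f):
    # Rows index the y axis, columns index the x axis.
    return [[f(x, y) for x in GRID] for y in GRID]

def best_shift_2d(ref, sig, max_shift):
    # Exhaustive search for the (dx, dy) shift with the highest correlation.
    def score(shift):
        dx, dy = shift
        return sum(ref[j][i] * sig[j + dy][i + dx]
                   for j in range(max(0, -dy), min(N, N - dy))
                   for i in range(max(0, -dx), min(N, N - dx)))
    shifts = [(dx, dy) for dx in range(-max_shift, max_shift + 1)
              for dy in range(-max_shift, max_shift + 1)]
    return max(shifts, key=score)

sx, sy = RATIO ** 4, RATIO ** -3          # differential scale to recover

ref = log_sample_2d(pattern)
sus = log_sample_2d(lambda x, y: pattern(x / sx, y / sy))

dx, dy = best_shift_2d(ref, sus, 6)
print(RATIO ** dx, RATIO ** dy)           # estimated scale along x and y
```

The shift along each axis of the sampled grid maps back to the scale along the corresponding axis of the signal, so a stretch in x and a shrink in y are recovered separately.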
Polar sampling may also be used to convert rotation of a media signal into translation in polar coordinates. Once converted in this manner, matched filtering may be used to determine translation of the embedded code signal in the sampled signal. The translation in polar coordinates provides the angle of rotation of the media signal subsequent to encoding of the embedded code signal.
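The conversion of rotation into translation can be sketched by sampling a two-dimensional signal along a ring at equally spaced angles. The blob directions, ring radius, number of angle samples, and circular correlation search below are assumptions for illustration, and the rotation angle is chosen to be a whole number of angle samples.

```python
import math

ANGLES_DEG = [20.0, 75.0, 210.0]   # hypothetical blob directions, irregular on purpose
RADIUS, M = 10.0, 360              # sampling ring radius and number of angle samples

def pattern(x, y):
    # Gaussian blobs placed on a circle of radius RADIUS.
    total = 0.0
    for a in ANGLES_DEG:
        cx = RADIUS * math.cos(math.radians(a))
        cy = RADIUS * math.sin(math.radians(a))
        total += math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / 2.0)
    return total

def rotated(f, angle_deg):
    # Return f rotated counter-clockwise by angle_deg about the origin.
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return lambda x, y: f(c * x + s * y, -s * x + c * y)

def polar_sample(f):
    # Sample f on the ring at M equally spaced angles.
    return [f(RADIUS * math.cos(2 * math.pi * k / M),
              RADIUS * math.sin(2 * math.pi * k / M)) for k in range(M)]

def circular_shift(ref, sig):
    # Shift d maximising the circular correlation sum(ref[k] * sig[k + d]).
    return max(range(M),
               key=lambda d: sum(ref[k] * sig[(k + d) % M] for k in range(M)))

ref = polar_sample(pattern)
sus = polar_sample(rotated(pattern, 33.0))

estimated_angle = circular_shift(ref, sus) * 360.0 / M
print(estimated_angle)
```

Rotating the signal slides its ring samples around the circle, so the lag of the correlation peak, multiplied by the angular step, gives the rotation angle.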
Logarithmic sampling may also be performed in combination with polar sampling. The logarithmic or polar sampling may be performed on the media signal directly (e.g., in its native spatial or temporal domain) or on frequency domain or other transform domain data of the media signal. Similarly, the embedded code signal, or components of it, may be defined in the spatial or frequency domain, or in a transform domain. One example of an embedded code signal is a watermark signal with fixed attributes that can be located via matched filtering in the sampled media signal.
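Combining the two samplings gives a log-polar grid in which uniform scaling becomes a shift along the radius axis and rotation becomes a circular shift along the angle axis. The sketch below is one possible illustration under assumed values: the blob positions in `BLOBS`, the ratio `RATIO`, the grid dimensions, and the exhaustive two-dimensional search are hypothetical, and the sampling is done in the spatial domain for brevity.

```python
import math

BLOBS = [(4.0, 30.0), (7.0, 140.0), (5.0, 260.0)]   # hypothetical (radius, angle in degrees)
RATIO, N_RHO, N_THETA = 1.06, 48, 36                # radius ratio and grid size (assumed)

def pattern(x, y):
    total = 0.0
    for r, a in BLOBS:
        cx, cy = r * math.cos(math.radians(a)), r * math.sin(math.radians(a))
        total += math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / 2.0)
    return total

def transformed(f, scale, angle_deg):
    # f after counter-clockwise rotation by angle_deg and uniform scaling by scale.
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return lambda x, y: f((c * x + s * y) / scale, (-s * x + c * y) / scale)

def log_polar_sample(f):
    # Rows: geometrically spaced radii. Columns: equally spaced angles.
    return [[f(RATIO ** i * math.cos(2 * math.pi * j / N_THETA),
               RATIO ** i * math.sin(2 * math.pi * j / N_THETA))
             for j in range(N_THETA)] for i in range(N_RHO)]

def best_shift(ref, sig, max_rho_shift):
    # Radius shifts are bounded; angle shifts wrap around the circle.
    def score(shift):
        di, dj = shift
        return sum(ref[i][j] * sig[i + di][(j + dj) % N_THETA]
                   for i in range(max(0, -di), min(N_RHO, N_RHO - di))
                   for j in range(N_THETA))
    shifts = [(di, dj) for di in range(-max_rho_shift, max_rho_shift + 1)
              for dj in range(N_THETA)]
    return max(shifts, key=score)

ref = log_polar_sample(pattern)
sus = log_polar_sample(transformed(pattern, RATIO ** 2, 30.0))

di, dj = best_shift(ref, sus, 4)
estimated_scale = RATIO ** di
estimated_angle = dj * 360.0 / N_THETA
print(estimated_scale, estimated_angle)
```

A single two-dimensional correlation search on the log-polar grid therefore yields both parameters at once: the radius shift gives the scale and the angle shift gives the rotation.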
Further advantages and features of the invention will become apparent with reference to the following detailed description and accompanying drawings.