The present invention relates to the detection of one or more watermarks embedded in frames of a moving image and, more particularly, the present invention relates to methods and/or apparatuses for detecting a watermark that are resistant to arbitrary deformation of the watermark.
It is desirable for the publishers of content data, such as movies, video, music, software, and combinations thereof, to prevent or deter the pirating of the content data. The use of watermarks has become a popular way of thwarting pirates. A watermark is a set of data containing a hidden message that is embedded in the content data and stored with the content data on a storage medium, such as film, a digital video disc (DVD), a compact disc (CD), a read only memory (ROM), a random access memory (RAM), magnetic media, etc. The hidden message of the “embedded watermark” is typically a copy control message, such as “do not copy” or “copy only once.”
In the movie industry, the hidden message of the watermark may be an identifier of a particular location (e.g., theater) at which a movie is shown. If the management of the theater knowingly or unknowingly permits a pirate to record the movie, the identity of that theater may be obtained by detecting the hidden message of the watermark embedded in a pirated copy of the movie. Corrective action may then be taken.
With respect to watermark detection, when a quantum of data comprising the content data and the embedded watermark is correlated with a reference watermark, a determination can be made as to whether the embedded watermark is substantially similar to, or the same as, the reference watermark. If a high correlation exists, then it may be assumed that the message of the embedded watermark corresponds to a message of the reference watermark. For example, the quantum of data may be a frame of data, such as video data, in which pixel data of the frame of video data has been embedded with a watermark (“the embedded watermark”). Assuming that the frame of data has not been distorted in some way, when a reference watermark that is substantially the same as the embedded watermark is correlated with the frame of video data, a relatively high output is obtained. This is so because a one-for-one correspondence (or registration) between the data of the embedded watermark and the data of the reference watermark will tend to increase a correlation computation. Conversely, if the embedded watermark contained in the frame of video data has been altered in a way that reduces the one-for-one correspondence between the embedded watermark and the reference watermark, the correlation will yield a relatively low result.
Often, the correlation computation involves performing a sum of products of the data contained in the frame of data and the data of the reference watermark. Assuming that the frame of data and the reference watermark include both positive values and negative values, the sum of products will be relatively high when the data of the embedded watermark aligns, one-for-one, with the data of the reference watermark. Conversely, the sum of products will be relatively low when the data of the embedded watermark does not align with the reference watermark.
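By way of illustration only, the sum-of-products computation may be sketched as follows. The frame size, the plus/minus one reference pattern, the Gaussian stand-in for content data, the cyclic shift used to model misalignment, and the names `correlate` and `shift_right` are all assumptions made for this sketch and are not taken from any particular detector.

```python
import random

def correlate(frame, reference):
    """Sum of products of two equally sized 2-D lists of numbers."""
    return sum(f * r
               for frame_row, ref_row in zip(frame, reference)
               for f, r in zip(frame_row, ref_row))

def shift_right(frame, k):
    """Cyclically shift every row k positions to the right (models misalignment)."""
    return [row[-k:] + row[:-k] for row in frame]

rng = random.Random(0)
H, W = 32, 32

# Hypothetical reference watermark with positive and negative values.
reference = [[rng.choice((-1, 1)) for _ in range(W)] for _ in range(H)]

# Hypothetical content data; the embedded watermark is simply added to it.
content = [[rng.gauss(0, 1) for _ in range(W)] for _ in range(H)]
frame = [[c + w for c, w in zip(c_row, w_row)]
         for c_row, w_row in zip(content, reference)]

aligned = correlate(frame, reference)                     # one-for-one registration
misaligned = correlate(shift_right(frame, 3), reference)  # registration lost

print(f"aligned: {aligned:.1f}  misaligned: {misaligned:.1f}")
```

In this sketch the aligned sum is dominated by the 1024 products of each watermark value with itself, whereas the misaligned sum consists of products that tend to cancel.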
A data detector, such as a standard correlation detector or matched filter, may be used to detect the presence of an embedded watermark in a frame of content data, such as video data, audio data, etc. The original or reference position of the embedded watermark is implicitly determined by the design of the hardware and/or software associated with the detector. These types of correlation detectors are dependent upon specific registration (i.e., alignment) of the embedded watermark and the reference watermark.
Pirates seeking to wrongfully copy content data containing an embedded watermark (e.g., one that proscribes copying via a hidden message: “do not copy”) can bypass the embedded watermark by distorting the registration (or alignment) between the embedded watermark and the reference watermark. By way of example, a frame of content data containing an embedded watermark may be slightly rotated, resized, and/or translated from an expected position to a position that would prevent a one-for-one correspondence (perfect registration) between the embedded watermark and the reference watermark. Editing and copying equipment may be employed to achieve such distortion.
An embedded watermark contained in a pirated copy of a movie may also have been distorted. A pirate may intentionally distort the embedded watermark as discussed above, or the distortion may occur unintentionally during the recording process at a theater. For example, if the pirated copy was recorded using a video camera, several factors can cause distortion, including (i) shaking of the video camera (especially if it is handheld); (ii) misalignment of the video camera with the projected movie (e.g., when the video camera is on a tripod); (iii) lens distortion in the video camera (intentional and/or non-intentional); and (iv) projection screen abnormalities (e.g., curvature).
Further, inadvertent distortion of the embedded watermark may occur during the normal processing of the content data (containing an embedded watermark) in a computer system or consumer device. For example, the content data (and embedded watermark) of a DVD may be inadvertently distorted while undergoing a formatting process, e.g., that converts the content data from the European PAL TV system to the US NTSC TV system, or vice versa. Alternatively, the content data and embedded watermark may be distorted through other types of formatting processes, such as changing the format from a wide-screen movie format to a television format. Indeed, such processing may inadvertently resize, rotate, and/or translate the content data and, by extension, the embedded watermark, rendering the embedded watermark difficult to detect.
Different types of watermark systems exist that purport to be robust to resizing and translation. One such type of watermark system typically embeds the watermark in a way that is mathematically invariant to resizing and translation. The detector used in this type of system does not have to adjust to changes in the position and/or size of the embedded watermark. Such a system is typically based on Fourier-Mellin transforms and log-polar coordinates. One drawback of such a system is that it requires complex mathematics and a particularly structured embedded watermark pattern and detector. This system cannot be used with pre-existing watermarking systems.
Another type of prior art watermark system uses repetitive watermark blocks, wherein all embedded watermark blocks are identical. The watermark block in this type of system is typically large and designed to carry the entire copy-control message. The repetition of the same block makes it possible to estimate any resizing of the embedded watermark by correlating different portions of the watermarked image and finding the spacing between certain positions. The resizing is then inverted and the reference block is correlated with the adjusted image to find the embedded watermark and its position simultaneously. An example of this system is the Philips VIVA/JAWS+ watermarking system. A disadvantage of such a system is that the design of the embedded watermark must be spatially periodic, which does not always occur in an arbitrary watermarking system.
Yet another type of watermarking system includes an embedded template or helper pattern along with the embedded watermark in the content data. The detector is designed to recognize the reference location, size and shape of the template. The detector attempts to detect the template and then uses the detected position of the template to estimate the actual location and size of the embedded watermark. The system then reverses any geometric alterations so that the correlation detector can detect and interpret the embedded watermark. This system is disadvantageous, however, since the templates tend to be fragile and easily attacked.
Accordingly, there is a need in the art for a new method and/or system for detecting an embedded watermark in one or more frames of data that is robust despite arbitrary distortion, e.g., rotation, resizing, translation, and/or deformations.
In accordance with one or more aspects of the invention, a method and/or apparatus is capable of detecting a watermark among a plurality of reproduced frames of data, the reproduced frames of data having been derived from respective original frames of data. The method and/or apparatus includes: adding at least some of the reproduced frames of data together on a data point-by-data point basis to obtain an aggregate frame of data points; selecting peak data points of the aggregate frame of data points; computing correction information from deviations between the positions of the peak data points within the aggregate frame and expected positions of those peak data points; modifying positions of at least some of the data of at least some of the reproduced frames of data using the correction information such that those reproduced frames of data more closely coincide with respective ones of the original frames of data; and detecting the watermark from among the modified reproduced frames of data.
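By way of illustration only, the adding, selecting, computing, and modifying steps may be sketched as follows for the simple case in which the distortion is a pure translation. The frame size, the number of frames, the marker amplitude and positions, the (2, -1) shift, and the helper names `add_frames`, `select_peaks`, `mean_deviation`, and `translate` are hypothetical choices made for this sketch; the final detection step (e.g., a correlation detector applied to the corrected frames) is omitted.

```python
import random

def add_frames(frames):
    """Add frames together on a data point-by-data point basis."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) for x in range(w)] for y in range(h)]

def select_peaks(aggregate, count):
    """Return the (row, col) positions of the `count` largest aggregate values."""
    cells = [(v, (y, x)) for y, row in enumerate(aggregate) for x, v in enumerate(row)]
    cells.sort(reverse=True)
    return [pos for _, pos in cells[:count]]

def mean_deviation(peaks, expected):
    """Average offset from each peak to its nearest expected position."""
    dys, dxs = [], []
    for py, px in peaks:
        ey, ex = min(expected, key=lambda e: (e[0] - py) ** 2 + (e[1] - px) ** 2)
        dys.append(py - ey)
        dxs.append(px - ex)
    return round(sum(dys) / len(dys)), round(sum(dxs) / len(dxs))

def translate(frame, dy, dx, fill=0.0):
    """Move frame content by (dy, dx); cells left uncovered receive `fill`."""
    h, w = len(frame), len(frame[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ty, tx = y + dy, x + dx
            if 0 <= ty < h and 0 <= tx < w:
                out[ty][tx] = frame[y][x]
    return out

rng = random.Random(1)
H = W = 24
M = 40                                             # assumed number of frames
markers = [(6, 6), (6, 18), (18, 6), (18, 18)]     # assumed expected positions

def make_original():
    frame = [[rng.gauss(0, 1) for _ in range(W)] for _ in range(H)]
    for y, x in markers:
        frame[y][x] += 2.0                         # weak marker in every frame
    return frame

originals = [make_original() for _ in range(M)]
reproduced = [translate(f, 2, -1) for f in originals]   # simulated distortion

aggregate = add_frames(reproduced)
peaks = select_peaks(aggregate, len(markers))
deviation = mean_deviation(peaks, markers)              # correction information
corrected = [translate(f, -deviation[0], -deviation[1]) for f in reproduced]
print(deviation)
```

Because the content values tend to cancel when summed while the markers reinforce one another, the peaks stand out in the aggregate frame even though each marker is weak within any single frame.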
The marker data points within each of the original frames of data are located at substantially the same relative positions and the reproduced marker data points within each of the reproduced frames of data are located at substantially the same relative positions. Preferably, the set of marker data points is arranged in a grid. Each peak data point of the aggregate frame of data points corresponds to a sum of the reproduced marker data points that are located at substantially the same relative position within respective ones of at least some of the M reproduced frames of data. The expected positions of the peak data points within the aggregate frame of data points are the corresponding positions of the marker data points within the original frames of data.
Preferably, the method and/or apparatus further includes: grouping the peak data points and their associated reproduced marker data points and marker data points into respective sets of three or more; comparing the respective positions of the peak data points of each set with the positions of the associated set of marker data points; computing respective sets of correction information based on the comparison of the sets of peak data points and marker data points, each set of correction information corresponding to a respective area within each of the reproduced frames of data circumscribed by the reproduced marker data points associated with the peak data points of the set of correction information; and modifying the positions of the data in each of the respective areas of at least one of the reproduced frames of data in accordance with the associated sets of correction information.
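By way of illustration only, one possible form of the correction information for a set of exactly three points is an affine transformation that carries each peak position to its associated marker position. The affine model, the Cramer's rule solver, the simulated distortion (a small rotation, a scale of 1.1, and a shift), and the names `affine_from_triangle` and `apply_affine` are assumptions made for this sketch; other forms of correction information may equally be used.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the three points are collinear")
    solution = []
    for col in range(3):
        replaced = [[rhs[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution

def affine_from_triangle(peaks, markers):
    """Affine map carrying each peak (u, v) to its associated marker (x, y)."""
    m = [[u, v, 1.0] for u, v in peaks]
    coeffs_x = solve3(m, [x for x, _ in markers])
    coeffs_y = solve3(m, [y for _, y in markers])
    return coeffs_x, coeffs_y

def apply_affine(coeffs, point):
    """Apply the correction to one position inside the triangular area."""
    (a, b, c), (d, e, f) = coeffs
    u, v = point
    return a * u + b * v + c, d * u + e * v + f

def distort(p, angle=0.05, scale=1.1, shift=(2.0, -1.0)):
    """Simulated distortion, used here only to generate test positions."""
    x, y = p
    c, s = math.cos(angle), math.sin(angle)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])

markers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]   # expected positions
peaks = [distort(m) for m in markers]              # observed peak positions
correction = affine_from_triangle(peaks, markers)

inside = (3.0, 4.0)                                # a data point within the area
restored = apply_affine(correction, distort(inside))
print(restored)
```

Computing a separate correction for each triangular area allows a deformation that varies across the frame to be approximated piecewise, which a single frame-wide correction cannot do.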
In accordance with at least one further aspect of the present invention, a method and/or apparatus is capable of detecting a watermark among a plurality of reproduced frames of data, the reproduced frames of data having been derived from respective original frames of data, N of the reproduced frames of data each including a plurality of reproduced blocks of noise data corresponding with blocks of noise data distributed within N of the original frames of data. The method and/or apparatus includes: deriving peak data points from the reproduced blocks of noise data of the N reproduced frames of data, the peak data points being positioned within an aggregate frame of data points; computing correction information from deviations between the positions of the peak data points within the aggregate frame and expected positions of those peak data points; modifying positions of at least some of the data of at least some of the reproduced frames of data using the correction information such that those reproduced frames of data more closely coincide with respective ones of the original frames of data; and detecting the watermark from among the modified reproduced frames of data.
The step of deriving the peak data points preferably includes: selecting one of the noise data of one of the blocks of noise data of an i-th one of the N frames of the original frames of data, where i=1, 2, . . . N; multiplying the data of an i-th one of the N frames of the reproduced frames of data by the selected one of the noise data to produce an i-th modified reproduced frame of data; and summing the modified reproduced frames of data on a point-by-point basis to obtain the aggregate frame of data points, wherein the peak data points are those having substantially higher magnitudes than other data points of the aggregate frame of data.
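By way of illustration only, the selecting, multiplying, and summing steps may be sketched as follows. The sketch assumes, hypothetically, that each original frame tiles one block of plus/minus one noise data, that the block differs from frame to frame, that the selected noise datum is the one at a fixed position `SEL` within the block, and that the distortion is a cyclic shift of one column; the name `derive_aggregate` and the threshold of N/2 are likewise assumptions.

```python
import random

def derive_aggregate(reproduced, selected):
    """Multiply the i-th reproduced frame by the i-th selected noise datum,
    then sum the modified frames on a point-by-point basis."""
    h, w = len(reproduced[0]), len(reproduced[0][0])
    aggregate = [[0.0] * w for _ in range(h)]
    for frame, s in zip(reproduced, selected):
        for y in range(h):
            for x in range(w):
                aggregate[y][x] += s * frame[y][x]
    return aggregate

rng = random.Random(7)
N, B, SIZE = 200, 4, 8      # frames, block size, frame size (all assumed)
SEL = (1, 2)                # assumed position of the selected datum in a block

reproduced, selected = [], []
for _ in range(N):
    # A fresh block of noise data for this frame, tiled across the frame.
    block = [[rng.choice((-1.0, 1.0)) for _ in range(B)] for _ in range(B)]
    original = [[block[y % B][x % B] + rng.gauss(0, 0.5) for x in range(SIZE)]
                for y in range(SIZE)]
    # Simulated distortion: every row shifted one column to the right.
    reproduced.append([row[-1:] + row[:-1] for row in original])
    selected.append(block[SEL[0]][SEL[1]])

aggregate = derive_aggregate(reproduced, selected)

# Peak data points: magnitudes substantially higher than the rest.
peaks = sorted((y, x) for y in range(SIZE) for x in range(SIZE)
               if abs(aggregate[y][x]) > N / 2)
print(peaks)
```

At the positions where the selected noise datum appears in the reproduced frames, each product is the square of that datum and the sum grows with N; elsewhere the products have random signs and tend to cancel. In this sketch the peaks therefore appear one column to the right of the expected positions (1, 2), (1, 6), (5, 2), and (5, 6), and that deviation is what the correction information is computed from.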
The method preferably further includes: grouping into respective sets of three or more: (i) the peak data points, (ii) the reproduced data points of the N reproduced frames of data at relative positions corresponding to the peak data points, and (iii) the associated selected noise data points; comparing the respective positions of the peak data points of each set with the positions of the associated set of selected noise data points; computing respective sets of correction information based on the comparison of the sets of peak data points and noise data points, each set of correction information corresponding to a respective area within each of the reproduced frames of data circumscribed by the reproduced data points of the N reproduced frames of data at relative positions corresponding to the peak data points; and modifying the positions of the data in each of the respective areas of at least one of the reproduced frames of data in accordance with the associated sets of correction information.