The present invention relates to the detection of one or more watermarks embedded in frames of a moving image and, more particularly, the present invention relates to methods and/or apparatuses for detecting a watermark that are resistant to arbitrary temporal frame deformation and frame rate conversion.
It is desirable to the publishers of content data, such as movies, video, music, software, and combinations thereof, to prevent or deter the pirating of the content data. The use of watermarks has become a popular way of thwarting pirates. A watermark is a set of data containing a hidden message that is embedded in the content data and stored with the content data on a storage medium, such as film, a digital video disc (DVD), a compact disc (CD), a read only memory (ROM), a random access memory (RAM), magnetic media, etc. The hidden message of the “embedded watermark” is typically a copy control message, such as “do not copy” or “copy only once.”
In the movie industry, the hidden message of the watermark may be an identifier of a particular location (e.g., theater) at which a movie is shown. If the management of the theater knowingly or unknowingly permits a pirate to record the movie, the identity of that theater may be obtained by detecting the hidden message of the watermark embedded in a pirated copy of the movie. Corrective action may then be taken.
With respect to watermark detection, when a quantum of data comprising the content data and the embedded watermark is correlated with a reference watermark, a determination can be made as to whether the embedded watermark is substantially similar to, or the same as, the reference watermark. If a high correlation exists, then it may be assumed that the message of the embedded watermark corresponds to a message of the reference watermark. For example, the quantum of data may be a frame of data, such as video data, in which pixel data of the frame of video data has been embedded with a watermark (“the embedded watermark”). Assuming that the frame of data has not been distorted in some way, when a reference watermark that is substantially the same as the embedded watermark is correlated with the frame of video data, a relatively high output is obtained. This is so because a one-for-one correspondence (or registration) between the data of the embedded watermark and the data of the reference watermark will tend to increase the result of the correlation computation. Conversely, if the embedded watermark contained in the frame of video data has been altered in a way that reduces the one-for-one correspondence between the embedded watermark and the reference watermark, the correlation will yield a relatively low result.
Often, the correlation computation involves performing a sum of products of the data contained in the frame of data and the data of the reference watermark. Assuming that the frame of data and the reference watermark include both positive values and negative values, the sum of products will be relatively high when the data of the embedded watermark aligns, one-for-one, with the data of the reference watermark. Conversely, the sum of products will be relatively low when the data of the embedded watermark does not align with the reference watermark.
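The sum-of-products computation described above can be illustrated with a minimal sketch. The ±1 reference pattern, the Gaussian stand-in for the content data, the additive embedding, and the names correlate, reference, content, watermarked and misaligned are all assumptions made for illustration and are not taken from any particular detector.

```python
import random


def correlate(frame, reference):
    """Sum of products of two equal-length sequences of signed values."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must have the same length")
    return sum(f * r for f, r in zip(frame, reference))


rng = random.Random(0)

# Hypothetical reference watermark: a pseudo-random pattern of +1 and -1.
reference = [rng.choice((-1, 1)) for _ in range(1024)]

# Hypothetical zero-mean content data; the watermark is simply added to it
# (additive embedding is an assumption of this sketch).
content = [rng.gauss(0.0, 2.0) for _ in range(1024)]
watermarked = [c + w for c, w in zip(content, reference)]

# The same data shifted by one sample, so one-for-one registration is lost.
misaligned = watermarked[1:] + watermarked[:1]

aligned_score = correlate(watermarked, reference)
misaligned_score = correlate(misaligned, reference)
```

In this sketch each aligned watermark sample contributes +1 to the sum, so aligned_score is dominated by the pattern length, whereas the products in misaligned_score carry random signs and largely cancel.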
A data detector, such as a standard correlation detector or matched filter, may be used to detect the presence of an embedded watermark in a frame of content data, such as video data, audio data, etc. The original or reference position of the embedded watermark is implicitly determined by the design of the hardware and/or software associated with the detector. These types of correlation detectors are dependent upon specific registration (i.e., alignment) of the embedded watermark and the reference watermark.
Pirates seeking to wrongfully copy content data containing an embedded watermark (e.g., one that proscribes copying via a hidden message: “do not copy”) can bypass the embedded watermark by distorting the registration (or alignment) between the embedded watermark and the reference watermark. By way of example, a frame of content data containing an embedded watermark may be slightly rotated, resized, and/or translated from an expected position to a position that would prevent a one-for-one correspondence (perfect registration) between the embedded watermark and the reference watermark. Editing and copying equipment may be employed to achieve such distortion.
An embedded watermark contained in a pirated copy of a movie may also have been distorted. A pirate may intentionally distort the embedded watermark as discussed above or the distortion may unintentionally occur during the recording process at a theater. For example, if the pirated copy was recorded using a video camera, several factors can cause distortion, including (i) shaking of the video camera (especially if it is handheld); (ii) misalignment of the video camera with the projected movie (e.g., when the video camera is on a tripod); (iii) lens distortion in the video camera (intentional and/or non-intentional); and (iv) projection screen abnormalities (e.g., curvature).
Further, inadvertent distortion of the embedded watermark may occur during the normal processing of the content data (containing an embedded watermark) in a computer system or consumer device. For example, the content data (and embedded watermark) of a DVD may be inadvertently distorted while undergoing a formatting process, e.g., that converts the content data from the European PAL TV system to the US NTSC TV system, or vice versa. Alternatively, the content data and embedded watermark may be distorted through other types of formatting processes, such as changing the format from a wide-screen movie format to a television format. Indeed, such processing may inadvertently resize, rotate, and/or translate the content data and, by extension, the embedded watermark, rendering the embedded watermark difficult to detect. Such editing may also create temporal distortions in movie frames, and may require frame rate conversion or frame compression that distorts or destroys watermark information.
Different types of watermark systems exist that purport to be robust to resizing and translation. One such type of watermark system typically embeds the watermark in a way that is mathematically invariant to resizing and translation. The detector used in this type of system does not have to adjust to changes in the position and/or size of the embedded watermark. Such a system is typically based on Fourier-Mellin transforms and log-polar coordinates. One drawback of such a system is that it requires complex mathematics and a particularly structured embedded watermark pattern and detector. This system cannot be used with pre-existing watermarking systems.
Another type of prior art watermark system uses repetitive watermark blocks, wherein all embedded watermark blocks are identical. The watermark block in this type of system is typically large and designed to carry the entire copy-control message. The repetition of the same block makes it possible to estimate any resizing of the embedded watermark by correlating different portions of the watermarked image and finding the spacing between certain positions. The resizing is then inverted and the reference block is correlated with the adjusted image to find the embedded watermark and its position simultaneously. An example of this system is the Philips VIVA/JAWS+ watermarking system. A disadvantage of such a system is that the design of the embedded watermark must be spatially periodic, which does not always occur in an arbitrary watermarking system.
Yet another type of watermarking system includes an embedded template or helper pattern along with the embedded watermark in the content data. The detector is designed to recognize the reference location, size and shape of the template. The detector attempts to detect the template and then uses the detected position of the template to estimate the actual location and size of the embedded watermark. The system then reverses any geometric alterations so that the correlation detector can detect and interpret the embedded watermark. This system is disadvantageous, however, since the templates tend to be fragile and easily attacked.
In the present inventor's U.S. Pat. No. 6,563,937, entitled METHOD AND APPARATUS TO DETECT WATERMARK THAT ARE RESISTANT TO ARBITRARY DEFORMATIONS, assigned to the assignee of the present application and hereby incorporated by reference in its entirety, a method to use temporally-varying watermark patterns to detect and estimate geometric deformations in digital video is described. That temporally-varying watermarking system overcomes the effect of many deformations that can disadvantageously defeat the detection of previous types of watermarks in video. Overcoming the effect of video deformations is especially important for watermarks designed to defeat or track illegal copying.
The temporally-varying embedder embeds a rectangular array of substantially identical noise blocks into each frame. Within each frame, the repeated noise block pattern is identical, but the pattern typically varies from frame to frame. Thus, for each sequence of N noise blocks, each noise block is repeated in one of N frames, after which the sequence of N noise blocks repeats in the next N frames. Generally, the sequence of noise blocks used in the frames repeats over the entire video. Also, all the noise blocks have the same size, and are aligned, tiled, and/or overlaid over all the frames.
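The tiling arrangement described above may be sketched as follows. This is only an illustration under stated assumptions: the function names tile_block and watermark_pattern, the 2x2 block size, and the three example blocks are hypothetical, and the sketch shows only how the pattern for a given frame is laid out, not how it is combined with the pixel data.

```python
def tile_block(block, height, width):
    """Repeat one 2-D noise block so that it covers a height x width frame."""
    bh, bw = len(block), len(block[0])
    return [[block[y % bh][x % bw] for x in range(width)] for y in range(height)]


def watermark_pattern(blocks, frame_index, height, width):
    """Pattern for one frame: the block selected cyclically from the
    sequence of N blocks, tiled over the whole frame."""
    return tile_block(blocks[frame_index % len(blocks)], height, width)


# Hypothetical sequence of N = 3 noise blocks, each 2 x 2.
blocks = [
    [[1, -1], [-1, 1]],
    [[-1, -1], [1, 1]],
    [[1, 1], [-1, -1]],
]

pattern_frame0 = watermark_pattern(blocks, 0, 4, 4)
pattern_frame1 = watermark_pattern(blocks, 1, 4, 4)
pattern_frame3 = watermark_pattern(blocks, 3, 4, 4)  # same block as frame 0
```

Because the block index is taken modulo N, frame 3 in this example reuses the block of frame 0, which corresponds to the sequence of N noise blocks repeating over the video.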
One location is typically selected to be the center of all the noise blocks. At this location, a sequence of values is created. If the temporal correlation of this sequence is computed with respect to an aligned set of watermarked frames, a frame with bright spots at the centers of all the noise blocks in the watermarked frames will advantageously be derived.
A temporal correlator is a specialized temporal filter: for each pixel position in each frame, a linear weighted combination of the values in that pixel position in neighboring frames is computed. Since this is typically done for every pixel position in the frame, the corresponding linear combination of the neighboring video frames is advantageously computed pixel by pixel.
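A minimal sketch of such a temporal correlator is given below. It assumes that each frame is held as a flat list of pixel values and that the coefficient sequence is the sequence of values at the selected center location; the function name temporal_correlate and the tiny three-pixel frames are hypothetical and were chosen so that the result can be checked by hand.

```python
def temporal_correlate(frames, coeffs):
    """Pixel-by-pixel weighted sum of len(coeffs) consecutive frames.

    frames: list of frames, each a flat list of pixel values.
    coeffs: one weight per frame.
    """
    if len(frames) != len(coeffs):
        raise ValueError("need exactly one coefficient per frame")
    out = [0.0] * len(frames[0])
    for c, frame in zip(coeffs, frames):
        for p, value in enumerate(frame):
            out[p] += c * value
    return out


# Hypothetical coefficient sequence: the values taken over time at the
# selected center location (pixel 1 in this toy example).
coeffs = [1, 1, 1, -1]

# Four frames of three pixels each. Pixel 1 follows coeffs over time;
# pixels 0 and 2 follow sequences that are uncorrelated with coeffs.
frames = [
    [1, 1, -1],
    [-1, 1, 1],
    [1, 1, 1],
    [1, -1, 1],
]

combined = temporal_correlate(frames, coeffs)
```

In this example the combined frame is large only at pixel 1, which plays the role of the bright spot at the center of a noise block.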
In the case that it is known that the embedded watermark and the temporal sequence of coefficients are synchronized, so that frame 1 corresponds to coefficient 1, frame 2 to coefficient 2, etc., and that the sequence of embedded watermark patterns repeats over time, then computation time in the detector and storage space in the detector are advantageously reduced.
Similarly, in the case that the temporal sequence of watermark patterns consists of N frames, and therefore that the coefficient sequence has length N, then the input video is advantageously divided into blocks of N frames. The correct linear combination of the frames in each block can then be computed to produce one output frame for the block. Advantageously, buffering of input frames is not needed, because of the block-by-block processing. Rather than buffering, the weighted frames are accumulated into one output frame buffer. This frame buffer is reset to all zeros at the start of each block of frames.
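The block-by-block accumulation described above may be sketched as follows, assuming that frames arrive one at a time as flat lists of pixel values and that the watermark sequence and the coefficient sequence are already synchronized. The generator name correlate_blocks and the example values are hypothetical.

```python
def correlate_blocks(frame_stream, coeffs):
    """Yield one combined frame per block of len(coeffs) input frames.

    Input frames are never buffered: each frame is weighted and added to a
    single accumulator, which is reset to zeros at the start of each block.
    A trailing partial block is discarded.
    """
    n = len(coeffs)
    acc = []
    for i, frame in enumerate(frame_stream):
        k = i % n
        if k == 0:
            acc = [0.0] * len(frame)  # reset the output frame buffer
        for p, value in enumerate(frame):
            acc[p] += coeffs[k] * value
        if k == n - 1:
            yield acc


# Hypothetical stream of four two-pixel frames, processed in blocks of N = 2.
stream = iter([[1, 0], [2, 0], [3, 1], [4, 1]])
outputs = list(correlate_blocks(stream, [1, -1]))
```

Only one frame-sized accumulator is alive at any time in this sketch, regardless of N, which is the storage saving that the block-of-frames method provides.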
However, this process is rendered more complex if it is not known whether the sequence of watermark patterns and the sequence of coefficients are synchronized.
Furthermore, if there is an unknown offset shift between the start of the sequence of coefficients and the sequence of watermark patterns, the combination of N frames for each input frame should be computed until a frame with bright spots is found. Only then is the offset shift known, and only then can the block-of-frames method described above be employed.
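The search for an unknown offset can be sketched as a sliding window over the input frames. The sketch assumes flat lists of pixel values, an unchanged frame rate, and a simple fixed threshold as the test for a bright spot; the function name find_offset, the threshold value, and the example data are hypothetical.

```python
def find_offset(frames, coeffs, threshold):
    """Return the first start index at which the weighted combination of
    len(coeffs) consecutive frames contains a value >= threshold, or None."""
    n = len(coeffs)
    num_pixels = len(frames[0])
    for start in range(len(frames) - n + 1):
        window = frames[start:start + n]
        combined = [
            sum(c * frame[p] for c, frame in zip(coeffs, window))
            for p in range(num_pixels)
        ]
        if max(combined) >= threshold:
            return start
    return None


coeffs = [1, 1, 1, -1]

# Hypothetical video whose first frame carries the third element of the
# sequence, i.e. the pattern sequence is shifted by 2 relative to coeffs.
# Pixel 0 carries the watermark value; pixel 1 carries nothing.
shift = 2
frames = [[coeffs[(t + shift) % len(coeffs)], 0.0] for t in range(12)]

offset = find_offset(frames, coeffs, threshold=3)
```

Here the windows starting at frames 0 and 1 combine to zero, and the window starting at frame 2 produces the peak, so the block-of-frames method could be applied from that frame onward. Unlike that method, this search must hold N frames at once for each candidate start.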
If, in addition, the frame rate of the video has been changed, then the temporal correlation will not work at all. Nor will the process for determining the offset shift via the combination of N frames for each input frame typically be effective. Thus, an effective solution for recovering watermarks from frame rate conversion and temporal shifts is needed.