1. Field of the Invention
The present invention relates to a system and method for fingerprinting a video and subsequently using that fingerprint to identify copies of original video files.
2. Related Art
With the growth in both bandwidth and peer-to-peer applications, there has been an explosion in the number of movie files being copied across the Internet. Some sources claim that up to sixty percent of traffic is now made up of peer-to-peer traffic, and a significant amount of this is movie files, which have been illegally copied from authentic sources. As a result, copyright owners are losing a large amount of money through lost royalty payments. Legal action against the peer-to-peer network owners does little to discourage copyright infringement since if one network is closed down then another one appears.
Conventionally, there are four sources of illegal movie downloads available on peer-to-peer systems. The first is a recording made with a hand-held video camera that is taken into a movie theatre and used to record the movie as it is projected onto the screen. “Star Wars Episode I: The Phantom Menace” was copied in this manner and was available on the Internet within one week of the US box office release. These recordings typically have the lowest audio and visual quality. Alternatively, movies may be recorded from broadcast television. These recordings typically already include visual watermarks in the form of television station logos. This type of video file can vary in quality since it may have been recorded from a noisy analogue signal. However, increasingly it will be taken from a digital source such as a cable or satellite station broadcasting using MPEG-2. Another source of illegal movie files is referred to as a “screener”. These are copies of previews of an unreleased movie that are issued to critics and censors, for example. Finally, an illegal copy of a film may be made from retail versions of digital versatile discs (DVDs), which have been sold to the public. These are typically the highest quality copies as a large amount of time can be taken to encode the media from a clean source.
To counteract the many and varied ways for copying and distributing movies illegally, various digital fingerprinting techniques have been developed. Digital fingerprinting is a form of digital watermarking, which typically involves embedding data within a document to uniquely identify the copyright holder. Known techniques for video watermarking have emerged from techniques developed for still images. One of these techniques treats the video as a series of still images and adapts existing watermarking schemes from still images to video. For example, the technique identified in “Real-time labelling of MPEG-2 compressed video” by Langelaar et al, Journal of Visual Communication and Image Representation 9, 4 (1998), pp 256-270, the contents of which are incorporated herein by reference in their entirety, describes a method that has been extended to video by watermarking the I-frames of an MPEG stream. While such watermarking techniques inherit all the experience from still image based techniques, they are processor intensive and so not ideal for dealing with large quantities of data.
As well as using still images, the temporal aspect of video has been exploited in various watermarking techniques. For example, the article “Watermarking of uncompressed and compressed video” by Hartung et al, Signal Processing 66, 3 (1998), pp 283-301, the contents of which are incorporated herein by reference in their entirety, describes a method in which a watermark is spread over a number of frames. This method permits higher robustness, although it can be computationally expensive. Another known technique considers the video as a compressed data stream. A technique of this nature is described in the article “Proposal of a watermarking technique for hiding/retrieving data in compressed and decompressed video” by Jordan et al, ISO/IEC, July 1997, the contents of which are incorporated herein by reference in their entirety. This proposal describes a method that embeds watermark information into the motion vectors of the encoded video. In particular, motion vectors pointing to flat areas are modified in a pseudorandom way. Advantages of this method are that it does not introduce any visible artefacts and that the embedded information can be retrieved directly from the motion vectors as long as the video is in its compressed form. Once the video has been decompressed, it must be recompressed before the watermark can be detected. This works because, during the recompression process, the motion vectors are found with a high enough probability to statistically recover the watermark. This is a simple mechanism that makes real-time detection possible. However, a problem with this approach is that the watermark may be tied to a specific video CODEC and therefore may not survive transcoding.
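By way of illustration only, the idea of spreading a watermark over many frames and recovering it statistically can be sketched as a generic spread-spectrum scheme. The sketch below is not the method of Hartung et al or of Jordan et al. The function names, the key values, the embedding strength and the synthetic host signal are all assumptions introduced for this example.

```python
import random

def pn_sequence(key, length):
    """Key-seeded pseudorandom sequence of +1/-1 chips (the secret carrier)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed(samples, key, strength=2.0):
    """Add a low-amplitude copy of the carrier to every sample."""
    pn = pn_sequence(key, len(samples))
    return [s + strength * p for s, p in zip(samples, pn)]

def detect(samples, key):
    """Correlate the mean-removed samples with the carrier.

    The score is near `strength` when the matching watermark is present
    and near zero otherwise.
    """
    pn = pn_sequence(key, len(samples))
    mean = sum(samples) / len(samples)
    return sum((s - mean) * p for s, p in zip(samples, pn)) / len(samples)

# Hypothetical host signal: luminance-like values taken across many frames.
host_rng = random.Random(0)
host = [host_rng.gauss(128.0, 20.0) for _ in range(20_000)]
marked = embed(host, key=42)

score_right_key = detect(marked, key=42)  # watermark present, correct key
score_wrong_key = detect(marked, key=7)   # watermark present, wrong key
score_unmarked = detect(host, key=42)     # no watermark at all
```

Because the carrier is spread over a large number of samples, the host signal averages out in the correlation while the watermark does not, which is why detection is statistical and why the per-sample change can remain small.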
Just as there are people wishing to identify the owners of media, there are groups that wish to obscure the origins, typically by attacking the watermark. In the article “Spread Spectrum Watermarking: Malicious attacks and counter attacks”, SPIE Security and Watermarking of Multimedia Contents 99 (San Jose, Calif. 1999), the contents of which are incorporated herein by reference in their entirety, Hartung et al present four different types of attack upon digital watermarks. The first of these is the so-called ‘simple attack’, sometimes also called a ‘waveform attack’ or ‘noise attack’. Simple attacks attempt to impair the detection of the embedded watermark without identifying and isolating it. Examples include linear and general non-linear filtering, waveform-based compression (JPEG, MPEG), addition of noise, addition of offset, cropping, quantisation in the pixel domain, conversions to analogue, and gamma correction. Other attacks include ‘detection-disabling attacks’, sometimes called ‘synchronization attacks’. These attempt to break the correlation and to make recovery of the watermark difficult if not impossible. This type of attack is normally performed by geometric distortion, for example zooming, shift in spatial or temporal direction, rotation, shear, cropping, pixel permutation, sub-sampling, or removal or insertion of pixels or pixel clusters. In contrast, ‘ambiguity attacks’ (sometimes called ‘deadlock’, ‘inversion’, ‘fake watermark’ or ‘fake original’ attacks) attempt to confuse the detection process by producing fake original data or fake watermarked data. An example is an attack that attempts to discredit the authority of the watermark by embedding one or more additional watermarks so that it is unclear which was first. The final category of attacks is ‘removal attacks’, which involve attempting to analyze and identify the watermark and then separate it from the host data. The watermark is then discarded. Examples of this form of attack are collusion attacks, denoising, certain non-linear filter operations, or compression attacks using synthetic modelling.
Benchmarking tools have been created to automate the process of evaluating watermarking techniques; see, for example, “Attacks on Copyright Marking Systems” by Petitcolas et al, Information Hiding, Second International Workshop IH '98 (Portland Oreg., 1998), Aucsmith D. (Ed), Springer-Verlag, pp 219-239, the contents of which are incorporated herein by reference in their entirety. The processing mechanisms that these tools apply to video are non-hostile. They are not designed to obfuscate or remove watermarks but may have that side effect. These are the processing mechanisms to which video being transported over peer-to-peer networks will normally have been exposed. Photometric attacks are those that modify the pixels of the video in some way. For example, when an analogue television signal is broadcast, the pixels may be changed because of noise. This may alter the colour of some portions of the image. Spatial de-synchronisation also changes the pixels. This is performed when the display format of a video is changed. The most common examples of this are changes in aspect ratio (4:3, 16:9, etc.) or changes in the spatial resolution (PAL, NTSC or SECAM). Temporal de-synchronisation is another technique that changes pixels. This is performed when a modification is made to the frame-rate of the movie. This may be done to reduce the file size, for example a 30 frames per second (fps) movie being reduced to 24 fps or lower. Another example is video editing. This encompasses all operations that may be made by a video editor. A simple and common example is cut-insert-splice. This is when a cut is made in the video, a commercial is inserted and the remaining video is spliced together. Other possibilities are the addition of logos or subtitles, or the cutting of certain scenes from a film, perhaps to enable broadcast before a watershed.
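The effect of temporal de-synchronisation can be illustrated with a minimal sketch of the 30 fps to 24 fps reduction mentioned above. The sketch assumes the simplest possible conversion, in which each output frame is taken from the nearest earlier source frame and no frames are blended. The function name is hypothetical, and real converters may behave differently.

```python
def kept_source_frames(n_frames, src_fps, dst_fps):
    """Return the source frame index used for each output frame.

    Output frame k is shown at time k / dst_fps, so it is taken from
    source frame floor(k * src_fps / dst_fps).
    """
    n_out = n_frames * dst_fps // src_fps
    return [k * src_fps // dst_fps for k in range(n_out)]

# One second of 30 fps video reduced to 24 fps.
kept = kept_source_frames(n_frames=30, src_fps=30, dst_fps=24)

# Source frames that no longer appear in the copy.
dropped = sorted(set(range(30)) - set(kept))

# Output positions whose content no longer sits at its original frame index.
misaligned = [k for k, src in enumerate(kept) if k != src]
```

Under this assumption every fifth source frame is discarded, so from the first dropped frame onwards the frames of the copy no longer line up with the frames of the original. Any mark or measurement that relies on frame-by-frame alignment is disturbed even though no one set out to attack it.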
A problem with traditional methods for automatically identifying video is that they either rely on the video bit-stream remaining the same throughout its lifetime or require additional information to be stored within the video. Video streams are quite often re-encoded, either when new or improved video CODECs are released, as in the case of DivX and XviD, or in order to reduce file size for easier transport across the network or to fit on certain media. This can completely defeat the identification process or, as in the case of watermarking, increase the difficulty of identification. In addition, since there are many video files infringing copyright that have already been encoded and released onto the Internet, it is too late to embed additional information into those videos for identification purposes. Hence conventional techniques cannot be applied.
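The dependence on an unchanged bit-stream can be illustrated with a small sketch. Here zlib compression at two different levels stands in for two different video encoders. This is an assumption made purely for illustration: zlib is lossless and is not a video CODEC, and the function names are hypothetical. The sketch shows only that the same content, re-encoded, yields a different byte sequence and therefore a different bit-stream identifier.

```python
import hashlib
import zlib

def encode(content, level):
    """Stand-in for a video encoder: zlib at a given compression level."""
    return zlib.compress(content, level)

def bitstream_id(bitstream):
    """Identifier computed from the encoded bytes only."""
    return hashlib.sha256(bitstream).hexdigest()

# Hypothetical raw content standing in for decoded video frames.
content = bytes(range(256)) * 64

original = encode(content, level=9)

# A copy made by decoding the original and re-encoding it with other settings.
reencoded = encode(zlib.decompress(original), level=1)

same_content = zlib.decompress(original) == zlib.decompress(reencoded)
same_id = bitstream_id(original) == bitstream_id(reencoded)
```

In this sketch `same_content` is true while `same_id` is false. With a real, lossy video CODEC the decoded content itself also changes slightly on re-encoding, so an identifier tied to the exact bytes fares no better.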