With the recent advances in Internet content distribution, including peer-to-peer networks and real-time video streaming systems, unauthorized distribution of proprietary content has become rampant. The point of unauthorized distribution is often an authorized viewer, such as a viewer in a cinema where pirated copies are made with camcorders, or a viewer at home receiving a cable or satellite broadcast with a set-top-box TV decoder whose output is captured and re-encoded into a video file.
Although authorized broadcasts may travel over channels generally considered secure, a content pirate can use a low-cost video capture device with analog or digital inputs, together with a standard home PC, to capture, encode and upload pirated content over the Internet to users who are not entitled to access it. New encoding standards and broadband technologies have made this type of piracy more attractive and easier to carry out than ever before.
To assist in the prevention of unauthorized distribution of video content, it would be advantageous to be able to identify the point of unauthorized distribution of a video or the point at which a video was made available to an unauthorized distributor. After identification of the source, measures could be taken to prevent further unauthorized distribution. Such identification could be accomplished through the embedding of identifying information, related to the source, in the video content.
Embedding data in video is a rich field in both academic research and commercial invention. Watermarking is one example of data embedding. Covert watermarking in the compressed domain is well known in the art, as are steganographic watermarks and overt watermarks that appear as bitmaps on top of displayed video.
Embedding data covertly in high-definition video is particularly difficult. High-definition broadcasts, and the televisions used to view them, have such high resolution that watermarks intended to be covert can become visible to the viewer, mainly as artifacts and noise. Visible watermarks would not only annoy the average viewer but would also potentially expose the watermarks to attacks by pirates.
There are essentially three television (TV) broadcast formats in use today (PAL, SECAM and NTSC), each supporting various resolutions, phases and frame rates. In these interlaced formats, each frame is transmitted as two halves, each called a field, and the rate at which fields are transmitted is one of the fundamental parameters of a video system. The field rate is usually closely related to the frequency at which the local electric power grid operates, to avoid the flicker that would result from the beat between the television screen and nearby electric lights. Digital, or “fixed pixel”, displays are generally progressive scan and must deinterlace an interlaced source.
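The relationship between frames and fields can be illustrated with a minimal sketch. It assumes a frame is modeled as a list of scan lines, with the top field holding the even-numbered lines and the bottom field the odd-numbered lines. The function names `split_fields` and `weave` are hypothetical, and "weave" is only the simplest deinterlacing approach, not necessarily what any given display uses.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its top and bottom fields."""
    top = frame[0::2]      # even-numbered lines (assumed to be the top field)
    bottom = frame[1::2]   # odd-numbered lines (assumed to be the bottom field)
    return top, bottom


def weave(top, bottom):
    """Rebuild a progressive frame by interleaving two fields line by line."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.extend([top_line, bottom_line])
    if len(top) > len(bottom):
        # A frame with an odd number of lines has one extra top-field line.
        frame.append(top[-1])
    return frame


frame = ["line0", "line1", "line2", "line3", "line4"]
top, bottom = split_fields(frame)
rebuilt = weave(top, bottom)
```

Weaving reproduces the original frame exactly only when both fields were sampled at the same instant; fields captured at different times show combing on moving objects, which is why practical deinterlacers are more elaborate.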
In cinemas, 24 frames per second are projected at extremely high resolution, typically using 35 mm or 70 mm film. European TV generally broadcasts 25 frames per second (50 fields per second), and American and Far East TV generally broadcast 30 frames per second (60 fields per second). A casual viewer is generally not sensitive to the differences between the frame rates of the various TV and cinema systems.
In order to meet broadcast standards for TV, films made for cinematic presentation, which contain 24 frames per second, are typically converted to 25 or 30 frames per second prior to broadcast through the use of field duplication, speed-up, or 3:2 pull down, all of which are techniques well known in the art. Some such frame rate conversions are performed utilizing telecine systems, also known as TK systems. Generally, when a film is encoded using 3:2 pull down, advanced codecs can identify the fields to be duplicated and tag them as such instead of encoding the same field twice. Later, prior to rendering, a decoder performs the reverse operation and recreates each duplicated field from its tag.
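The 3:2 pull down cadence described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular telecine or codec: film frames are represented by labels, alternate film frames contribute two and three fields, and field parity alternates on every output field. The function names and the `repeated` flag (standing in for the tag a codec might attach to a duplicated field) are hypothetical.

```python
def pulldown_3_2(film_frames):
    """Expand film frames into fields using a 2, 3, 2, 3, ... cadence.

    Each field is a tuple (frame_label, parity, repeated), where `repeated`
    marks the extra third copy that a codec could tag instead of re-encoding.
    """
    fields = []
    parity = 0  # 0 = top field, 1 = bottom field
    for index, frame in enumerate(film_frames):
        copies = 2 if index % 2 == 0 else 3
        for copy in range(copies):
            name = "top" if parity == 0 else "bottom"
            fields.append((frame, name, copy == 2))
            parity ^= 1  # parity alternates on every emitted field
    return fields


def pair_fields(fields):
    """Group consecutive fields into interlaced video frames."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


film = ["A", "B", "C", "D"]           # 4 film frames (1/6 second at 24 fps)
fields = pulldown_3_2(film)           # 10 fields
video = pair_fields(fields)           # 5 video frames (1/6 second at 30 fps)
cadence = [top[0] + bottom[0] for top, bottom in video]
```

Here `cadence` is `["AA", "BB", "BC", "CD", "DD"]`: every four film frames become five video frames, which scales 24 frames per second to 30. Conversion to 25 frames per second by speed-up instead plays the film 25/24 times faster, roughly 4.2 percent, without duplicating fields.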
Digital TV and digital cinemas use separate video and audio streams, although both can be multiplexed into a single stream. At the time of presentation, a projecting device (decoder) synchronizes these streams in order to provide an enjoyable experience. This synchronization can be extremely accurate (to within less than a frame), which is finer than a casual viewer's ability to detect an ‘out of sync’ condition.
The following references are believed to reflect the present state of the art:
U.S. Pat. No. 6,760,463 to Rhoads;
U.S. Pat. No. 6,721,440 to Reed et al.;
U.S. Pat. No. 5,636,292 to Rhoads;
U.S. Pat. No. 5,768,426 to Rhoads;
U.S. Pat. No. 5,745,604 to Rhoads;
U.S. Pat. No. 6,404,898 to Rhoads;
U.S. Pat. No. 7,058,697 to Rhoads;
U.S. Pat. No. 5,832,119 to Rhoads;
U.S. Pat. No. 5,710,834 to Rhoads;
U.S. Pat. No. 7,020,304 to Alattar et al.;
U.S. Pat. No. 7,068,809 to Stach;
U.S. Pat. No. 6,381,341 to Rhoads;
U.S. Pat. No. 6,950,532 to Schumann, et al.;
U.S. Pat. No. 7,035,427 to Rhoads;
WO 02/07362 of Digimarc Corp.;
WO 05011281A1 of Koninklijke Philips Electronics N.V.;
US Patent Application 20020027612 of Brill, et al.;
US Patent Application 20070071037A1 of Abraham, et al.;
Patent Abstracts of Japan for JP11075055;
Digital Watermarking of Visual Data: State of the Art and New Trends, by M. Barni, F. Bartolini and A. Piva, Congrès Signal processing X: Theories and Applications (Tampere, 4-8 Sep. 2000), EUSIPCO 2000: European Signal Processing Conference No 10, Tampere, Finland (Apr. 9, 2000);
Multichannel Watermarking of Color Images, by M. Barni, F. Bartolini and A. Piva, published in IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 3, March 2002;
Digital Watermarking for 3D Polygons using Multiresolution Wavelet Decomposition, by Satoshi Kanai, Hiroaki Date, and Takeshi Kishinami, available on the World Wide Web at citeseer.ist.psu.edu/504450.html;
MPEG-1 standards document: BS EN ISO/IEC 11172-1:1993 Information technology—Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s;
MPEG-2 standards document: BS EN ISO/IEC 13818-1: 1997 Information technology—Generic coding of moving pictures and associated audio information;
MPEG-4 standards document: BS EN ISO/IEC 14496-1:2004—Information technology—Coding of audio-visual objects—Part 1: Systems;
CANDID: Comparison Algorithm for Navigating Digital Image Databases, by P. M. Kelly and T. M. Cannon, Proceedings of the Seventh International Working Conference on Scientific and Statistical Database Management, 1994, pages 252-258;
Video Fingerprinting and Encryption Principles for Digital Rights Management, by D. Kundur and K. Karthik, Proceedings of the IEEE, Vol. 92, No. 6, June 2004; and
http://en.wikipedia.org/wiki/Broadcast_television_systems.