A “video sequence” is a series of images that typically capture (or simulate) motion, life, action, movement, etc. The video sequences are typically accompanied by audio. Watermarking a video sequence presents a series of significant challenges that are greater than those faced when watermarking other “digital goods.”
“Digital goods” is a generic label for electronically stored or transmitted content. Examples of digital goods include images, audio clips, video, digital film, multimedia, software, and data.
A video sequence is a specific type of digital good. It may also be called a “digital video,” “video signal,” “video bitstream,” “video stream,” “streaming video,” “video media,” “video object,” “video,” “digital film,” “digital movie,” and the like. The emerging field of “digital film” is a high-quality form of video.
Digital videos are often distributed to consumers over private and public networks—such as Intranets and the Internet. In particular, they may be “broadcast” via streaming video of a live or recorded event. In addition, these videos are distributed to consumers via fixed computer readable media, such as a compact disc (CD-ROM), digital versatile disc (DVD), soft magnetic tape, soft magnetic diskette, or hard magnetic disk (e.g., a preloaded hard drive).
Digital videos may be stored in one of many different formats. Some of the more common multimedia file formats include: MPEG, Video for Windows®, QuickTime™, RealVideo™, Shockwave™, and the like.
Unfortunately, it is relatively easy for a person to pirate the pristine digital content of a digital video at the expense and harm of the content owners. Content owners include the content author, artist, publisher, developer, distributor, etc. The content-based industries (e.g., entertainment, music, film, television, etc.) that produce and distribute content are plagued by lost revenues due to digital piracy.
Modern digital pirates effectively rob content owners of their lawful compensation. Unless technology provides a mechanism to protect the rights of content owners, the creative community and culture will be impoverished.
Watermarking
Watermarking is one of the most promising techniques for protecting the content owner's rights in a digital video. Generally, watermarking is a process of altering the digital video such that its perceptual characteristics are preserved. More specifically, a “digital watermark” (or simply “watermark”) is a pattern of bits inserted into a digital video that may be used to identify the content owners and/or the protected rights.
Watermarks are designed to be completely invisible or, more precisely, to be imperceptible to humans and statistical analysis tools. Ideally, a watermarked video signal is perceptually identical to the original video signal.
A watermark embedder (i.e., encoder) embeds a watermark into a digital video. It typically uses a secret key to embed the watermark. A watermark detector (i.e., decoder) extracts the watermark from the watermarked digital video.
To detect the watermark, some watermarking techniques require access to the original unmarked digital video or to a pristine specimen of the marked digital video. Some watermarking techniques are “blind.” This means that they do not require access to the original unmarked digital video or to a pristine specimen of the marked digital video. Of course, these “blind” watermarking techniques are desirable when the watermark detector is publicly available.
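The roles of the secret key, the embedder, and a blind detector can be illustrated with a minimal sketch. It is not the technique of any particular system: it assumes a simple additive pseudo-random pattern (a spread-spectrum style scheme chosen here purely for illustration) applied to a flat list of pixel values. The names `key_pattern`, `embed`, and `detect`, and the `strength` and `threshold` values, are hypothetical.

```python
import random


def key_pattern(key, n):
    """Derive a pseudo-random +1/-1 pattern of length n from a secret key.

    Hypothetical helper: a real system would use a cryptographically
    stronger generator than Python's random module.
    """
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(samples, key, strength=4.0):
    """Embedder (encoder): add the key-derived pattern to the samples.

    The strength value is illustrative only and is not tuned for
    imperceptibility.
    """
    pattern = key_pattern(key, len(samples))
    return [s + strength * p for s, p in zip(samples, pattern)]


def detect(samples, key, threshold=2.0):
    """Blind detector (decoder): correlate the received samples with the
    key-derived pattern. The original unmarked samples are never used.
    """
    pattern = key_pattern(key, len(samples))
    mean = sum(samples) / len(samples)
    # Subtracting the mean removes the constant offset of the host signal.
    score = sum((s - mean) * p for s, p in zip(samples, pattern)) / len(samples)
    return score > threshold
```

In this sketch the detector needs only the received samples and the key, which is what makes it “blind.” A detector that subtracted the original frame before correlating would be the non-blind counterpart.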
Before detection, a watermarked signal may undergo many possible changes by users and by the distribution environment. These changes may include unintentional modifications, such as noise and distortions. Moreover, the marked signal is often the subject of malicious attacks particularly aimed at disabling the detection of the watermark.
Ideally, a watermarking technique should embed detectable watermarks that resist modifications and attacks as long as they result in signals that are perceptually of the same quality. A watermarking technique that is resistant to modifications and attacks may be called “robust.” Aspects of such techniques are called “robust” if they encourage such resistance.
Generally speaking, a watermarking system should be robust enough to handle unintentional noise introduction into the signal (such noise may be introduced by A/D and D/A conversions, compressions/decompressions, data corruption during transmission, etc.).
Furthermore, a watermarking system should be robust enough and stealthy enough to avoid purposeful and malicious detection, alteration, and/or deletion of the watermark. Such an attack may use a “shotgun” approach where no specific watermark is known or detected (but is assumed to exist) or may use a “sharp-shooter” approach where the specific watermark is attacked.
Those of ordinary skill in the art are familiar with conventional techniques and technology associated with watermarks, watermark embedding, and watermark detecting. In addition, those of ordinary skill in the art are familiar with the typical problems associated with proper watermark detection after a marked signal has undergone changes (e.g., unintentional noise and malicious attacks).
Herein, such a digital watermark may be simply called a “watermark.” Generically, it may be called an “information pattern of discrete values.”
Desiderata of Watermarking Technology
Watermarking technology has several highly desirable goals (i.e., desiderata) to facilitate protection of copyrights of video content publishers. Several such goals are listed below.
Perceptual Invisibility. The embedded information should not induce perceptual changes in the video quality of the resulting watermarked signal. The test of perceptual invisibility is often called the “golden eyes and ears” test.
Statistical Invisibility. The embedded information should be quantitatively imperceptible to any exhaustive, heuristic, or probabilistic attempt to detect or remove the watermark. The complexity of successfully launching such attacks should be well beyond the computation power of publicly available computer systems. Herein, statistical invisibility is expressly included within perceptual invisibility.
Tamperproofness. An attempt to remove the watermark should damage the value of the video well above the perceptual threshold.
Cost. The system should be inexpensive to license and implement on both programmable and application-specific platforms.
Non-disclosure of the Original. The watermarking and detection protocols should be such that the process of proving video content copyright, both in-situ and in-court, does not involve usage of the original recording.
Enforceability and Flexibility. The watermarking technique should provide strong and undeniable copyright proof. Similarly, it should enable a spectrum of protection levels, which correspond to variable video presentation and compression standards.
Resilience to Common Attacks. The public availability of powerful digital video editing tools requires that the watermarking and detection process be resilient to attacks mounted with such tools.
Hard-to-Break. A watermark is “hard-to-break” when it is “extremely hard” for an attacker to break the watermark even though the attacker may know the watermarking technique. Here, “breaking” refers to successfully modifying or removing the watermark. In particular, it should be nearly impossible to break the mark under almost all practical situations even if an attacker has a supercomputer.
Watermark Circumvention
In general, there are two common classes of malevolent attacks:
    1. De-synchronization of watermark in digital video signals. These attacks alter video signals in such a way to make it difficult for the detector to identify the location of the encoded watermark codes.
    2. Removing or altering the watermark. The attacker discovers the location of the watermark and intentionally alters the video clip to remove or deteriorate a part of the watermark or its entirety.
Particular Video Watermarking Challenges
A video is a series of video “frames.” Each frame of the video is an image. Since videos are a series of images, one way to watermark a video is to embed a watermark (wholly or partially) in each frame (or a significant number) of the video.
As mentioned earlier, watermarking a video sequence presents a series of significant challenges that are greater than those faced when watermarking other “digital goods.” Particular examples of these challenges include perceptual invisibility and resistance to de-synchronization attacks. Although watermarking other types of media (e.g., images and audio) also faces these challenges, the problems of perceptual invisibility and resistance to de-synchronization are particularly acute and specifically unique for videos.
De-Synchronization Attacks
The watermark (or portions thereof) may be embedded into each frame of the video. However, the chances of a digital pirate discovering the watermark increase as the watermark repetition increases. Embedding the watermark (or portions thereof) in each frame is also undesirable because it provides a convenient range for the pirate to focus her efforts. In addition, it provides potentially thousands of bounded targets (i.e., frames) containing the same hidden data (i.e., the watermark). With this much bounded information, a digital pirate has a good chance of determining the watermark.
To overcome this problem, watermarks (or portions thereof) may be selectively encoded in individual frames or groups of frames within the video. To find the encoded information later, the detector typically must be synchronized along the temporal axis so that it knows where (or when) to look for the watermarks. Digital pirates know this. A de-synchronization attack is one of the most watermark-fatal arrows in their quiver. In addition, de-synching may occur unintentionally, particularly when a video signal is transmitted.
Resisting de-synchronization is a particularly difficult challenge in the video realm. A pirate may, for example, do any of the following to de-synch a video:
    remove frames;
    add new frames (such as commercials);
    add copied frames (copies of adjacent frames);
    change frames/sec rate;
    rearrange frames.
If this de-synch attack splits a series of frames in which the full watermark is encoded, then the watermark may go undetected. If this attack manages to remove the isolated frames including the watermarks, then the watermark may go undetected.
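The effect of these edits on a detector that relies on temporal position can be illustrated with a toy model. The sketch below is only an assumption-laden illustration: a video is modeled as a list of labels (`"W"` for a frame carrying the watermark, `"-"` otherwise), and the detector naively checks fixed frame indices. All function names are hypothetical.

```python
def make_video(n_frames, marked_at):
    """Model a video as a list of labels: 'W' = watermarked frame, '-' = plain."""
    marked = set(marked_at)
    return ["W" if i in marked else "-" for i in range(n_frames)]


def remove_frames(video, indices):
    """Attack: drop the frames at the given indices."""
    drop = set(indices)
    return [f for i, f in enumerate(video) if i not in drop]


def insert_frames(video, at, count):
    """Attack: splice `count` new unmarked frames in at position `at`."""
    return video[:at] + ["-"] * count + video[at:]


def duplicate_frame(video, index):
    """Attack: repeat the frame at `index` immediately after itself."""
    return video[: index + 1] + [video[index]] + video[index + 1 :]


def detect_at(video, expected):
    """Naive detector: look only at the frame indices where the watermark
    is expected, and report the indices at which it was actually found."""
    return [i for i in expected if i < len(video) and video[i] == "W"]


expected = [10, 20, 30]
video = make_video(40, expected)

found_clean = detect_at(video, expected)                           # all positions found
found_after_drop = detect_at(remove_frames(video, [3]), expected)  # marks shift by one
found_after_ad = detect_at(insert_frames(video, 0, 2), expected)   # marks shift by two
found_after_dup = detect_at(duplicate_frame(video, 15), expected)  # later marks shift
found_after_cut = detect_at(remove_frames(video, expected), expected)  # marks removed
```

Removing a single early frame is enough to shift every later watermarked frame away from the position the naive detector checks, which is why a detector that depends on absolute frame position is fragile.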
Perceptual Invisibility
As mentioned above, a watermark should be perceptually invisible (which includes being statistically invisible) within the signal. Achieving perceptual invisibility is a particularly difficult challenge in the video realm.
Typically, a series of successive frames have one or more common sections. These common sections contain the same image data. For example, if the camera capturing the video frames is fixed on relatively stationary objects or people, then the vast majority of each frame will be identical. Typically, if the camera is fixed, the background remains identical in each frame.
If the watermark (or portions thereof) is not encoded in every frame of the video, then some frames will have no portion of the watermark encoded therein. Consequently, there will be a transition between encoded frames and non-encoded frames. Typically, perceptible “flicker” occurs at that transition. Flicker is the perceptible manifestation of the transition. This problem is particular to video.
Flicker may be visible to the human eye. If not, it may be noticeable by statistical analysis tools. Since watermark encoding introduces “noise” into a frame, the transition from “noisy” to “noiseless” frame produces perceptible flicker in the common sections of the frames of that transition.
Armed with the knowledge of flickering, a digital pirate can focus her attack on the frames in and around transitions.
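The way flicker exposes transitions can be illustrated with a small sketch. It assumes an idealized, perfectly static background, a fixed additive pattern in the marked frames, and simple frame differencing as the analysis tool; these assumptions, along with the names `watermark_pattern`, `build_frames`, and `frame_differences`, are illustrative and hypothetical.

```python
import random


def watermark_pattern(key, n, strength=2):
    """Hypothetical key-derived pattern of +strength/-strength values."""
    rng = random.Random(key)
    return [strength * rng.choice((-1, 1)) for _ in range(n)]


def build_frames(background, marked, pattern, n_frames):
    """Build frames over a static background; frames whose index is in
    `marked` carry the additive pattern, the others are left untouched."""
    frames = []
    for i in range(n_frames):
        if i in marked:
            frames.append([b + w for b, w in zip(background, pattern)])
        else:
            frames.append(list(background))
    return frames


def frame_differences(frames):
    """Mean absolute pixel difference between each pair of consecutive frames."""
    return [
        sum(abs(a - b) for a, b in zip(f0, f1)) / len(f0)
        for f0, f1 in zip(frames, frames[1:])
    ]


background = [128] * 64                      # idealized static scene
pattern = watermark_pattern(key=7, n=64)
frames = build_frames(background, {3, 4, 5}, pattern, n_frames=8)
diffs = frame_differences(frames)
# Nonzero entries of diffs mark the frame pairs where a transition occurs.
transitions = [i for i, d in enumerate(diffs) if d > 0]
```

In this idealized setting the difference is zero everywhere except at the boundaries between marked and unmarked frames, so the transitions stand out immediately. Real video contains motion and sensor noise, so the signal would be less clean, but the common sections of successive frames still give an analyst a stable reference against which such changes can be measured.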
Framework to Thwart Attacks
Accordingly, there is a need for a new framework for hiding and detecting watermarks in digital video signals that is effective against unintentional and intentional modifications. In particular, the framework should be resistant to de-synchronization. The framework should possess several attributes that further the desiderata of watermark technology, described above. In particular, it should be perceptually invisible; thus, it should minimize or eliminate flicker.