The present invention relates generally to methods, apparatus, and systems for enhancing the robustness of watermark extraction from digital multi-media content.
Digital watermarks are substantially imperceptible signals embedded into a host signal. The host signal may be any one of audio, still image, video or any other signal that may be stored on a physical medium, transmitted or broadcast from one point to another or received and exhibited using a variety of display means such as monitors, movie screens, audio speakers or print medium. Digital watermarks are designed to carry auxiliary information without substantially affecting fidelity of the host signal, or without interfering with normal usage of the host signal. For this reason, digital watermarks are sometimes used to carry out covert communications, where the emphasis is on hiding the very presence of the hidden signals. The main applications of digital watermarks include prevention of unauthorized usage (i.e., duplication, playing and dissemination) of copyrighted multimedia content, proof of ownership, authentication, tampering detection, broadcast monitoring, transaction tracking, audience measurement and triggering of secondary activities such as interacting with software programs or hardware components.
The above list of applications is not intended to be exhaustive, as many other present and future systems can benefit from co-channel transmission of main and auxiliary information. An example of such a system is one that utilizes digital watermarks to carry auxiliary informational signals; these signals may convey spatial coordinates (e.g., GPS coordinates) of an apparatus, or timestamps indicating the exact time of generation and/or transmission of the composite host and watermark signals, or any other information related or unrelated to the host signal. Alternatively, digital watermarks may carry information about the content, such as caption text, full title, artist name, and instructions on how to purchase the content. Other applications of watermarks include document security and counterfeit prevention for printed materials. In such applications, the presence of hard-to-reproduce (e.g., hard-to-copy) watermarks establishes authenticity of the printed material.
There is a considerable amount of prior art describing various digital watermarking techniques, systems and applications. Watermarking techniques described in the literature include methods of manipulating the least significant bits of the host signal in time or frequency domains, insertion of watermarks with an independent carrier signal using spread spectrum, phase, amplitude or frequency modulation techniques, and insertion of watermarks using a host-dependent carrier signal such as feature modulation and informed-embedding techniques. Most embedding techniques utilize psycho-visual or psycho-acoustical analysis (or both) of the host signal to determine optimal locations and amplitudes for the insertion of digital watermarks. This analysis typically identifies the degree to which the host signal can hide or mask the embedded watermarks as perceived by humans.
In most digital watermarking applications, the embedded watermarks must be able to maintain their integrity under various noise and distortion conditions that may affect the multimedia content. These impairments may be due to various signal processing operations that are typically performed on multimedia content, such as lossy compression, scaling, rotation, analog-to-digital conversion, etc., or may be due to noise and distortion sources inherently present in the transmission and/or storage channel of multimedia content. Examples of this second type of impairment include errors due to scratches and fingerprints that contaminate data on optical media, noise in over-the-air broadcasts of audio-visual content, tape noise in VHS tapes, everyday handling of currency notes, and the like. Typically, increased robustness of embedded watermarks may be obtained at the expense of reduced transparency of the watermark.
The security of digital watermarks is another aspect of watermarking systems. In certain applications such as proof of ownership, source authentication, piracy tracing, access control of copyrighted content, and the like, it is essential that embedded watermarks resist intentional manipulations aimed at detecting the presence of watermarks, deciphering the data carried by the watermarks, modifying or inserting illegal values (forgery), and/or removing the embedded watermarks. To this end, many watermarking systems employ a secret key to enable embedding and subsequent extraction of the watermarks. These systems should be distinguished from cryptographic systems, in which a secret key is used to prevent unauthorized access to and/or modification of the information but which are not designed to prevent detection of the presence of the encrypted information or its removal. Such cryptographic systems, depending on the length of the key and the complexity involved in breaking the key, could theoretically guarantee security of encrypted digital data for most practical situations. Indeed, cryptography can be used to protect against unauthorized reading or forgery of watermark data, but it fails to provide protection against other types of attacks that are aimed at preventing legitimate users from detecting or extracting the embedded watermarks altogether. By way of example and not by limitation, these attacks include synchronization attacks, replacement attacks and noise attacks that modify the composite host and watermark signal in such a way as to obscure or damage the embedded watermarks beyond recognition. More details on possible attacks will be presented below.
Designing a watermarking system requires reaching the proper balance between the transparency (imperceptibility), robustness and security requirements of the system. A fourth requirement is the watermark payload capacity. This requirement depends on the specific application of the watermarking system. Typical applications range from requiring the detection of only the presence of a watermark (i.e., a single-state watermark) to requiring a few tens of bits of auxiliary information per second. In the latter case, the embedded bits may be used to carry identification and timing information such as serial numbers and timestamps, and metadata such as captions, artist names, purchasing information, and the like.
A fifth factor in designing practical watermarking systems is the computational cost of the embedding and/or extraction units. This factor becomes increasingly important for consumer electronic devices or software utilities with limited silicon real estate or computational capabilities. This factor is strongly related to the application at hand. For example, watermarks for forensic tracing of piracy channels, such as those that embed different codes in each copy of content distributed over the Internet, may require a simple embedder but a complex and costly forensic extractor. On the other hand, copy control systems designed to prevent unauthorized access to multimedia content, for example, in consumer electronic devices, may tolerate a sophisticated embedder but require a simple and efficient extractor.
The sixth important factor in designing a practical watermarking system is the probability of false detections. Again, this requirement varies depending on the application at hand. In certain applications, such as copy control, the probability of false detections must be very low, since executing a restrictive action on legally purchased content is bound to frustrate users and have negative implications for device manufacturers and/or content providers. On the other hand, in broadcast monitoring systems where the frequency of broadcast content is measured to generate royalty payments or popularity charts, much higher false detection rates may be tolerated, since the presence of a few false detections may have very little effect on the final outcome of the counts.
The prior art systems, at best, use an ad-hoc approach for designing watermarking systems that happen to have a certain collection of features, which are then mapped onto various applications in search of a good match. These systems also fail to systematically analyze security threats and provide answers to different threat scenarios. For example, U.S. Pat. No. 5,889,868 (Moskowitz, et al.) discusses randomizing the insertion locations of watermarks within the content signal as well as varying the embedding algorithm throughout the content. But there are no enabling embodiments that describe how this randomization may take place and how this would affect a watermarking system's design parameters. This reference also merely states that at any given location of a content one or another embedding technique may be used, but it fails to discuss simultaneous utilization of embedding technologies. It also fails to discuss joint configuration of embedders and extractors in order to vary levels of robustness/security/transparency/cost. Another prior art system, disclosed by D. Kirovski, et al., in “Multimedia Content Screening Using a Dual Watermarking and Fingerprinting System”, Tech. Rep. MSR-TR-2001-57, Microsoft Research (June 2001), uses a technique in which the host content is embedded in a conventional way (e.g., using a spread spectrum technique) using a secret watermarking key (SWK). The detection key for each detector, however, is different from SWK. The individualized detection key is generated by adding noise to SWK. Since detection is done via correlation, the noise-contaminated detection key should still produce the desired correlation value if there are no other significant (additional) impairments present. To build up immunity against additional impairments and more aggressive attacks, the length of the spreading sequence may be increased to compensate for the robustness penalty incurred due to the non-optimum detection key.
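By way of illustration only, the correlation-based detection with a noise-contaminated key described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the cited reference: the function names, the additive embedding rule, the Gaussian host and key noise, and all parameter values are hypothetical choices made for the example.

```python
import math
import random

def make_key(length, seed):
    """Pseudo-random +/-1 spreading sequence standing in for the SWK (assumed form)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(host, key, strength):
    """Additive spread-spectrum embedding (assumption): marked = host + strength * key."""
    return [h + strength * k for h, k in zip(host, key)]

def individualize(key, noise_std, seed):
    """Hypothetical per-detector key: the SWK plus zero-mean Gaussian noise."""
    rng = random.Random(seed)
    return [k + rng.gauss(0.0, noise_std) for k in key]

def correlate(signal, key):
    """Normalized correlation coefficient between a signal and a key."""
    dot = sum(s * k for s, k in zip(signal, key))
    norm_s = math.sqrt(sum(s * s for s in signal))
    norm_k = math.sqrt(sum(k * k for k in key))
    return dot / (norm_s * norm_k)

def demo(length=20000, strength=1.0, noise_std=1.0):
    """Compare detection statistics for the exact key and a noisy detection key."""
    rng = random.Random(0)
    host = [rng.gauss(0.0, 1.0) for _ in range(length)]  # stand-in host signal
    swk = make_key(length, seed=1)
    marked = embed(host, swk, strength)
    det_key = individualize(swk, noise_std, seed=2)
    return {
        "marked_swk": correlate(marked, swk),         # embedder key on marked content
        "marked_detkey": correlate(marked, det_key),  # noisy key on marked content
        "unmarked_detkey": correlate(host, det_key),  # noisy key on unmarked content
    }

if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

With these assumed parameters, the normalized correlation on marked content is about 1/√2 ≈ 0.71 in expectation for the exact key and about 0.5 for the noisy key, while unmarked content yields a value near zero. The noisy key therefore still separates marked from unmarked content, but with a reduced margin. Because the spread of these statistics shrinks roughly as the inverse square root of the sequence length, a longer spreading sequence can recover some of that margin.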
The techniques discussed in this prior art, however, are different from the present invention in many ways. First, the embedding is done in a conventional way so the variations in embedding space as well as the relative size of embedding space to the detection space are not considered. Second, detection keys constitute a degraded version of the embedder key; this produces a degraded correlation value during the detection process. In the present invention, however, individual detection keys are not generated by adding noise to the embedder key and the correlation value in the detection process is not degraded. Further, this reference also fails to discuss how the robustness/security/transparency needs of the watermarking system can be addressed using a systematic design approach that is suitable for a multitude of applications and needs.
These and other shortcomings of the prior art systems are addressed by the methods and apparatus of the present invention.