The present invention relates to a system for embedding and detecting digital watermarks.
Digital watermarks have been proposed and used for copyright protection of signals such as audio, video, and images. The primary objective is to hide an auxiliary signal within a host signal in such a way that it is substantially imperceptible and difficult to remove without damaging the host signal. The auxiliary signal may carry some information that is helpful, e.g., in a copyright protection mechanism. For example, it can communicate “no copy allowed” to a compliant copying device, and/or it can carry a signature code that identifies the rightful owner, author, and/or content. The signature code can be used to monitor usage of copyrighted material, resolve ownership disputes, keep track of royalties, etc.
Further, digital watermarks can be used to distinguish different copies of the same host signal distributed to different users in legitimate transactions. The primary objective is, if a copy is pirated and redistributed illegally, to track down the user who obtained it in the legitimate transaction and, presumably, to prosecute him for breach of copyright laws. The secondary objective is to deter casual copying, e.g., among small groups of people.
For example, a pirate can order music over the Internet from a legitimate distributor, directly or using a proxy. Then, the pirate can resell it or redistribute it for free using the Internet or other means. A similar scenario can occur in the distribution of video or other images (e.g., still photos, computer graphics and games, etc.) over the Internet, or in the distribution of video or music over “pay-per-view” channels in a cable or satellite TV network.
Moreover, in the Internet distribution business, the host signal (music, image or video) is typically stored and delivered in a compressed form (e.g., MP3 format in music). This means that a typical watermark embedding process requires decompression, embedding, and then recompression before transmission. Obviously, this imposes additional processing requirements, and adds more noise to the host signal in the process.
Furthermore, many watermark embedding processes are subject to collusion attacks. A major distinction between Transaction Code Embedding (TCE), also sometimes referred to as “fingerprinting”, and embedding of other messages, such as content ID, owner ID, copy control codes, etc., is that, with TCE, a pirate can design special kinds of attacks based on the fact that TCE embeds different auxiliary signals into the same content.
For example, by simply subtracting two copies with different watermarks, the pirate obtains the difference of the pure watermarks, which can help him analyze the hiding technique and devise a sophisticated attack. Secondly, the pirate can average a number of copies to weaken individual watermarks, make them interfere, and eventually make them undetectable. Similarly, the pirate can cut portions of different copies and splice them together. The resulting signal has segments of different watermarks spliced together, which is hard to use to identify the pirate.
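The three collusion attacks described above can be illustrated with a toy model. The sketch below assumes a purely additive watermark on a list of samples, which is a simplification chosen for illustration and not the embedding method of any system cited here; all function names are hypothetical.

```python
def embed(host, watermark):
    """Toy additive embedding (assumed model): sample-wise sum of host and watermark."""
    return [h + w for h, w in zip(host, watermark)]


def difference_attack(copy_a, copy_b):
    """Subtract two copies: the host cancels, leaving the difference of the watermarks."""
    return [a - b for a, b in zip(copy_a, copy_b)]


def averaging_attack(copies):
    """Average several copies sample by sample, which weakens each individual watermark."""
    n = len(copies)
    return [sum(samples) / n for samples in zip(*copies)]


def splice_attack(copy_a, copy_b, segment_len):
    """Cut-and-splice: alternate fixed-length segments taken from two copies."""
    out = []
    for start in range(0, len(copy_a), segment_len):
        source = copy_a if (start // segment_len) % 2 == 0 else copy_b
        out.extend(source[start:start + segment_len])
    return out
```

Under this additive assumption, `difference_attack` returns the same result whatever the host signal is, which is why the subtraction exposes the watermarks themselves.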
One existing technique to fight collusion attacks, described in International Publication no. WO 99/39344, published Aug. 5, 1999 to J. M. Winograd et al., is based on a random phase-modulation technique that precedes the watermark embedding stage. The random phase modulation, although imperceptible to a user, makes copies of the same original sufficiently different so that the collusion attack cannot work. For example, averaging multiple copies of a piece of music produces phase cancellations that make the resulting signal annoying to the listener.
However, this technique does not address the issue of the processing (computational) load of TCE. To the contrary, it proposes an additional processing stage that can only increase the processing load.
Another technique, described in International Publication no. WO 99/62022, published Dec. 2, 1999, to D. Wong and C. Lee, greatly reduces the real-time processing required for TCE by preprocessing a host signal to provide two uncompressed copies, one containing segments with an embedded binary “0”, while the other contains corresponding segments with an embedded binary “1”. Successive segments are selected from one of the two copies to provide a time-multiplexed composite host signal with embedded binary data that corresponds to the transaction code.
However, this technique does not address the security issue and collusion attacks. Moreover, splicing of the segments may result in perceptible artifacts. Additionally, this technique does not address the issue of combining two copies that are saved in a compressed form.
Accordingly, it would be desirable to provide a watermark embedding and detection system that addresses the above and other concerns.
The system should not require decompression and recompression. It is also desirable that the same technique can be applied to different compression/decompression algorithms (such as MPEG, AAC, AC3, ATRAC, etc. in music).
Furthermore, the system should not be overly computationally intensive and costly, since embedding is performed frequently (once for every copy distributed, rather than once for every original). Although some complex algorithms can make sense for embedding a high quality, high security watermark in a production studio, it may well be too costly to run them on the fly in the Internet distribution of copyrighted content. The embedding and detection system should not be too costly for such applications.
Moreover, the system should thwart collusion attacks, and should enable identification of an illegitimate distributor of protected content, or, more precisely, the original recipient of the content. Additionally, the system should be easily implementable in Internet distribution and other distribution modes.
Also, the system should avoid perceptible artifacts.
The present invention provides a transaction code embedding and detection system that provides the above and other advantages.
Each copy of content to be protected is labeled with a unique code referred to as a “transaction code”. Using the transaction code, it is possible to identify the user that obtained a legitimate copy of the content but distributed it illegally, and prosecute him, or at least blacklist him to prevent his further purchases. Thus, legal action can be taken against the user even when the distributor to the user is immune from legal action, e.g., due to being a foreign-based company.
In one aspect of the invention, a method for embedding watermarks in a host signal includes the step of forming watermarked copies of the host signal with at least one different transaction watermark and at least one common watermark embedded therein. The host signal may be a music piece (e.g., a song) that is to be protected. A given copy thus contains transaction watermarks with the same symbol values. Portions of the different watermarked copies are assembled (e.g., using multiplexing) according to a transaction code to form an output signal with transaction watermarks that correspond to the transaction code.
The output signal is then distributed to a user, who can be subsequently identified if the content is re-distributed in an unauthorized manner.
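The assembly step can be sketched as a simple selection over pre-watermarked copies. This is a minimal illustration under stated assumptions: each copy is modeled as a list of opaque segments, copy `k` is assumed to carry symbol `k` in every segment, and segment `i` of the output is taken from the copy named by symbol `i` of the transaction code. The name `assemble_output` and the segment representation are hypothetical.

```python
def assemble_output(watermarked_copies, transaction_code):
    """Build an output signal by taking segment i from the copy whose
    embedded symbol equals transaction_code[i].

    watermarked_copies: list of copies, each a list of segments of equal count;
                        copy k is assumed to carry symbol k throughout.
    transaction_code:   sequence of symbols, one per segment.
    """
    num_segments = len(watermarked_copies[0])
    if any(len(copy) != num_segments for copy in watermarked_copies):
        raise ValueError("all copies must have the same number of segments")
    if len(transaction_code) != num_segments:
        raise ValueError("transaction code must have one symbol per segment")
    if any(s not in range(len(watermarked_copies)) for s in transaction_code):
        raise ValueError("each symbol must name one of the watermarked copies")
    # The segments are copied as-is; nothing is re-embedded at this stage.
    return [watermarked_copies[symbol][i] for i, symbol in enumerate(transaction_code)]
```

Because the function only selects existing segments, its cost per distributed copy is proportional to the number of segments, with the watermark embedding itself done once in advance.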
In a further aspect of the invention, a method is presented for analyzing a watermarked signal, e.g., one that is suspected to have been re-distributed illicitly and modified using a collusion attack. In possible collusion attacks, different copies are cut-and-spliced together or averaged. The method includes the step of recovering a plurality of transaction watermarks of the watermarked signal that define respective symbols thereof. At least one hypothesis transaction code is provided that defines respective symbols thereof. The symbols of the watermarked signal are compared to corresponding symbols of the hypothesis transaction code to determine a correspondence therebetween. Based on the correspondence, a probability is determined that the hypothesis transaction code matches a transaction code that is associated with at least some of the respective symbols of the transaction watermarks.
For example, for binary codes, it may be determined that m bits of the transaction watermarks match out of a total of n bits of the hypothesis transaction code. The invention provides a surprisingly high degree of certainty that the hypothesis transaction code matches (or doesn't match) the transaction code of the transaction watermarks even when there is only a partial match of the symbols (e.g., bits).
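One way to see why a partial match can still be strong evidence is to ask how likely an unrelated code would be to match at least m of n symbols by chance. The sketch below assumes the symbols of an unrelated code are independent and uniformly distributed, so the chance probability is a binomial tail; this model and the function names are illustrative assumptions, not the specific probability computation of the invention.

```python
from math import comb


def count_matches(recovered, hypothesis):
    """Number of positions where the recovered symbols equal the hypothesis symbols."""
    if len(recovered) != len(hypothesis):
        raise ValueError("symbol sequences must have the same length")
    return sum(1 for r, h in zip(recovered, hypothesis) if r == h)


def chance_match_probability(n, m, num_symbols=2):
    """Probability that an unrelated code matches at least m of n symbols by chance,
    assuming independent, uniformly distributed symbols (binomial tail)."""
    p = 1.0 / num_symbols  # per-symbol chance of an accidental match
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(m, n + 1))
```

For instance, under these assumptions `chance_match_probability(64, 56)` is below one in a billion, so a match on 56 of 64 bits would be very unlikely to arise from an unrelated code.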
A number of possible hypothesis transaction codes can be compared to the symbols of the watermarked signal until a likely match is found. Or, if certain transaction codes are suspected (e.g., based on their association with suspected persons), those codes can be compared first to prevent unnecessary computations.
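The search order described above might be sketched as follows. The example assumes codes are represented as tuples of symbols and that a "likely match" means the fraction of matching symbols reaches a chosen threshold; the threshold value, the function name, and the representation are all hypothetical.

```python
def find_likely_code(recovered, candidates, suspects=(), min_match_fraction=0.9):
    """Return (code, match_fraction) for the first hypothesis code that reaches
    the threshold, trying suspected codes before the remaining candidates.
    Returns None when no hypothesis reaches the threshold."""
    suspect_list = list(suspects)
    suspect_set = set(suspect_list)
    # Suspected codes first, then every other candidate in its given order.
    ordered = suspect_list + [c for c in candidates if c not in suspect_set]
    for code in ordered:
        if len(code) != len(recovered):
            raise ValueError("hypothesis and recovered symbols differ in length")
        matches = sum(1 for r, h in zip(recovered, code) if r == h)
        fraction = matches / len(code)
        if fraction >= min_match_fraction:
            return code, fraction  # stop early; later candidates are not examined
    return None
```

Stopping at the first hypothesis that reaches the threshold is a design choice of this sketch; an exhaustive pass that ranks all candidates would be an equally valid reading of the text.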
The transaction code symbols can be binary or other M-ary symbols.
In another aspect of the invention, a method is provided for analyzing content (such as a music piece) that is distributed via a plurality of distribution points (such as web sites) in a network (such as the Internet). The distribution points may be suspected of unauthorized re-distribution of the content, or a check can be made periodically of relevant web sites as a matter of policy by the copyright holder of the content, or its agent. The content is obtained from the distribution points, e.g., using a web crawler. For each of the distribution points, it is determined whether the content thereof includes a common watermark that corresponds to a predetermined common watermark code. The content may contain no common watermark, or it may contain a common watermark that does not correspond to the predetermined common watermark code.
However, for the content that includes the common watermark at issue, a transaction watermark is retrieved from the content, and a transaction code associated with the transaction watermark is identified.
In this case, the identified transaction code can be compared with at least one hypothesis transaction code to determine a correspondence therebetween, and a probability can be determined that the identified transaction code matches the hypothesis transaction code.
The presence of the common code in the content tells us that the associated distributor is an unauthorized distributor (assuming the authorized distributor has not just visited its own web site). The transaction code tells us how the content got to the unauthorized site (i.e., by associating the transaction code with an original purchaser or user of the content who subsequently redistributed it, presumably without authorization).
Of course, it is possible that the original purchaser was authorized in distributing the content to a second person, and the second person redistributed the content illegally. But, in any case, the start of the chain of distribution can be tracked down.
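The two-stage check over distribution points can be outlined as below. This is only a sketch: the content is assumed to have been fetched already, and `detect_common` and `extract_transaction` are hypothetical callables standing in for the common-watermark detector and the transaction-watermark extractor, neither of which is specified here.

```python
def scan_distribution_points(contents, expected_common_code,
                             detect_common, extract_transaction):
    """Map each distribution point to the transaction code found in its content.

    contents:             dict of distribution point name -> obtained content.
    expected_common_code: the predetermined common watermark code.
    detect_common:        callable(content) -> common code, or None if absent.
    extract_transaction:  callable(content) -> transaction code.
    Only content carrying the expected common watermark is examined further.
    """
    findings = {}
    for point, content in contents.items():
        if detect_common(content) != expected_common_code:
            continue  # no common watermark, or one for different content
        findings[point] = extract_transaction(content)
    return findings
```

Checking the common watermark first means the transaction extractor runs only on content that has already been recognized as the protected work.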
Corresponding apparatuses are also presented.