The protection of digital content, such as music, films, photographs and video games, against the creation and distribution of illegal copies is a significant issue for the media and entertainment industries, particularly for multimedia content providers and copyright owners.
Various technical solutions are known for countering the creation and distribution of illegal copies. Solutions based on data encryption prevent the actual creation and distribution of illegal copies. Dissuasive solutions based on the traceability of legal copies enable the source of an illegal copy to be identified. The domain of the invention falls into this last category of dissuasive solutions.
The identification of the sources of an illegal copy has numerous applications. For example, a video-on-demand server distributes personal copies of the same content to different clients. Some dishonest clients, called pirates, illegally redistribute a copy of this content, for example on a P2P (Peer-To-Peer) network. The copyright owner would like to identify the pirates. To do this, the video-on-demand server inserts a unique identifier into each of the copies using a video watermarking technique, thus producing as many different copies as there are clients, although the copies appear to be identical. The identifier contained in the illegal copy thus enables the source of this illegal copy, and therefore the pirate, to be identified. However, in order not to be recognised, a group of pirates can alter the identifier by constructing an illegal copy from a mix of their different copies: this is collusion of copies. Finally, this same group of pirates can, notably by applying lossy compression to the illegal copy, attempt to introduce errors into the identifier of the illegally distributed copy and thus either have innocent people accused or mask the identity of the pirates.
To counter collusion, it is known to use, as the identifier inserted into the copy by watermarking, a sequence of symbols of an anti-collusion code. Cryptographers, such as D. Boneh and J. Shaw in “Collusion-secure fingerprinting for digital data” (in “IEEE Transactions on Information Theory” volume 44, pages 1897-1905, September 1998), have shown the existence of an optimal code of minimum length that enables, by decoding the mix of a finite number of code sequences, the subset of original sequences used for the collusion to be identified, whatever the collusion strategy used to create the copy.
One such optimal code, well known and widely used, was proposed by Tardos in 2003 in “Optimal probabilistic fingerprint codes” (in “Proc. of the 35th annual ACM symposium on theory of computing” pages 116-125, San Diego, Calif., USA, 2003. ACM). This probabilistic code meets the performance criteria of a decoding system that is uniformly efficient whatever the collusion strategy employed. It is characterized by its length, which depends upon the number of users (corresponding to the number of sequences in the code), the maximum number of dishonest users (corresponding to the number of sequences in the code that the decoder attempts to identify), the number of symbols in the alphabet, the false alarm probability (the probability of accusing a user who was not party to the collusion) and the miss probability (the probability of not identifying a user who was party to the collusion).
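By way of illustration only, the construction and decoding of a binary probabilistic code of this kind can be sketched as follows. The sketch is a simplified assumption-laden example, not the method of the invention: the function names (generate_code, collude, symmetric_scores) are hypothetical, the cutoff 1/(300c) is taken from the Tardos construction, the collusion strategy shown (each position copied from a randomly chosen pirate) is only one possible strategy, and the score used is the symmetric variant rather than the original Tardos accusation function.

```python
import math
import random


def generate_code(n_users, length, c, rng):
    """Draw one secret bias per position, then one binary sequence per user."""
    t = 1.0 / (300 * c)                     # cutoff used in the Tardos construction
    t_prime = math.asin(math.sqrt(t))       # so that sin^2(t_prime) == t
    # Biases p_i = sin^2(r_i), with r_i uniform on [t', pi/2 - t'].
    biases = [math.sin(rng.uniform(t_prime, math.pi / 2 - t_prime)) ** 2
              for _ in range(length)]
    # Symbol i of every user is 1 with probability p_i, independently.
    code = [[1 if rng.random() < p else 0 for p in biases]
            for _ in range(n_users)]
    return biases, code


def collude(code, pirates, rng):
    """Assumed strategy: copy each position from a randomly chosen pirate."""
    length = len(code[0])
    return [code[rng.choice(pirates)][i] for i in range(length)]


def symmetric_scores(code, biases, pirated):
    """Accusation score of every user against the pirated sequence."""
    scores = []
    for seq in code:
        s = 0.0
        for x, y, p in zip(seq, pirated, biases):
            if y == 0:
                # Symmetric variant: treat a 0 in the pirated copy like a 1
                # after swapping the roles of the two symbols.
                x, p = 1 - x, 1 - p
            if x == 1:
                s += math.sqrt((1 - p) / p)   # user agrees with the pirated symbol
            else:
                s -= math.sqrt(p / (1 - p))   # user disagrees
        scores.append(s)
    return scores
```

A decoder built on this sketch would accuse the users whose score exceeds a threshold chosen according to the targeted false alarm probability; the choice of that threshold is not shown here.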
Philips has shown (in “Tardos fingerprinting is better than we thought” by B. Skoric, T. Vladimirova, M. Celik, and J. Talstra, “IEEE Transactions on Information Theory” volume 54, pages 3663-3676, August 2008) that, in order to maintain the probability of a false alarm below a certain threshold ε, the length of the code must be greater than $2\pi^2 c^2 \log(n\varepsilon^{-1})$ for a binary code such as that of Tardos, where c represents the number of dishonest clients and n the total number of clients. Philips (in “Symmetric Tardos fingerprinting codes for arbitrary alphabet sizes” by B. Skoric, S. Katzenbeisser and M. Celik, “Designs, Codes and Cryptography”, 46(2):137-166, February 2008) also addressed the generalisation of the Tardos code to a code designed on an alphabet of arbitrary size, again with the aim of minimizing the length of the code.
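To give an order of magnitude, the bound quoted above can be evaluated numerically. The short sketch below assumes that the logarithm is the natural logarithm, which the text does not state explicitly; the function name min_code_length is hypothetical.

```python
import math


def min_code_length(n_users, c_colluders, eps):
    """Evaluate 2 * pi^2 * c^2 * ln(n / eps), rounded up to a whole symbol count.

    Assumption: 'log' in the quoted bound is the natural logarithm.
    """
    bound = 2 * math.pi ** 2 * c_colluders ** 2 * math.log(n_users / eps)
    return math.ceil(bound)
```

Under this assumption the bound grows with the square of the number of dishonest clients but only logarithmically with the total number of clients, so doubling c roughly quadruples the required length, whereas doubling n adds only a constant number of symbols.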
The technical problem with these solutions is the length of the code. On the one hand, since the number of symbols that can be hidden in a multimedia content using watermarking techniques is limited, the length of the code must remain compatible with the size of the content to be identified. On the other hand, since the complexity of the decoding is directly related to the length of the code, this length must remain minimal in order to limit the computing power and memory size required of the decoding device. Moreover, the length of anti-collusion codes increases in order to handle the growing number of users of services, particularly of video-on-demand services, and potentially of dishonest users.
In a first patent application filed the same day as the present application by the same applicant, an iterative method for decoding such a code was proposed, comprising a step of estimating the collusion strategy associated with a step of identifying the sequences present in the collusion. This method has the advantage of improving the decoding performance of the code and of being resistant to errors introduced by noise in the transmission or by transformation of the content. The improved decoding performance makes it possible to limit the length of the code used to identify the sequences in the collusion and thus provides a solution to the technical problem. However, this decoding method, due to its iterative nature, requires significant computing power and memory size. A second technical problem is therefore the complexity of iterative decoding, which can render it impossible to use for a code supporting a great number of sequences.