Protecting digital content such as music, films, photographs and video games against the creation and distribution of illegal copies is a major concern for the media and entertainment industries, in particular for multimedia content providers and copyright holders.
Various technical solutions are known for combating the creation and distribution of illegal copies. Solutions based on data encryption aim to prevent illegal copies from being created and distributed in the first place. Dissuasive solutions based on the traceability of legal copies make it possible to identify the source of an illegal copy. The field of the invention falls within this latter category of dissuasive solutions.
Identifying the source of an illegal copy has many applications. For example, a video-on-demand server distributes personal copies of the same content to different customers. Certain dishonest customers, called pirates, illegally redistribute a copy of this content, for example over a P2P (“Peer-To-Peer”) network. The copyright holder wishes to identify the pirates. To this end, the video-on-demand server inserts a unique identifier into each copy by means of a video watermarking technique, which produces copies that all differ although they appear identical. The identifier contained in an illegal copy thus makes it possible to identify the source of that copy and therefore the pirate. However, to avoid being recognised, a group of pirates can alter the identifier by forming an illegal copy from a mix of their various copies: this is known as collusion of copies. Finally, the same group of pirates can attempt to introduce errors into the identifier of the illegally redistributed copy, in particular by applying lossy compression to it, and thereby either have an innocent customer accused or conceal their own identity.
To counter this, it is known to use, as the identifier inserted into the copy via watermarking, a sequence of symbols of an anti-collusion code. Cryptologists such as D. Boneh and J. Shaw, in “Collusion-secure fingerprinting for digital data” (in “IEEE Transactions on Information Theory” volume 44, pages 1897-1905, September 1998), have demonstrated the existence of an optimal code of minimum length which makes it possible, by decoding the mix of a finite number of sequences of the code, to identify the subset of original sequences used for the collusion, regardless of the collusion strategy used to create the copy.
One such optimal and widely used code was proposed by Tardos in 2003 in “Optimal probabilistic fingerprint codes” (in “Proc. of the 35th annual ACM symposium on theory of computing”, pages 116-125, San Diego, Calif., USA, 2003. ACM). This probabilistic code meets the performance criterion of decoding that is uniformly effective regardless of the collusion strategy used. It is characterised by its length, which depends on the number of users (corresponding to the number of sequences in the code), the maximum number of dishonest users (corresponding to the number of sequences in the code for which identification will be sought), the number of symbols in the alphabet, the probability of a false alert (accusing a user who is not part of the collusion), and the probability of a miss (failing to identify a user who is part of the collusion).
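By way of illustration only, a binary probabilistic code of this kind can be sketched as follows. The sketch follows the general outline of the construction in the cited Tardos paper: a secret bias is drawn for each position, each user's sequence is drawn symbol by symbol from those biases, and each user is scored against the pirated sequence. The function names, the cutoff 1/(300c), the interleaving collusion strategy and the numerical values are assumptions made for this example; no accusation threshold is modelled, and the users are simply ranked by score.

```python
import math
import random

def generate_code(n_users, length, c_max, rng):
    """Draw one secret bias per position, then one binary sequence per user."""
    t = 1.0 / (300.0 * c_max)            # cutoff keeping biases away from 0 and 1 (assumed value)
    t_prime = math.asin(math.sqrt(t))
    biases = []
    for _ in range(length):
        r = rng.uniform(t_prime, math.pi / 2 - t_prime)
        biases.append(math.sin(r) ** 2)  # bias p_i lies in [t, 1 - t]
    codewords = [[1 if rng.random() < p else 0 for p in biases]
                 for _ in range(n_users)]
    return biases, codewords

def collude(codewords, colluders, rng):
    """Interleaving attack: each position is copied from a randomly chosen colluder."""
    length = len(codewords[0])
    return [codewords[rng.choice(colluders)][i] for i in range(length)]

def accusation_scores(biases, codewords, pirated):
    """Score each user; only positions where the pirated symbol is 1 contribute."""
    scores = []
    for word in codewords:
        s = 0.0
        for p, x, y in zip(biases, word, pirated):
            if y == 1:
                # Agreement on a rare symbol is rewarded, disagreement is penalised.
                s += math.sqrt((1 - p) / p) if x == 1 else -math.sqrt(p / (1 - p))
        scores.append(s)
    return scores

# Example run: 20 users, sequences of 2000 symbols, users 3 and 7 collude.
rng = random.Random(0)
biases, codewords = generate_code(n_users=20, length=2000, c_max=2, rng=rng)
pirated = collude(codewords, colluders=[3, 7], rng=rng)
scores = accusation_scores(biases, codewords, pirated)
suspects = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:2]
```

With sequences this long relative to the number of colluders, the two highest-scoring users are expected, with high probability, to be the colluders. A complete decoder would compare each score with a threshold chosen from the target probabilities of false alert and of miss.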
Philips has demonstrated (in “Tardos fingerprinting is better than we thought” by B. Skoric, T. Vladimirova, M. Celik, and J. Talstra, “IEEE Transactions on Information Theory” volume 54, pages 3663-3676, August 2008) that, in order to keep the probability of a false alert below a given threshold ε, the length of the code must be greater than 2π²c² log(nε⁻¹) for a binary code such as that of Tardos, where c represents the number of dishonest customers and n the total number of customers. Philips (in “Symmetric Tardos fingerprinting codes for arbitrary alphabet sizes” by B. Skoric, S. Katzenbeisser and M. Celik, “Designs, Codes and Cryptography”, 46(2):137-166, February 2008) also addressed the generalisation of the Tardos code to a code designed over an alphabet of arbitrary size, with the aim of minimising the length of the code.
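To give an order of magnitude, the lower bound on the length of a binary code quoted above can be evaluated numerically. The sketch below assumes that “log” denotes the natural logarithm; the function name and the numerical values are chosen purely for illustration.

```python
import math

def min_code_length(c, n, eps):
    """Smallest integer length strictly greater than 2*pi^2*c^2*ln(n/eps).

    c   -- number of dishonest customers
    n   -- total number of customers
    eps -- threshold on the probability of a false alert
    """
    if c < 1 or n < 1 or not (0.0 < eps < 1.0):
        raise ValueError("expected c >= 1, n >= 1 and 0 < eps < 1")
    bound = 2 * math.pi ** 2 * c ** 2 * math.log(n / eps)
    return math.floor(bound) + 1

# Illustrative values: one million customers, false-alert threshold of 1e-6.
for c in (2, 5, 10, 20):
    print(c, min_code_length(c, n=10**6, eps=1e-6))
```

Because the bound grows with the square of c but only logarithmically with n and 1/ε, doubling the number of colluders to be resisted roughly quadruples the required length, whereas adding customers has a comparatively small effect.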
The technical problem with these solutions is the length of the code. On the one hand, since the number of symbols that watermarking techniques can conceal in multimedia content is limited, the length of the code must remain compatible with the size of the content to be identified. On the other hand, since the complexity of decoding is directly linked to the length of the code, this length must remain minimal in order to limit the computing power and memory required by the decoding device.
In addition, anti-collusion codes must also cope with the growing number of users of services, in particular video-on-demand services, and of potentially dishonest users. Finally, the codes must also be resistant to errors introduced by transmission noise or by transformation of the content.