A terminal is used by a client of the service to access a content in order to play it. Accessing a multimedia content here means loading it into memory and lifting its protection, either on the fly while it is being received or from a storage medium on which it has previously been stored, in order to play it, store it, or make any other use thereof offered by the service for supplying protected multimedia content.
The contents supplied are:
audiovisual contents, for example television programmes,
audio-only contents, for example a radio programme, or
more generally, any digital content containing video and/or audio such as a computer application, a game, a slide show, an image or any set of data.
Among these contents, attention will be focused more particularly hereinbelow on the so-called temporal contents. A temporal multimedia content is a multimedia content, the playing of which is a succession in time of sounds, in the case of an audio temporal content, or of images, in the case of a video temporal content, or of sounds and of images temporally synchronized with one another in the case of an audiovisual temporal multimedia content. A temporal multimedia content can also include interactive temporal components temporally synchronized with the sounds or the images.
To be supplied, such a content is first of all encoded, that is to say compressed, so that its transmission requires a lesser bandwidth.
To this end, the video component of the content is encoded according to a video format such as MPEG-2. The interested reader will find a complete presentation of this format in the document published by the International Organization for Standardization under the reference ISO/IEC 13818-2:2013 and the title “Information technology—Generic coding of moving pictures and associated audio information—Part 2: Video”. Numerous other formats, such as MPEG-4 ASP, MPEG-4 Part 2, MPEG-4 AVC (or Part 10), HEVC (High Efficiency Video Coding), or WMV (Windows Media Video), can alternatively be used and rely on the same principles. Thus, everything described below for MPEG-2 coding applies equally to these other video formats.
MPEG-2 coding relies on general data compression methods. For still images, it notably exploits the spatial redundancy internal to an image, the correlation between neighbouring points and the lesser sensitivity of the eye to detail. For moving images, it exploits the strong temporal redundancy between successive images. Exploiting this redundancy makes it possible to code certain images of the content, here called deduced images, with reference to others, here called source images, for example by prediction or interpolation, such that they can be decoded only after said source images have been decoded. Other images, here called initial images, are coded without reference to such source images: each contains, once coded, all the information necessary for its decoding and can therefore be completely decoded independently of the other images. The initial images are thus the obligatory point of entry upon accessing the content. The resulting coded content therefore does not include the data necessary for decoding each image independently of the others, but is made up of “sequences” according to the MPEG-2 terminology. A sequence carries the compressed form of at least one “group of images” (or GOP, standing for Group Of Pictures, in MPEG-2). A group of images is a series of consecutive images in which each image is:
either initial and source for at least one deduced image contained in the same series of consecutive images,
or deduced and such that each of the source images necessary for its decoding belongs to the same series of consecutive images.
A group of images does not contain a series of consecutive images that is smaller and that has the same properties as above. The group of images is thus the smallest part of content that can be accessed without having to first decode another part of this content.
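As a purely illustrative sketch of this definition, the Python below splits a series of images into groups and finds the image from which decoding must start. It assumes a simplified model in which every group opens with an initial image and every deduced image refers only to images of its own group. The labels "I" (initial) and "D" (deduced) and the function names are assumptions introduced for the example and do not come from the MPEG-2 standard.

```python
from typing import List


def split_into_groups(kinds: List[str]) -> List[List[int]]:
    """Partition image indices into groups of images.

    kinds[i] is "I" for an initial image or "D" for a deduced image.
    Simplifying assumption: each group starts at an initial image and
    runs up to, but not including, the next initial image.
    """
    groups: List[List[int]] = []
    for index, kind in enumerate(kinds):
        if kind == "I":
            groups.append([index])  # an initial image opens a new group
        elif kind == "D":
            if not groups:
                raise ValueError("a deduced image cannot precede the first initial image")
            groups[-1].append(index)
        else:
            raise ValueError(f"unknown image kind: {kind!r}")
    return groups


def entry_point(groups: List[List[int]], wanted: int) -> int:
    """Return the index of the image from which decoding must start
    in order to display image number `wanted`."""
    for group in groups:
        if wanted in group:
            return group[0]
    raise IndexError(wanted)


groups = split_into_groups(list("IDDDIDDIDDDD"))
print(groups)                  # three groups of 4, 3 and 5 images
print(entry_point(groups, 6))  # decoding starts at the initial image of that group
```

In this model, reaching image 6 requires decoding from image 4, the initial image of its group, which is why the group is the smallest independently accessible unit.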
A sequence is delimited by a “header” and an “end”, each identified by a first specific code. The header comprises parameters which characterize properties expected of the decoded images, such as the horizontal and vertical sizes, the aspect ratio and the image frequency (frame rate). The standard recommends repeating the header between the groups of images of the sequence, so that its successive occurrences are spaced apart by approximately a few seconds in the coded content.
For example, a group of images most commonly comprises more than 5 or 10 images and, generally, fewer than 12 or 20 or 50 images. For example, in a system with 25 images per second, a group of images typically represents a playing time greater than 0.1 or 0.4 seconds and, generally, less than 0.5 or 1 or 10 seconds.
A temporal multimedia content can comprise a plurality of video components. In this case, each of these components is coded as described above.
The audio component of the content is moreover coded according to an audio format such as MPEG-2 Audio. The interested reader will find a complete presentation of this format in the document published by the International Organization for Standardization under the reference ISO/IEC 13818-3:1998 and the title “Information technology—Generic coding of moving pictures and associated audio information—Part 3: Sound”. Numerous other formats, such as MPEG-1 Layer III, better known as MP3, AAC (Advanced Audio Coding), Vorbis or WMA (Windows Media Audio), can alternatively be used and rely on the same principles. Thus, everything described below for MPEG-2 Audio coding applies equally to these other audio formats.
The MPEG-2 Audio coding obeys the same principles described above for that of a video temporal content. The resulting coded content is therefore, similarly, made up of “frames”. A frame is the audio analogue of a group of images in video. The frame is therefore notably the smallest part of audio content that can be accessed without having to decode another part of this audio content. The frame also contains all the information useful for its decoding.
A frame typically comprises more than 100 or 200 samples each coding a sound and, generally, fewer than 2000 or 5000 samples. Typically, when it is played by a multimedia appliance, a frame lasts longer than 10 ms or 20 ms and, generally, less than 80 ms or 100 ms. For example, a frame comprises 384 or 1152 samples each coding a sound. Depending on the signal sampling frequency, this frame represents a playing time of 8 to 12, or 24 to 36 milliseconds.
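The playing times quoted above follow directly from dividing the number of samples in a frame by the sampling frequency. The short Python sketch below reproduces them, assuming sampling frequencies of 32, 44.1 and 48 kHz, which are not stated in the text; the function name is likewise introduced only for the example.

```python
def frame_duration_ms(samples_per_frame: int, sampling_rate_hz: int) -> float:
    """Playing time of one audio frame, in milliseconds."""
    return 1000.0 * samples_per_frame / sampling_rate_hz


# Assumed sampling frequencies (Hz); the text only gives the resulting ranges.
for samples in (384, 1152):
    for rate in (48_000, 44_100, 32_000):
        print(samples, rate, round(frame_duration_ms(samples, rate), 1))
```

With these assumed frequencies, a 384-sample frame lasts between 8 ms (at 48 kHz) and 12 ms (at 32 kHz), and a 1152-sample frame between 24 ms and 36 ms, which matches the ranges given above.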
A temporal multimedia content can comprise a plurality of audio components. In this case, each of these components is coded as described above.
The coded components of the content, also qualified as basic bitstreams, are then multiplexed, that is to say, in particular, temporally synchronized, then combined into a single bitstream, or datastream.
Such a content, notably when it is the object of rights such as copyright or similar rights, is supplied protected by a multimedia content protection system. This system makes it possible to ensure that the conditions for accessing the content which evolve from these rights are observed.
It is then typically supplied encrypted as part of its protection by a Digital Rights Management system, or DRM. This encryption is generally performed by a symmetrical algorithm, by means of an encryption key. It is applied either to the stream resulting from the multiplexing or, before multiplexing, to the components of the coded content.
A DRM system is in fact a multimedia content protection system. The terminology of the field of digital rights management systems is thus used hereinbelow in this document. The interested reader will, for example, be able to find a more comprehensive presentation thereof in the following documents:
concerning the general architecture of a DRM system: DRM Architecture, Draft version 2.0, OMA-DRM-ARCH-V2_0-20040518-D, Open Mobile Alliance, 18 May 2004,
more particularly concerning the licences: DRM Specification, Draft version 2.1, OMA-TS-DRM-DRM-V2_1-20060523-D, Open Mobile Alliance, 23 May 2006.
In such a digital rights management system, the obtaining of a licence enables a terminal to access the protected multimedia content.
Of well-known structure, such a licence comprises at least one right of access necessary for this terminal to access the content, and typically a temporal validity criterion. The right of access typically comprises a key, called content key, necessary for decrypting the protected multimedia content by means of a symmetrical decryption algorithm. The temporal validity criterion characterizes the period of time over which the licence can be used. It typically consists of one or more time intervals. Outside of these time intervals, the licence does not allow access to the content.
The content key is generally inserted into the licence in the form of a cryptogram obtained by encryption of the content key with an encryption key, called “terminal key”, specific to the terminal.
To access the content, the terminal extracts the content key from the licence, by decrypting its cryptogram using its terminal key.
The terminal then descrambles the content by means of the content key duly extracted from the licence, thus lifting the protection. Then, the terminal decodes the descrambled content.
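The following Python sketch illustrates the sequence just described: the content key travels in the licence as a cryptogram computed with the terminal key, and the terminal recovers it before lifting the protection of the content. It is a minimal sketch under stated assumptions: `toy_cipher` is a deliberately simplistic stand-in for a real symmetrical algorithm (it is not secure and is used only so that the example runs with the standard library), and the dictionary layout of the licence is hypothetical.

```python
import hashlib
import os


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR with a SHA-256 keystream): NOT secure.

    Stand-in for a real symmetrical algorithm. Applying it twice with
    the same key restores the input, so it both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Supply side (all values are hypothetical).
terminal_key = os.urandom(16)
content_key = os.urandom(16)
clear_content = b"coded multimedia content"
protected_content = toy_cipher(content_key, clear_content)
licence = {"content_key_cryptogram": toy_cipher(terminal_key, content_key)}

# Terminal side: extract the content key from the licence,
# then lift the protection of the content with it.
recovered_key = toy_cipher(terminal_key, licence["content_key_cryptogram"])
recovered_content = toy_cipher(recovered_key, protected_content)
print(recovered_content == clear_content)
```

A terminal holding a different terminal key would recover a wrong content key and therefore could not lift the protection.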
The terminal thus generates a free-to-air or free access multimedia stream comprising at least one temporal series of video sequences or of groups of images, or of audio frames. This multimedia stream is suitable for being played by a multimedia appliance connected to this terminal. Here, “free-to-air” describes the fact that the multimedia stream does not need to be descrambled to be played, by a multimedia appliance, in a manner that is directly perceptible and intelligible to a human being. “Multimedia appliance” further describes any device suitable for playing the free-to-air multimedia stream, such as, for example, a television or a multimedia player.
In order to improve the protection thereof, the content is supplied, by the system supplying protected multimedia content, split into a plurality of successive content segments that are individually protected by the digital rights management system. These segments are therefore ordered temporally relative to one another.
More specifically, a segment is a restricted part of the free-to-air multimedia stream whose playing time is shorter than that of the entire multimedia stream. A segment therefore comprises a restricted part of each video and audio component of the free-to-air multimedia stream, whose playing time is shorter than that of the entire multimedia stream. These restricted component parts are synchronized in the stream to be played simultaneously. A segment therefore comprises the restricted part of the temporal series of video sequences or of groups of images, or of audio frames producing the coding of this restricted component part of the free-to-air multimedia stream. This restricted part is made up of a plurality of successive video sequences or groups of images or audio frames. Successive should be understood here to mean being immediately followed, that is to say without being separated, in the temporal unfolding of the content, by other video sequences or groups of images or audio frames belonging to another segment. Typically, a segment comprises more than 10, 100, 1000 or 10 000 groups of successive video images of one and the same coded video component of the stream, or more than 10 to 100 times more successive audio frames of one and the same coded audio component of the stream.
Each segment is encrypted by the symmetrical algorithm, as part of its protection by the digital rights management system, by means of a specific content key. This content key is said to be “specific” in that it is only used to encrypt this segment out of all the segments of the multimedia content. The obtaining of a specific licence, including the specific content key necessary for decrypting the protected segment, enables a terminal to access this segment.
A segment is not therefore characterized by its structure, but by the specific content key used to encrypt it. A segment is the plurality of immediately successive video sequences and audio frames encrypted with one and the same specific content key.
To further improve the protection of the content, an intermediate level of encryption of the content keys is introduced. It makes it possible to change, during the temporal unfolding of the content, the encryption keys used to compute the cryptograms of the specific content keys conveyed in the specific licences.
To this end, the segments are grouped together in blocks of segments. Each block contains only a restricted part of the segments of the content. Typically, each block contains at least one segment and, generally, a plurality of successive segments. Successive should be understood here to mean being immediately followed, that is to say without being separated, in the temporal unfolding of the content, by segments not belonging to the block concerned. A content key encryption key, called intermediate key, is associated with each of these blocks. The content key necessary for decrypting a segment is encrypted with the intermediate key associated with the block to which this segment belongs. The resulting cryptogram is then inserted into a licence, called intermediate licence, transmitted together with the segment. The intermediate licence also comprises an identifier of a licence, called “terminal licence”. The terminal licence comprises a cryptogram of the intermediate key obtained by the encryption of this intermediate key with the terminal key.
A block of segments is not therefore characterized by its structure, but by the intermediate key used to encrypt the specific content key of any segment that belongs to it. A block of segments therefore corresponds to the segments each associated with an intermediate licence in which the specific content key is encrypted with one and the same intermediate key.
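Since a block is defined only by the intermediate key shared by its segments, the blocks can be recovered by grouping successive segments whose intermediate licences point to the same intermediate key. The Python sketch below illustrates this, assuming for simplicity that each intermediate key is carried by exactly one terminal licence, so that the terminal licence identifier found in the intermediate licence can serve as the grouping criterion. The identifiers and the function name are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

# (segment number, identifier of the terminal licence named in the
# intermediate licence of that segment): hypothetical sample data.
intermediate_licences: List[Tuple[int, str]] = [
    (0, "TL-1"), (1, "TL-1"), (2, "TL-1"),
    (3, "TL-2"), (4, "TL-2"),
    (5, "TL-3"),
]


def blocks_of_segments(licences: List[Tuple[int, str]]) -> List[List[int]]:
    """Group successive segments that share the same terminal licence,
    and therefore (by assumption) the same intermediate key."""
    return [
        [segment for segment, _ in run]
        for _, run in groupby(licences, key=lambda item: item[1])
    ]


print(blocks_of_segments(intermediate_licences))
```

`itertools.groupby` only merges adjacent items, which mirrors the requirement that the segments of a block be successive.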
In such a system, a terminal therefore receives, together with an encrypted segment, an intermediate licence comprising the cryptogram of the specific content key necessary for decrypting the segment. This cryptogram has been obtained by encrypting this specific content key with an intermediate key. In order to access the segment, the terminal must first obtain the terminal licence which comprises the cryptogram of this intermediate key obtained by encrypting this key with its terminal key. The terminal obtains this terminal licence by means of the terminal licence identifier contained in the intermediate licence.
To use this terminal licence, the terminal must first evaluate its temporal validity criterion with respect to a service date, controlled by the service operator as the temporal reference of the service. This evaluation consists in determining whether or not the service date, typically expressed in seconds, falls within the validity period of the terminal licence. The terminal must therefore know or acquire the service date.
If the result of the evaluation of the temporal criterion of the terminal licence is positive, the terminal continues using the terminal licence, notably by decrypting the cryptogram of the intermediate key that it includes, by means of its terminal key. If the result of this evaluation is negative, the terminal disables the use of the terminal licence, and notably does not decrypt the cryptogram of the intermediate key that it includes. This thus prevents use of the intermediate licence, and the access to the protected segment using the content key for which it includes the cryptogram.
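The evaluation and its two outcomes can be sketched as follows in Python. This is a minimal sketch under stated assumptions: the validity period is modelled as a list of (start, end) intervals in seconds with both bounds included, and the decryption of the intermediate key is represented by a callable supplied by the caller. All names are introduced for the example.

```python
from typing import Callable, List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in seconds, bounds included (assumption)


def is_valid(service_date: int, validity: List[Interval]) -> bool:
    """True if the service date falls within one of the validity intervals."""
    return any(start <= service_date <= end for start, end in validity)


def use_terminal_licence(
    service_date: int,
    validity: List[Interval],
    decrypt_intermediate_key: Callable[[], bytes],
) -> Optional[bytes]:
    """Decrypt the intermediate key only if the temporal criterion is met."""
    if not is_valid(service_date, validity):
        return None  # licence disabled: the cryptogram is not decrypted
    return decrypt_intermediate_key()


validity = [(1_000, 2_000), (5_000, 6_000)]
print(use_terminal_licence(1_500, validity, lambda: b"intermediate key"))
print(use_terminal_licence(3_000, validity, lambda: b"intermediate key"))
```

The first call returns the intermediate key because the service date lies inside a validity interval; the second returns `None`, so neither the intermediate licence nor the protected segment can be used.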
The service date acquired by the terminal thus conditions its access to the block of segments to which the received segment belongs, and therefore compliance with the rights to which this block is subject. It will be understood that it is important for the service date not to be able to be modified easily by a user of the terminal. In effect, he or she could then set it to a date included in the validity period of the terminal licence, thus being exempted from the service date controlled by the operator, and from the observance of the rights to which the corresponding segment is subject.
To remedy this drawback, it has already been proposed to equip the terminals with secured local clocks, that is to say clocks which cannot be set by the user of the terminal. Such solutions are, for example, disclosed in the applications US20090006854, US20060248596 and US2010024000A1.
Numerous terminals, however, have no secured clock, that is to say no internal mechanism able to supply a date locally with a sufficient guarantee that it is close to the service date. Most do not in fact have a local clock at all, and the others have only an unsecured clock, that is to say one that is unprotected and therefore remains modifiable by the user.
To remedy this last difficulty, it has been proposed to incorporate a date server in the DRM system. A local clock, internal to the terminal but not protected, is then regularly synchronized with this date server, for example using the Network Time Protocol (NTP). If the terminal does not include any local clock, or if it does not use this local clock, then the service date is acquired from the date server systematically, each time the temporal validity criterion of a licence has to be evaluated.
This last embodiment is advantageous because, since recourse to the date server is systematic, it dispenses with the use of a local clock and imposes the use of a service date controlled by the operator. However, it generally places a significant demand, and therefore a significant computation load, on the date server, so that many servers are required to carry that load. Indeed, when numerous terminals request, within a short time interval, access to contents offered by the service supplying protected multimedia content, the temporal validity criterion of the corresponding licences has to be evaluated for each of these terminals, and each evaluation triggers a request to the date server. This also results in significant network traffic to the date servers. These significant computation and network traffic loads are likely to impair the quality of the service provided.
It is therefore particularly advantageous to reduce this load and this network traffic, while guaranteeing a high level of security of the system with respect to attempts to manipulate the service date and without the need to use a local clock in the terminal.