Modern media content distribution systems such as mobile video transmission systems are becoming increasingly popular. Bitstream scalability is a desirable feature in such systems. An encoded media bitstream is generally called scalable when parts of the bitstream can be removed so that the resulting sub-bitstream is still decodable by a target decoder. The media content of the sub-bitstream can be reconstructed at a quality that is lower than that of the original bitstream, but still high considering the resulting reduction in transmission and storage resources. Bitstreams that do not have this property are referred to as single-layer bitstreams.
Scalable Video Coding (SVC) is one solution to the scalability needs posed by the characteristics of video transmission systems. The SVC standard as specified in Annex G of the H.264/Advanced Video Coding (AVC) specification allows the construction of bitstreams that contain scaling sub-bitstreams conforming to H.264/AVC. H.264/AVC is a video compression standard equivalent to the Moving Picture Experts Group (MPEG)-4 AVC standard.
The SVC standard encompasses different scalability concepts as described, for example, in H. Schwarz et al., “Overview of the Scalable Video Coding Extension of the H.264/AVC standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 17, No. 9, September 2007. For spatial and quality bitstream scalability, i.e., the generation of a sub-bitstream with lower spatial resolution or quality than the original bitstream, Network Abstraction Layer (NAL) units are removed from the bitstream when deriving the sub-bitstream. In the case of spatial and quality bitstream scalability, inter-layer prediction, i.e., the prediction of the higher spatial resolution or quality bitstream based on information contained in the lower spatial resolution or quality bitstream, is used for efficient encoding. For temporal bitstream scalability, i.e., the generation of a sub-bitstream with a lower temporal sampling rate than the original bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. An access unit is defined as a set of consecutive NAL units with specific properties. In the case of temporal bitstream scalability, high-level syntax and inter prediction reference pictures in the bitstream are constructed accordingly.
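The removal of NAL units or complete access units described above can be illustrated with a minimal sketch. It assumes a heavily simplified model in which each NAL unit carries only the three scalability identifiers of the SVC NAL unit header extension (temporal_id, dependency_id, quality_id). The class NalUnit, the field access_unit and the function extract_sub_bitstream are hypothetical names introduced for illustration; a real extractor must additionally parse the bitstream and honour the prediction dependencies between layers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical, simplified NAL unit: only the scalability identifiers."""
    access_unit: int    # index of the access unit this NAL unit belongs to
    temporal_id: int    # temporal layer (0 = lowest sampling rate)
    dependency_id: int  # spatial layer (0 = lowest resolution)
    quality_id: int     # quality layer (0 = lowest quality)


def extract_sub_bitstream(nal_units: List[NalUnit],
                          max_temporal_id: int,
                          max_dependency_id: int,
                          max_quality_id: int) -> List[NalUnit]:
    """Keep only the NAL units at or below the requested operating point."""
    return [
        n for n in nal_units
        if n.temporal_id <= max_temporal_id
        and n.dependency_id <= max_dependency_id
        and n.quality_id <= max_quality_id
    ]


# Assumed example stream: four access units with alternating temporal layers,
# each holding one lower-resolution and one higher-resolution NAL unit.
stream = [NalUnit(au, au % 2, dep, 0) for au in range(4) for dep in (0, 1)]

# Spatial scalability: drop the higher-resolution NAL units, keep all access units.
low_resolution = extract_sub_bitstream(stream, 1, 0, 0)

# Temporal scalability: every NAL unit of access units 1 and 3 is dropped,
# so complete access units disappear from the sub-bitstream.
low_rate = extract_sub_bitstream(stream, 0, 1, 0)
```

In this sketch all NAL units of an access unit share the same temporal_id, which is why filtering on that field removes access units as a whole.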
In the SVC standard, the sub-bitstream having a lower temporal sampling rate, lower spatial resolution or lower quality is referred to as Base Layer (BL) sub-bitstream, while the higher temporal sampling rate, higher spatial resolution or higher quality sub-bitstream is referred to as Enhancement Layer (EL) sub-bitstream. In scenarios with multiple sub-bitstreams of, for example, different higher spatial resolutions, two or more EL sub-bitstreams may be provided in total. Each sub-bitstream can be interpreted as constituting a separate media layer.
An image of an SVC video image sequence is represented as a so-called “frame” (i.e., as an encoded representation of this image). Each SVC sub-bitstream comprises a sequence of so-called SVC “sub-frames”. Each SVC sub-frame constitutes either a full SVC frame or a fraction of an SVC frame. In other words, each SVC frame is either represented as a single data item (i.e., one BL “sub-frame” or one EL “sub-frame”) or is sub-divided into at least two separate data items, i.e., into one BL “sub-frame” containing only the BL information associated with the respective frame and (at least) one EL “sub-frame” containing the EL information associated with the respective frame.
The scalability feature introduced by the SVC standard allows for a bitstream adaptation dependent on, for example, decoder capabilities, display resolutions and available transmission bit rates. If only the BL sub-frames are decoded, the video content can be rendered, for example, at a base resolution or quality (e.g., at Quarter Video Graphics Array, or QVGA, resolution). If, on the other hand, both the BL and the EL sub-frames are decoded, then the video content can be rendered at a higher resolution or quality (e.g., at VGA resolution or High-Definition (HD) resolution).
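The adaptation decision described above can be sketched as follows. The sketch assumes that layers are listed base layer first and that each EL depends on all layers before it, so only a prefix of the list can be chosen. The class Layer, the function select_layers and the bit rate figures are hypothetical values chosen for illustration; only the QVGA (320x240) and VGA (640x480) resolutions follow from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    """Hypothetical description of one media layer (BL or EL)."""
    name: str
    width: int
    height: int
    bitrate_kbps: int  # bit rate contributed by this layer alone (assumed)


def select_layers(layers: List[Layer], display_width: int,
                  display_height: int, available_kbps: int) -> List[Layer]:
    """Return the longest prefix of layers that fits display and bit rate."""
    selected: List[Layer] = []
    total_kbps = 0
    for layer in layers:  # ordered base layer first
        total_kbps += layer.bitrate_kbps
        too_large = layer.width > display_width or layer.height > display_height
        if too_large or total_kbps > available_kbps:
            break  # higher layers depend on this one, so stop here
        selected.append(layer)
    return selected


# Assumed two-layer stream: QVGA base layer, VGA enhancement layer.
LAYERS = [Layer("BL", 320, 240, 300), Layer("EL", 640, 480, 900)]

vga_client = select_layers(LAYERS, 640, 480, 2000)    # BL and EL
slow_link = select_layers(LAYERS, 640, 480, 500)      # BL only
small_display = select_layers(LAYERS, 320, 240, 2000) # BL only
```

A client for which even the BL does not fit receives an empty selection in this sketch; how such a case is handled in practice is outside the scope of the example.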
In order to control the distribution and consumption of media content, for example, media content distributed based on the above-described SVC standard, the media content can be protected with a Digital Rights Management (DRM) system. Under the DRM framework, content is securely distributed to and consumed by authorized recipients, for example authenticated user devices, per the usage rights expressed by the content issuer (other names for content issuer include content provider, content owner, content distributor, and the like). The DRM framework is independent of content formats, operating systems, communication channels, and runtime environments. Content protected based on DRM can be a wide variety of media content such as documents, images, ringtones, music clips, video clips, streaming media, games, and so on.
A known DRM system for content and service protection is included in the Open Mobile Alliance (OMA) Mobile Broadcast Services Enabler Suite (BCAST). DRM components of OMA BCAST are described in document “Service and Content Protection for Mobile Broadcast Services”, Approved Version 1.0, 12 Feb. 2009 by OMA. Therein, DRM Profile and Smartcard Profile are described as two main systems for providing service and content protection. OMA BCAST uses a four-layer model key management architecture for service and content protection.
FIG. 1 shows a block diagram of encrypted media content distribution between a server and a client. Encryption and decryption are carried out according to the OMA BCAST standard. The OMA BCAST four-layer model key management architecture is based on layers L1 to L4. In layer L1, trust is established between server and client based on a Subscriber Management Key (SMK). The SMK is a user key that is provided to the client based on the Generic Bootstrapping Architecture (GBA) protocol. GBA is described in document “Generic Authentication Architecture, Generic Bootstrapping Architecture (Release 6)”, 3rd Generation Partnership Project, Technical Specification 3GPP TS 33.220. Based on the trusted relation between server and client, it is ensured that only trusted clients get access to Service Encryption Keys (SEK) in subscription management layer L2. The SEK is a long-term key that is provided to the client through a Long Term Key Message (LTKM). In layer L2, the SEK is encrypted and decrypted based on the SMK. Similarly, in traffic key layer L3, a Traffic Encryption Key (TEK) is encrypted and decrypted based on the SEK. The TEK is a short-term key and is delivered through a Short Term Key Message (STKM). Note that the key management layers L1 to L4 are unrelated to the SVC media layers, although the term “layer” is used herein for both.
SEK and TEK are distributed based on the Multimedia Internet KEYing (MIKEY) protocol, which is described in document “MIKEY: Multimedia Internet KEYing”, RFC 3830, August 2004 by the Internet Engineering Task Force (IETF). One difference between SEK and TEK is that the TEK typically changes more frequently than the SEK. The terms long-term key and short-term key derive from this difference: the TEK is valid for a shorter period than the SEK. In content layer L4, the media content is decrypted based on the TEK.
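The chain of keys across layers L2 to L4 can be illustrated with a minimal sketch. It assumes that the SMK of layer L1 is already shared between server and client, and it models the LTKM and the STKM simply as the encrypted SEK and the encrypted TEK. The function toy_cipher is a deliberately insecure stand-in built from SHA-256 so that the example runs with the Python standard library alone; it is not the cipher specified by OMA BCAST or MIKEY, and the names server_protect and client_recover are hypothetical.

```python
import hashlib
import os


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Insecure stand-in cipher: XOR with a SHA-256-derived keystream.

    Applying it twice with the same key restores the input. It only
    illustrates the key hierarchy and must not be used for protection.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def server_protect(smk: bytes, sek: bytes, tek: bytes, content: bytes):
    """Server side: wrap each key with the key of the layer above it."""
    ltkm = toy_cipher(smk, sek)           # L2: SEK protected by the SMK
    stkm = toy_cipher(sek, tek)           # L3: TEK protected by the SEK
    protected = toy_cipher(tek, content)  # L4: content protected by the TEK
    return ltkm, stkm, protected


def client_recover(smk: bytes, ltkm: bytes, stkm: bytes, protected: bytes) -> bytes:
    """Client side: unwrap the keys in order and decrypt the content."""
    sek = toy_cipher(smk, ltkm)
    tek = toy_cipher(sek, stkm)
    return toy_cipher(tek, protected)


# Assumed 16-byte keys; the SMK is taken as already established in layer L1.
smk, sek, tek = os.urandom(16), os.urandom(16), os.urandom(16)
ltkm, stkm, protected = server_protect(smk, sek, tek, b"media content")
recovered = client_recover(smk, ltkm, stkm, protected)
```

Because the TEK changes more frequently than the SEK, a server following this scheme would issue many STKMs for each LTKM; the sketch shows a single pass through the hierarchy only.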
Content issuers providing DRM-protected media content to clients (i.e., users) have an interest in knowing how much media content is consumed by the client. Such data can be used by the content issuer to charge the user based on the consumed amount of media content. Moreover, in the case of scaled media content distribution (for example, in accordance with the SVC standard), the content issuer has an interest in knowing at which resolution or quality the media content has been consumed by the client. Such data can be used by the content issuer to charge the user at different rates based on the consumed media content resolution or quality. Furthermore, since content issuers and network operators (which physically deliver the media content via their networks to the client) are often unrelated companies, the content issuer has an interest in checking whether the network operator distributes the media content to the client with a guaranteed resolution or quality. Since user charging and checking the network operator's media content distribution are critical for the content issuer, it is desirable to determine encrypted media content usage in a tamper-proof manner.
However, no tamper-proof solution for determining usage of encrypted media content exists.
Document “Service and Content Protection for Mobile Broadcast Services”, Approved Version 1.0, 12 Feb. 2009, by OMA discloses, in Chapter 6.6.7.8, transmitting a “consumption_reporting_flag” in an LTKM. This flag can be used to determine SEK usage. However, since SEKs typically change infrequently, no conclusion on media content usage can be drawn based on the consumption_reporting_flag.
Document WO 2004/017560 A1 concerns a technique for monitoring digital content provided from a content provider over a network. However, this document provides no disclosure regarding how usage of digital content can be monitored in a tamper-proof manner.