This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present invention that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
It has long been known to protect video data by encryption, notably in conditional access television systems. FIG. 1 illustrates a traditional prior art approach for content access control. The video signal CNT is first encoded 110 using a standard compression encoder, and the resulting bit stream CNT′ is then encrypted 120 using a symmetric encryption standard (such as DES, AES, or IDEA). The encrypted bit stream [CNT′] is then received by a receiver that decrypts 130 the encrypted bit stream [CNT′] to obtain an encoded bit stream CNT′ that is decoded 140 to obtain a video signal CNT that is, at least in theory, identical to the initial video signal. In this approach, called fully layered, compression and encryption are completely independent processes. The media bit stream is processed as classical plaintext data, with the assumption that all symbols or bits in the plaintext are of equal importance.
This scheme is relevant when the transmission of the content is unconstrained, but it seems inadequate in situations where resources (such as memory, power or computation capabilities) are limited. Much research highlights the specific characteristics of image and video content, namely a high transmission rate and a limited allowed bandwidth, which explain why standard cryptographic techniques are inadequate for such content. This has led researchers to explore a new scheme of securing the content, named “selective encryption”, “partial encryption”, “soft encryption”, or “perceptual encryption”, which applies encryption to a subset of a bit stream with the expectation that the resulting partially encrypted bit stream is useless without the decryption of the encrypted subset.
An exemplary approach is to separate the content into two parts: the first part is the basic part of the signal (for example Direct Current, DC, coefficients in Discrete Cosine Transform, DCT, decomposition, or the low frequency layer in Discrete Wavelet Transform, DWT, decomposition), which allows the reconstruction of an intelligible, but low quality version of the original signal, and a second part that could be called the “enhancement” part (for example Alternating Current, AC, coefficients in DCT decomposition of an image, or high frequency layers in DWT), which allows the recovery of fine details of the image and reconstruction of a high quality version of the original signal. According to this new scheme, only the basic part is encrypted, while the enhancement part is sent unencrypted or in some cases with light-weight scrambling. The aim is to protect the content and not the binary stream itself.
FIG. 2 illustrates selective encryption according to the prior art. Encoding and decoding are performed as in FIG. 1. In selective encryption, the encoded bit stream CNT′ is encrypted 220 depending on selective encryption parameters 240. These parameters may, as mentioned, for example state that only the DC coefficients or the low frequency layer should be encrypted, while the rest of the encoded bit stream CNT′ should be left unencrypted. The partially encrypted bit stream [CNT′] is then (partially) decrypted 230 depending on the selective encryption parameters 240.
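By way of illustration only, the selective encryption of FIG. 2 may be sketched as follows. The sketch assumes that the selective encryption parameters are given as a list of byte ranges, and it uses a toy keystream built from SHA-256 in counter mode as a stand-in for a real symmetric cipher; the names `keystream` and `selective_xor` are hypothetical and do not come from the cited art.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def selective_xor(stream: bytes, key: bytes, ranges: list[tuple[int, int]]) -> bytes:
    """Encrypt or decrypt only the byte ranges [start, end) listed in `ranges`.

    `ranges` plays the role of the selective encryption parameters (assumed format).
    XOR is its own inverse, so the same call performs the partial decryption.
    """
    out = bytearray(stream)
    ks = keystream(key, sum(end - start for start, end in ranges))
    pos = 0
    for start, end in ranges:
        for i in range(start, end):
            out[i] ^= ks[pos]
            pos += 1
    return bytes(out)


# Hypothetical encoded stream: a 4-byte "basic" part followed by an enhancement part.
encoded = b"BASE" + b"ENHANCEMENT"
params = [(0, 4)]  # encrypt the basic part only
protected = selective_xor(encoded, b"secret key", params)
recovered = selective_xor(protected, b"secret key", params)
```

In this sketch the enhancement part of `protected` is byte-for-byte identical to that of `encoded`, and the receiver needs the same parameters as the transmitter in order to recover the original stream.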
As will be appreciated, selective encryption aims at reducing the amount of data to encrypt while achieving sufficient security at low cost. Selective encryption of multimedia content addresses video data, audio data, still images or a combination thereof.
If compression is used, then selective encryption can be applied during compression, “in-compression”, before compression, “pre-compression”, or after compression, “post-compression”.
WO 2010/000727 and “Selective Encryption of JPEG2000 Compressed Images with Minimum Encryption Ratio and Cryptographic Security”, A. Massoudi, F. Lefebvre, C. De Vleeschouwer, F-O Devaux, IEEE describe a selective encryption method for JPEG2000 still images. The basic idea is to benefit from the fact that JPEG2000 data is uniformly distributed and that it therefore isn't necessary to encrypt an entire block of data for the protection to be efficient. If a k-bit encryption key is used, one may encrypt fewer bits than the whole block, and it is optimal to encrypt exactly k bits of the block. If more than k bits are encrypted, then a brute-force attack on the key is the easier attack; if fewer than k bits are encrypted, then a brute-force attack on the encrypted part is easier. With exactly k bits encrypted, the two attacks are equally hard.
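The balance between the two brute-force attacks may be illustrated by a small sketch. It assumes, in line with the uniform distribution mentioned above, that guessing n encrypted bits costs about 2^n trials and that guessing a k-bit key costs about 2^k trials; the function names are hypothetical.

```python
def attack_cost_bits(key_bits: int, encrypted_bits: int) -> int:
    """log2 of the cheapest exhaustive search: guess the key or guess the encrypted bits."""
    return min(key_bits, encrypted_bits)


def smallest_optimal_n(key_bits: int, block_bits: int) -> int:
    """Smallest number of encrypted bits that reaches the best achievable attack cost."""
    candidates = range(1, block_bits + 1)
    best = max(attack_cost_bits(key_bits, n) for n in candidates)
    return next(n for n in candidates if attack_cost_bits(key_bits, n) == best)


# With a 128-bit key and a 1024-bit block, encrypting 128 bits is enough.
n_long_block = smallest_optimal_n(128, 1024)
# A block shorter than the key is encrypted in full.
n_short_block = smallest_optimal_n(128, 64)
```

Under these assumptions, encrypting more than k bits adds encryption work without raising the cost of the cheapest attack, which is why exactly k bits is the natural choice.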
The mentioned encryption method is a post-compression scheme: the contextual arithmetic EBCOT (Embedded Block Coding with Optimal Truncation) coded data are totally (if the block length is exactly k bits) or partially encrypted.
WO 2009/090258 describes protection of a JPEG2000 bit stream. Packets are ordered according to a distortion-to-rate ratio. The transmitter then iteratively replaces the packet having the highest ratio with random data until a target distortion is achieved. In order to use the protected bit stream, the receiver requests the original packets from the transmitter and replaces the random packets with the original packets. The goal is to perform selective encryption of the bit stream.
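The iterative replacement described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the method of WO 2009/090258 itself: the rate of a packet is taken to be its payload length, the distortion contributions of replaced packets are assumed to add up, and the names `Packet`, `protect` and `restore` are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass
class Packet:
    payload: bytes      # non-empty packet body; its length is used as the rate
    distortion: float   # assumed distortion introduced when the packet is unusable


def protect(packets: list[Packet], target_distortion: float):
    """Replace packets with random data, highest distortion-to-rate ratio first."""
    order = sorted(
        range(len(packets)),
        key=lambda i: packets[i].distortion / len(packets[i].payload),
        reverse=True,
    )
    protected = [p.payload for p in packets]
    removed: dict[int, bytes] = {}  # originals kept by the transmitter
    total = 0.0
    for i in order:
        if total >= target_distortion:
            break
        removed[i] = packets[i].payload
        protected[i] = os.urandom(len(packets[i].payload))
        total += packets[i].distortion
    return protected, removed


def restore(protected: list[bytes], removed: dict[int, bytes]) -> list[bytes]:
    """Receiver side: put the requested original packets back in place."""
    out = list(protected)
    for i, payload in removed.items():
        out[i] = payload
    return out


packets = [Packet(b"aaaa", 8.0), Packet(b"bb", 10.0), Packet(b"cccccccc", 4.0)]
protected, removed = protect(packets, target_distortion=12.0)
restored = restore(protected, removed)
```

With these example values the ratios are 2.0, 5.0 and 0.5, so the second and then the first packet are replaced before the target distortion of 12 is reached, and the third packet is left untouched.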
However, while these solutions work well for JPEG2000 data because the EBCOT compresses only signal data, they may be less suited for other signal formats.
For example, in H.264/MPEG-4 AVC the entropy coding is either Context-Adaptive Variable-Length Coding (CAVLC) or Context-based Adaptive Binary Arithmetic Coding (CABAC). In H.264, CABAC compresses signal data and header data. Header data are necessary for the H.264 parser to reconstruct the uncompressed data. If the CABAC data does not comply with the required format, then the parser fails and the decoder crashes.
A salient feature of H.264 is the use of a Network Abstraction Layer (NAL) that formats the so-called Video Coding Layer (VCL) into a kind of generic base from which network specific formats are generated.
FIG. 3 illustrates an exemplary H.264 stream structure 300. The H.264 stream structure 300 comprises a number of NAL units: Sequence Parameter Set (SPS), Picture Parameter Set (PPS), Instantaneous Decoding Refresh (IDR) Slice 1, Slice 2 310, Slice 3, another PPS . . . . The SPS and the PPS comprise various decoding parameters, the slices comprise image data and the IDR separates Groups of Pictures (GOPs) so that they are independent. Like the other slices, slice 2 310 comprises a header 312 and a body 314 comprising slice data. As will be appreciated, encrypting a whole slice means that the header (or a part of it) is also encrypted, and as this header is needed to interpret the NAL unit, such a scheme is doomed to fail.
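The stream structure of FIG. 3 may be made concrete with a short sketch that splits a byte stream into NAL units and reads the unit type from the first byte of each unit. The sketch assumes a byte stream delimited by 0x000001 start codes and only names the four unit types mentioned above; the helper names and the example bytes are hypothetical.

```python
# nal_unit_type values for the unit kinds discussed in the text
NAL_TYPES = {1: "non-IDR slice", 5: "IDR slice", 7: "SPS", 8: "PPS"}

START_CODE = b"\x00\x00\x01"


def split_nal_units(stream: bytes) -> list[bytes]:
    """Split a start-code-delimited byte stream into NAL units (simplified)."""
    units = []
    i = stream.find(START_CODE)
    while i != -1:
        start = i + len(START_CODE)
        j = stream.find(START_CODE, start)
        end = len(stream) if j == -1 else j
        # Drop zero bytes that belong to a following 4-byte start code or padding.
        unit = stream[start:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        i = j
    return units


def nal_type_name(unit: bytes) -> str:
    """The low five bits of the first byte carry the NAL unit type."""
    return NAL_TYPES.get(unit[0] & 0x1F, "other")


# Hypothetical stream: SPS, PPS, IDR slice, non-IDR slice, each with one dummy payload byte.
stream = (
    b"\x00\x00\x00\x01\x67\xaa"
    b"\x00\x00\x00\x01\x68\xbb"
    b"\x00\x00\x01\x65\xcc"
    b"\x00\x00\x01\x41\xdd"
)
names = [nal_type_name(u) for u in split_nal_units(stream)]
```

A parser of this kind relies on the bytes that follow each start code being readable, which illustrates why encrypting a slice together with its header prevents the stream from being interpreted.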
The prior art provides some selective encryption solutions for H.264/MPEG-4 AVC.
In “Fast protection of H.264/AVC by selective encryption of CABAC”, Z. Shahid, M. Chaumont, W. Puech, IEEE ICME, 2009, the authors propose to scramble the so-called Exp-Golomb code and the bit sign of quantized DCT coefficients. The Exp-Golomb code can be coded in a so-called “By Pass” mode, which means that the Exp-Golomb code does not affect the CABAC context. Thus, changing the Exp-Golomb code keeps the CABAC compliant with the H.264 standard.
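The principle of a format-preserving modification of an Exp-Golomb code may be illustrated as follows. The sketch uses the plain order-0 unsigned Exp-Golomb code (leading zeros, a one, then information bits) rather than the exact binarization used inside CABAC, and it is not the scheme of the cited paper; the function names are hypothetical. Only the information bits are XORed with key bits, so the prefix, and therefore the codeword length, is unchanged.

```python
def exp_golomb_encode(value: int) -> str:
    """Order-0 unsigned Exp-Golomb codeword as a bit string."""
    bits = bin(value + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def exp_golomb_decode(code: str) -> int:
    """Decode a single, complete order-0 codeword."""
    return int(code.lstrip("0"), 2) - 1


def scramble_suffix(code: str, key_bits: str) -> str:
    """XOR the information bits with key bits, leaving the prefix intact."""
    m = code.index("1")  # number of leading zeros equals number of information bits
    if len(key_bits) < m:
        raise ValueError("not enough key bits for this codeword")
    prefix, suffix = code[: m + 1], code[m + 1:]
    flipped = "".join(str(int(b) ^ int(k)) for b, k in zip(suffix, key_bits))
    return prefix + flipped


original = exp_golomb_encode(4)                 # "00101"
scrambled = scramble_suffix(original, "11")     # "00110", still a valid codeword
unscrambled = scramble_suffix(scrambled, "11")  # applying the same key bits restores it
```

Because the scrambled codeword has the same length and remains a valid codeword, a standard parser can still read it, although it decodes to a different value (5 instead of 4 in this example).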
The Exp-Golomb code is also modified in “Compliant selective encryption for H264/AVC video streams”, C. Bergeron, C. Lamy-Bergeot, Proceedings of the International Workshop on Multimedia Processing (MMSP '05), pp. 477-480, Shanghai, China, October-November 2005.
Other solutions scramble the Intra Prediction Mode (IPM). The level of distortion depends on the IPM frequencies, and the scrambling space is limited in these in-compression schemes. See “An Improved Selective Encryption for H264 Video based on Intra Prediction Mode Scrambling”, J. Jiang, Y. Liu, Z. Su, G. Zhang and S. Xing, Journal of Multimedia, vol. 5, no. 5, October 2005, and “A New Video Encryption Algorithm for H264”, Y. Li, L. Liang, Z. Su, J. Jiang, IEEE ICICS, 2005.
In general, in-compression schemes suffer from some weaknesses. They are often time consuming, and it is sometimes necessary to develop a new H.264 codec/parser as the solution is not compliant with the standard implementation.
In summary, it will be appreciated that the basic JPEG2000 solution cannot be adapted to H.264 to scramble CABAC data, since the required header data are then inaccessible before decryption. Modifying the CABAC data without analysis is likely to crash the H.264 parser and cause the decoder to fail. The main alternatives propose to modify data before the CABAC or to modify the Exp-Golomb code.
Both alternatives have drawbacks, such as a limited scrambling space, difficulty in finding the best tuning for the expected visual degradation, the need for a non-standard H.264 codec, and a scrambled stream that is non-compliant with the H.264 standard. Further, the bypass mode is often easily identified by an attacker, and the scrambling can be prone to brute-force attacks.
It can therefore be appreciated that there is a need for an improved selective encryption method for H.264 bit streams that ensures standard compliance. The present invention provides such a solution.