Multimedia data exchanges are ever increasing, thereby leading to a growing demand for distant video communications and to the development of systems whose objectives are to provide confidential and reliable exchanges of information.
The security aspects related to the confidentiality of these exchanges in the methods and systems known today are in general very inadequate. The current video coding standard does not offer any coding capabilities that meet these requirements, and coding schemes such as the MPEG format, which rely on predictive coding, are by nature poor candidates for encipherment.
Studies at the video coding experts group (VCEG) of the ITU-T were begun in 1999 in order to establish a new video standard capable of offering more effective compression than the existing solutions, while exhibiting a reasonable level of implementation complexity and ultimately being easily usable for network applications, in particular wireless networks and Internet networks. The MPEG consortium proposed the creation of a partnership with the VCEG experts group in order to establish a common standard, designated by the name H.264 or MPEG-4 AVC (advanced video coding). The final version of the document ITU JVT-G050, which is at present the normative reference document for this standard, specifies only the video coding aspects.
At present, the main applications of the H.264 standard are:
real-time duplex services for voice, for example videoconferencing over cable or wireless networks (such as UMTS, the Universal Mobile Telecommunication System), with a bitrate of less than 1 Mb/s and a small waiting lag;
good quality and high quality video services for satellite, xDSL, or DVD broadcasting transmission (“streaming”), where the bitrate lies between 1 and 8 Mb/s and where the waiting lag can be significant;
streams of lower quality for video services with a lower bitrate, such as Internet applications (a bitrate of less than 2 Mb/s and a waiting lag which can be significant).
The H.264 standard also includes two entropy coding modes: the context-adaptive binary arithmetic coding (CABAC) mode, which relies on arithmetic compression, and the context-adaptive VLC coding (CAVLC) mode, which relies on conventional variable-length codes.
Two families of codes are used in the latter mode: the Exp-Golomb codes which are VLC codes having a regular construction, and a CAVLC specific code which is used to code the data of the residual blocks, that is to say the values of the coefficients obtained after rearranging the block in a zigzag.
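The regular construction of the Exp-Golomb codes can be illustrated by a short sketch. The Python below is illustrative only: the function names `ue_encode` and `ue_decode` are hypothetical, and only the unsigned variant is covered, in which a value v is written as the binary form of v+1 preceded by one zero for each bit that follows the leading one.

```python
from typing import Tuple


def ue_encode(value: int) -> str:
    """Return the unsigned Exp-Golomb codeword of a non-negative integer as a bit string."""
    if value < 0:
        raise ValueError("unsigned Exp-Golomb codes are defined for value >= 0")
    binary = bin(value + 1)[2:]            # binary form of value + 1, no '0b' prefix
    return "0" * (len(binary) - 1) + binary  # zero prefix, then the binary form


def ue_decode(bits: str) -> Tuple[int, int]:
    """Decode one codeword at the start of `bits`; return (value, bits consumed)."""
    leading_zeros = 0
    while leading_zeros < len(bits) and bits[leading_zeros] == "0":
        leading_zeros += 1
    length = 2 * leading_zeros + 1         # prefix zeros + same number of info bits + the '1'
    if length > len(bits):
        raise ValueError("truncated codeword")
    return int(bits[leading_zeros:length], 2) - 1, length


if __name__ == "__main__":
    for v in range(5):
        print(v, ue_encode(v))
```

Because the length of the zero prefix announces the length of the remainder, a decoder can delimit each codeword without any lookup table, which is what makes the construction regular.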
The coding of a residual block is the most complex part in the H.264 coding method.
FIGS. 1a and 1b represent a scheme of a method of coding and decoding the slices of Intra (I) and Predictive (P) images of the H.264 standard.
As illustrated in FIGS. 1a and 1b, the method relies on steps of derivation and coding with VLC tables which can depend on the previously coded elements. Depicted therefore are the tables giving the number of nonzero coefficients or Total_coeff, the signs of the trailing values +/−1, or T1, the levels of the residual coefficients differing from zero, the total number of zeros before the last nonzero coefficient or Total_zeros, and the number of zeros preceding each coefficient differing from zero or run_before.
In the description, the following correspondences are used between the terms employed in FIGS. 1a and 1b and the English terms usually employed in the standard:
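The quantities named above can be derived from a residual block once it has been reordered in zigzag. The following Python is a minimal sketch under stated assumptions: the function name `cavlc_elements` and the dictionary keys are hypothetical, the coefficient sequence is purely illustrative, and the subsequent mapping of each quantity to a codeword through the VLC tables of the standard is not reproduced.

```python
def cavlc_elements(zigzag):
    """Derive the CAVLC quantities of a residual block given in zigzag order."""
    nonzero_idx = [i for i, c in enumerate(zigzag) if c != 0]
    reverse_idx = nonzero_idx[::-1]        # coefficients are processed from the last one back

    # Trailing ones: up to three consecutive +/-1 values, starting from the last coefficient.
    trailing_ones = 0
    for i in reverse_idx:
        if trailing_ones < 3 and abs(zigzag[i]) == 1:
            trailing_ones += 1
        else:
            break

    # Sign flag per trailing one (0 for +1, 1 for -1), in reverse order.
    t1_signs = [1 if zigzag[i] < 0 else 0 for i in reverse_idx[:trailing_ones]]
    # Remaining nonzero coefficients (the levels), also in reverse order.
    levels = [zigzag[i] for i in reverse_idx[trailing_ones:]]

    # Zeros located before the last nonzero coefficient.
    total_zeros = (nonzero_idx[-1] + 1 - len(nonzero_idx)) if nonzero_idx else 0

    # Zeros immediately preceding each nonzero coefficient, in reverse order.
    run_before = []
    for pos, i in enumerate(reverse_idx):
        previous = reverse_idx[pos + 1] if pos + 1 < len(reverse_idx) else -1
        run_before.append(i - previous - 1)

    return {
        "total_coeff": len(nonzero_idx),
        "trailing_ones": trailing_ones,
        "t1_signs": t1_signs,
        "levels": levels,
        "total_zeros": total_zeros,
        "run_before": run_before,
    }


if __name__ == "__main__":
    block = [0, 3, 0, 1, -1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    print(cavlc_elements(block))
```

For the illustrative block, the sketch yields 5 nonzero coefficients, 3 trailing ones, 3 zeros before the last nonzero coefficient, and the runs 1, 0, 0, 1, 1 in reverse order.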
Luminance prediction=prediction luma
Chrominance prediction=prediction chroma
Format of the coded block=Coded block pattern
delta of the quantization parameter (QP) of the macroblock=Mb_QP delta
Luminance residual=Luma residual
Chrominance residual=Chroma residual
continuous component (DC)=DC transform coefficient
other components or frequency components (AC)=AC transform coefficients
number of coded coefficients=coeff token
sign of the first successive +/−1=trailing ones (T1's) sign flag
values of the coefficients=coeff level (that is traditionally split into a prefix (level prefix or prefix value of the coefficient) and a suffix (level suffix or suffix value of the coefficient))
total number of remaining zeros=Total zeros
span of zeros preceding the value of the coefficient=run before
number of skipped macroblocks=Mb skip
macro-block of type P=MB of type P
type of sub macroblock=sub-MB type
reference frame number (frame used for the prediction of type P)=Ref Id
motion vector of the MB or of the sub-MB=Mb vect.
In most enciphering systems, the compressed video datum is processed as any other datum by the enciphering mechanism placed after the video coding method has terminated, and decrypted on the receiver side, before the start of the video decoding method.
Such a scheme adds latency and involves more calculations, since either the whole of the coded video stream is enciphered, or it is necessary to segment it into several streams which will be processed separately and thereafter reassembled on the decoder side. Other solutions have been introduced which closely combine the enciphering and compression methods.
The encipherment solutions implemented before the compression mechanism lead however to less effective encipherment methods.
It has been shown that the random permutations of the transformed coefficients “deform” the distribution of the probability of these coefficients, rendering the Huffman table less effective for the compression process.
The encipherment, which is the result of the cryptography method, is in particular aimed at ensuring the security of the message and at allowing access to the deciphered version only to authorized persons. The original message (corresponding to the data to be enciphered and called plaintext) is transformed into an enciphered message (composed of the enciphered data), called ciphertext, by virtue of an enciphering mechanism which generally relies on the use of a key, the secure exchange of which between the sender and the receiver guarantees that only the receiver is capable of deciphering the encrypted message.
To be considered secure, the enciphering mechanism must resist various types of attacks, among which is found the known-plaintext attack (relying on the knowledge of the initial message and of its enciphered version).
In cryptography, the advanced encryption standard AES, also known by the name “Rijndael” algorithm, is a block encipherment process which was adopted by the National Institute of Standards and Technology (NIST) as US FIPS PUB 197 in November 2001 after a standardization process lasting 5 years.
Replacing the data encryption standard (DES), the AES has a fixed block size of 128 bits and a key size of 128, 192 or 256 bits. No successful attack has currently been identified. This standard was recognized in 2003 by the National Security Agency (NSA) as possessing a sufficient level of security to protect information classified by the American government.
A block encipherment algorithm such as the AES must be used with a confidentiality mode such as the counter mode (termed CTR mode). This mode comprises the application of the encipherment to a sequence of input blocks, called counters, to produce a sequence of output blocks which can be used to produce the ciphertext. The reference NIST SP 800-38A, Recommendation for Block Cipher Modes of Operation: Methods and Techniques, December 2001, describes how to generate the appropriate unique blocks.
A conventional way to proceed is to combine, by applying an XOR (exclusive-or) procedure, the output block with the useful data (plaintext) to produce the enciphered data (or ciphertext), and vice versa at the decoder level, as illustrated in FIGS. 2a to 2d.
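The combination of output blocks and useful data by exclusive-or can be sketched as follows. This Python is a structural illustration only, under a loudly stated assumption: in order to remain self-contained it replaces the AES block encipherment by a truncated SHA-256 hash of the key and the counter, so it is NOT AES and must not be used to protect data. The names `keystream_block` and `ctr_xor` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 16  # bytes, i.e. the 128-bit block size of AES


def keystream_block(key: bytes, counter: int) -> bytes:
    """Produce one output block from a counter.

    ASSUMPTION: stand-in for the AES encipherment of the counter block.
    A truncated SHA-256 is used only to keep the sketch dependency-free.
    """
    counter_block = (counter % (1 << 128)).to_bytes(BLOCK_SIZE, "big")
    return hashlib.sha256(key + counter_block).digest()[:BLOCK_SIZE]


def ctr_xor(key: bytes, initial_counter: int, data: bytes) -> bytes:
    """XOR the data with successive output blocks; the same call enciphers and deciphers."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        block = keystream_block(key, initial_counter + offset // BLOCK_SIZE)
        chunk = data[offset:offset + BLOCK_SIZE]
        out.extend(d ^ s for d, s in zip(chunk, block))
    return bytes(out)


if __name__ == "__main__":
    key = b"illustrative key"
    plaintext = b"useful data spanning more than one block"
    ciphertext = ctr_xor(key, 0, plaintext)
    print(ctr_xor(key, 0, ciphertext) == plaintext)
```

Since XOR is its own inverse, the receiver recovers the plaintext by regenerating the same output blocks from the same key and counters and applying the same operation.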
The application of the X-OR procedure to useful data with AES in counter mode will generate ciphertext outputs taking all the possible configurations. Typically, for two useful data bits the following enciphered data configurations ‘00’, ‘01’, ‘10’, ‘11’ will be obtained, with equal probabilities.
For video applications within the framework of the present invention, to preserve compatibility with the video standard, only certain configurations can be used. Typically, when only the ‘00’ and ‘11’ configurations are possible, encipherment with the CTR mode with XOR, which is a standard mode employed with AES, cannot be considered. In this type of situation, a solution consists in using the output blocks provided by AES CTR not directly to carry out the encipherment, but to select, from among the possible configurations, the one which will be used as enciphered data for given useful data. In this case, to avoid selecting an unauthorized configuration, the possible configurations can be stored in a table with positions from 0 to n−1.
Two cases, illustrated in FIGS. 2c and 2d, can thus be distinguished:
1) when the number of possible configurations is a power of 2, i.e. n=2^k, it is easy to see that k bits can be used to carry out the encipherment. A procedure for doing this, which is termed circular, is to use these k bits to select a position i of a data configuration. Since AES CTR generates equiprobable output blocks, this circular encipherment exhibits good properties from the standpoint of resistance to encipherment analysis or cryptanalysis; see FIG. 2c for k=2, i=1.
2) when the number of possible configurations is not a power of 2, the situation becomes more complex. Using 2 bits signifies that the useful data ‘00’ will have 4 possible permutations, while the enciphered data will be selected from among the 3 possible configurations ‘00’, ‘01’, ‘10’. In all cases, a configuration would then be selected at least twice, thereby corresponding to too significant an angle of attack from the standpoint of resistance to deciphering attacks.
The solution is then to permit a slight angle of attack by allowing slightly asymmetric distributions of the permutations.
In practice, by considering a key of k bits, corresponding to 2^k possible output blocks, and a possible suite of n configurations, it is possible to shift from one configuration (i) to another by choosing the next 2^k modulo n (configuration i+2^k [n]).
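The selection of a configuration by means of k bits can be sketched as follows. The Python below rests on an assumed reading of the circular procedure: the k most significant bits of an output block are taken as an integer v, and the position i of the useful data in the table is shifted to position (i+v) modulo n. The function names and the sample output block are hypothetical.

```python
def key_value_from_block(output_block: bytes, k: int) -> int:
    """Take the k most significant bits of an output block as an integer in [0, 2**k)."""
    total_bits = 8 * len(output_block)
    if not 0 < k <= total_bits:
        raise ValueError("k must lie between 1 and the block length in bits")
    return int.from_bytes(output_block, "big") >> (total_bits - k)


def circular_encipher(position: int, key_value: int, n: int) -> int:
    """Shift a table position forward by key_value, wrapping around modulo n."""
    return (position + key_value) % n


def circular_decipher(position: int, key_value: int, n: int) -> int:
    """Undo the shift applied by circular_encipher."""
    return (position - key_value) % n


if __name__ == "__main__":
    configurations = ["00", "01", "10"]          # table of permitted configurations, n = 3
    block = bytes([0b10110011, 0b10000000]) + bytes(14)  # illustrative 128-bit output block
    v = key_value_from_block(block, 9)           # k = 9 bits
    enciphered = circular_encipher(configurations.index("00"), v, len(configurations))
    print(v, configurations[enciphered])
    print(configurations[circular_decipher(enciphered, v, len(configurations))])
```

Because every result is an index into the table, the enciphered data is always one of the permitted configurations, whatever the value of the output block.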
One and the same configuration is consequently used as enciphered data between ⌈2^k/n⌉=⌊2^k/n⌋+1 and ⌊2^k/n⌋ times, thereby involving a bias α in the distribution probability, defined as the maximum deviation in probability for the distribution considered from an infinitely random distribution, i.e. a uniform distribution (where each configuration has a probability 1/n); α is therefore calculated as:
\[ \alpha = \max\left( \left| \frac{\lfloor 2^k/n \rfloor + 1}{2^k} - \frac{1}{n} \right| ,\; \left| \frac{\lfloor 2^k/n \rfloor}{2^k} - \frac{1}{n} \right| \right) \]
where ⌊A⌋ represents the integer part of the number A (i.e. the largest integer less than or equal to A) and |A| the absolute value of the number A.
FIG. 2d illustrates the encipherment solution with n=5, i=1 and k=9 thereby leading to a bias value α≅0.001302.
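The bias can be evaluated numerically. The Python below is a minimal sketch, assuming that the k-bit value shifts the position modulo n as in the circular procedure; the function names `usage_counts` and `bias` are hypothetical, and exact fractions are used to avoid rounding. When n divides 2^k the two counts coincide and the sketch returns zero, since the distribution is then uniform.

```python
from collections import Counter
from fractions import Fraction


def usage_counts(n: int, k: int, start: int = 0) -> list:
    """Count how often each of the n positions is reached over all 2**k key values."""
    counts = Counter((start + v) % n for v in range(2 ** k))
    return [counts[j] for j in range(n)]


def bias(n: int, k: int) -> Fraction:
    """Maximum deviation from the uniform probability 1/n, per the formula for alpha."""
    if (2 ** k) % n == 0:
        return Fraction(0)                 # every configuration is used equally often
    q = (2 ** k) // n                      # integer part of 2**k / n
    uniform = Fraction(1, n)
    return max(abs(Fraction(q + 1, 2 ** k) - uniform),
               abs(Fraction(q, 2 ** k) - uniform))


if __name__ == "__main__":
    for n in (3, 5):
        alpha = bias(n, 9)
        print(n, usage_counts(n, 9), alpha, float(alpha))
```

Evaluated in this way with k=9, the formula gives α=1/768≈0.001302 for n=3 and α=3/2560≈0.001172 for n=5; increasing k reduces the bias in both cases.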
The value of k to be used will be determined by the security level desired for the application, and fixed so as to be known by the sender and the receiver.