1. Field of the Invention
The present invention concerns a method and a device for coding a sequence of images.
More particularly, it concerns the coding of an image sequence comprising at least one image sequence fragment decodable independently of the rest of the image sequence.
2. Description of the Related Art
When a camera captures a sequence of images or video, that sequence of images is coded in a video compression format. The sequence of images is thus converted into a compressed video data stream.
The coding methods used are standard video coding methods, such as that of the H.264/AVC standard, which provide functionalities for accessing and extracting a video fragment in a compressed data stream.
Thus, these coding methods have functionalities making it possible to view an image region of interest over a time interval, that is to say a video fragment. For this, the compressed video data stream must contain spatio-temporal access points enabling the extraction of a part of the compressed data stream, or data sub-stream, containing data corresponding to the image region of interest. These coding methods enable the data stream to be divided into several data sub-streams that are individually decodable.
For example, the H.264/AVC video standard divides the image to code into groups of macroblocks or slice groups. Each slice group is coded so as to be individually decodable.
On coding an image, the coding methods use spatial and temporal predictions.
Spatial prediction enables an image block to be coded predictively on the basis of neighboring image blocks.
Temporal prediction enables an image block to be coded predictively on the basis of at least one reference image.
The spatial and temporal predictions used by the coding methods are known to the person skilled in the art and will not be described in more detail.
The coding methods constrain the spatial and temporal predictions in order for each slice group to be decodable independently of the other slice groups.
FIG. 1 illustrates a sequence of images 1 coded using known coding methods, having a functionality for accessing and extracting a data sub-stream corresponding to an image region of interest over a time interval (video fragment or image sequence fragment). This data sub-stream or video fragment is included in the data stream corresponding to a sequence of images, and is decodable independently of the rest of the data stream.
Each image 1a of the sequence of images 1 is divided into macroblocks 1b. A macroblock 1b is, for example, a block of 16×16 pixels.
This sequence of images 1 has an image region of interest 2 over a time interval. Thus, the image region of interest 2 is represented on each image 1a belonging to a subset of images 3 corresponding to the time interval.
In order to provide the access to that image region of interest 2 over the time interval (video fragment or image sequence fragment), the coding methods, for example the H.264/AVC coding, define, in each image of the subset of images 3, a group of macroblocks or slice group 4 containing the macroblocks 1b having an intersection with the image region of interest 2.
Thus, in this example, each image 1a belonging to the subset of images 3 comprises two slice groups, a first slice group 4 corresponding to the image region of interest 2, and a second slice group 4a corresponding to the rest of the image 1a. 
In this example, slice group 4 corresponding to the image region of interest 2 is of rectangular geometric shape. This geometric shape is thus defined prior to the coding process of each slice group 4, 4a. 
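By way of illustration only, the assignment of macroblocks to the two slice groups described above may be sketched as follows. The function name `slice_group_map`, the `(x, y, w, h)` pixel representation of the region of interest, and the labels 0 (slice group 4) and 1 (slice group 4a) are assumptions made for this sketch; they are not the syntax of the H.264/AVC standard.

```python
MB_SIZE = 16  # macroblock edge in pixels, as in the 16x16 example above


def slice_group_map(width, height, roi, mb_size=MB_SIZE):
    """Label each macroblock: 0 if it intersects the region of interest, else 1.

    roi is a hypothetical (x, y, w, h) rectangle in pixels.
    The result is a list of rows, one label per macroblock.
    """
    x, y, w, h = roi
    cols = (width + mb_size - 1) // mb_size   # macroblocks per row
    rows = (height + mb_size - 1) // mb_size  # macroblock rows
    # Inclusive range of macroblock columns/rows touched by the rectangle.
    first_col, last_col = x // mb_size, (x + w - 1) // mb_size
    first_row, last_row = y // mb_size, (y + h - 1) // mb_size
    return [
        [0 if first_row <= r <= last_row and first_col <= c <= last_col else 1
         for c in range(cols)]
        for r in range(rows)
    ]


# A 64x48 image (4x3 macroblocks) with a 20x20 region of interest at (20, 10).
sg_map = slice_group_map(64, 48, roi=(20, 10, 20, 20))
```

In this sketch the region of interest covers only part of some macroblocks, yet every macroblock having an intersection with it is assigned to the first slice group, so the resulting group is a rectangle aligned on macroblock boundaries.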
Since each slice group 4, 4a must be able to be decoded independently of the other, no spatial prediction is possible between macroblocks 1b belonging to different slice groups 4, 4a.
Consequently, this constraint on the spatial prediction reduces the compression efficiency of certain macroblocks, in particular macroblocks situated on the edge of the image region of interest 2, since certain neighboring macroblocks cannot be used at the time of the coding.
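The effect of this constraint on macroblocks situated on the edge of the region of interest may be sketched as follows. This is a simplified model under stated assumptions: the function name `available_neighbors` is hypothetical, only the left, top, top-left and top-right neighbors are considered, and the additional availability rules of an actual coder (for example decoding order) are ignored.

```python
def available_neighbors(sg_map, row, col):
    """List the neighboring macroblocks usable for spatial prediction.

    sg_map is a list of rows of slice group labels (hypothetical layout).
    A neighbor is usable only if it lies inside the image and belongs
    to the same slice group as the macroblock at (row, col).
    """
    candidates = {
        "left": (row, col - 1),
        "top": (row - 1, col),
        "top_left": (row - 1, col - 1),
        "top_right": (row - 1, col + 1),
    }
    rows, cols = len(sg_map), len(sg_map[0])
    group = sg_map[row][col]
    return [
        name for name, (r, c) in candidates.items()
        if 0 <= r < rows and 0 <= c < cols and sg_map[r][c] == group
    ]


# Label 0 marks the region-of-interest slice group, label 1 the rest of the image.
example_map = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]
corner = available_neighbors(example_map, 1, 1)    # top-left macroblock of the group
interior = available_neighbors(example_map, 2, 2)  # bottom-right macroblock of the group
```

In this example the macroblock at the top-left corner of the region of interest has no usable neighbor at all, although all four of its neighbors lie inside the image, which is the loss of prediction candidates referred to above.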
Furthermore, the temporal prediction of the macroblocks of the slice group 4 corresponding to the image region of interest 2 is only implemented with reference to macroblocks also belonging to the slice group corresponding to the region of interest in the reference image or images.
This constraint on the temporal prediction is also detrimental to the compression efficiency. In particular, if an object of the image is not present in the slice group corresponding to the region of interest in the reference image or images, the macroblocks containing that object must be coded in INTRA mode, that is to say using spatial predictions only. INTRA coding is much more costly in terms of the amount of data necessary for coding the macroblock.
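The constraint on the temporal prediction and the resulting fallback to INTRA coding may be sketched as follows. The names `reference_inside_roi_group` and `choose_mode`, the whole-pixel motion vectors, and the rectangle given in macroblock units are assumptions of this sketch. An actual coder would also compare coding costs and allow for sub-pixel interpolation, which are omitted here.

```python
MB_SIZE = 16  # macroblock edge in pixels


def reference_inside_roi_group(mb_row, mb_col, motion_vector, roi_mb_rect,
                               mb_size=MB_SIZE):
    """Check that the referenced block lies wholly in the ROI slice group.

    motion_vector is (dx, dy) in whole pixels (assumption of this sketch).
    roi_mb_rect is (first_row, first_col, last_row, last_col), inclusive,
    in macroblock units, and is taken to be the same in the reference image.
    """
    dx, dy = motion_vector
    left = mb_col * mb_size + dx
    top = mb_row * mb_size + dy
    right = left + mb_size - 1
    bottom = top + mb_size - 1
    first_row, first_col, last_row, last_col = roi_mb_rect
    return (left >= first_col * mb_size
            and top >= first_row * mb_size
            and right <= (last_col + 1) * mb_size - 1
            and bottom <= (last_row + 1) * mb_size - 1)


def choose_mode(mb_row, mb_col, candidate_vectors, roi_mb_rect):
    """Return ("INTER", vector) for the first permitted vector, else ("INTRA", None)."""
    for mv in candidate_vectors:
        if reference_inside_roi_group(mb_row, mb_col, mv, roi_mb_rect):
            return ("INTER", mv)
    return ("INTRA", None)


# ROI slice group covering macroblock rows 0-1 and columns 1-2.
roi_rect = (0, 1, 1, 2)
kept = choose_mode(0, 1, [(-8, 0), (8, 4)], roi_rect)
fallback = choose_mode(0, 1, [(-8, 0), (40, 0)], roi_rect)
```

In the second call every candidate vector points at least partly outside the slice group in the reference image, so the macroblock is coded in INTRA mode even though a good match may exist elsewhere in that image.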