The SVC standard provides a scalable, or hierarchical, compressed representation of a digital video sequence. It supports scalability along three axes: temporal, spatial and quality. The functionalities of the SVC standard are also intended to include spatial random access.
The SVC compression system constitutes a compatible extension of H264/AVC. Its current version is described in particular in the document by J. Reichel, H. Schwarz, and M. Wien, "Scalable Video Coding - Working Draft 1", Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, Hong Kong, China, January 2005. The SVC system provides the spatial random access functionality via the use of groups of macroblocks of pixels, also termed slice groups. The H264/AVC standard defines several types of slice groups and makes it possible to indicate, for each 16×16 macroblock, the slice group of a given type to which it belongs. In particular, the slice groups of type 2 are defined with the aim of performing video coding by regions of interest. Type 2 designates one or more foreground slice groups, each being a rectangular set of macroblocks, and one background slice group. Consequently, dividing the images of the video sequence into a grid of rectangular slice groups makes it possible to implement the spatial random access function in a compressed video sequence.
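By way of illustration only, the assignment of macroblocks to slice groups of type 2 can be sketched as follows. The function name `slice_group_map` and the representation of each rectangle as inclusive macroblock coordinates are assumptions made for this sketch and are not taken from the standard. The sketch further assumes that, where rectangles overlap, the lower-numbered group takes precedence and that the background group receives the last identifier.

```python
def slice_group_map(width_mbs, height_mbs, foreground_rects):
    """Return a row-major list mapping each macroblock to a slice group id.

    foreground_rects is a list of (left, top, right, bottom) tuples in
    macroblock units, bounds inclusive (hypothetical representation).
    Group i is foreground_rects[i]; the background group gets the last id.
    """
    background = len(foreground_rects)
    mb_map = [background] * (width_mbs * height_mbs)
    # Assign from the last rectangle to the first so that, where rectangles
    # overlap, the lower-numbered group wins.
    for group in range(len(foreground_rects) - 1, -1, -1):
        left, top, right, bottom = foreground_rects[group]
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mb_map[y * width_mbs + x] = group
    return mb_map
```

For an image of 4×2 macroblocks, passing two rectangles that each cover one half of the image yields a grid of two slice groups with an empty background, which is the kind of division that spatial random access relies on.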
However, spatial random access is only made possible by constraining the processes of motion estimation and compensation. The objective is to be able to extract from the compressed video stream a sub-set which corresponds to the spatial region which it is desired to decode and to display. This constraint on the temporal prediction across boundaries of slice groups is indicated in a message of SEI (Supplemental Enhancement Information) type. The constraint consists of limiting the size of the spatial window authorized for the motion vectors of the macroblock partitions of a given slice group. More particularly, given a macroblock to code, the motion estimation consists, for each macroblock partition, of searching in a list of possible reference images for the macroblock (or macroblock partition) which is the most similar to the current partition in terms of the SAD (Sum of Absolute Differences). Since this search is restricted to a spatial region delimited by one or more rectangular slice groups, it may miss the best match and thus potentially leads to sub-optimal coding in terms of rate-distortion.
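The effect of this restriction can be illustrated by a minimal sketch of an exhaustive SAD search over a single reference image. The names `sad` and `constrained_search`, the square block shape, the integer-pixel search and the representation of the authorized region as inclusive pixel bounds are all assumptions made for this sketch. It does not reproduce the search of any particular encoder.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def constrained_search(cur, ref, cx, cy, size, search_range, region):
    """Return (cost, dx, dy) of the best candidate lying inside region.

    region is (left, top, right, bottom) in pixels, bounds inclusive.
    A candidate block is rejected unless it lies entirely inside the
    region. Returns None if no candidate qualifies.
    """
    left, top, right, bottom = region
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that cross the boundary of the region.
            if rx < left or ry < top:
                continue
            if rx + size - 1 > right or ry + size - 1 > bottom:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

When the best match in the reference image lies outside the authorized region, the search returns a candidate with a higher SAD than the unconstrained search would find, which is the source of the rate-distortion penalty described above.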
The present invention mitigates this drawback.