When encoding an image in a video sequence, the image is recursively split, thus creating a plurality of splitting levels. For instance, the image is first divided into slices (or tiles), each slice forming a data structure that can be decoded independently from other slices of the same image, in terms of entropy coding, signal prediction, and residual signal reconstruction. This division defines a slice level.
Then, each slice may be divided into equally sized coding entities of pixels referred to as Coding Tree Blocks (CTBs), thus defining a CTB level. The size of a Coding Tree Block is typically 64 by 64 pixels.
Each Coding Tree Block may then be broken down into a hierarchical tree of smaller blocks whose size may vary and which are the actual blocks of pixels to encode. These smaller blocks to encode are referred to as Coding Units (CUs), thus defining a CU level.
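By way of illustration only, the number of Coding Tree Blocks needed to cover an image can be sketched as follows. The function name `ctb_grid` and the 1920 by 1080 image size are assumptions made for this example; only the 64 by 64 CTB size comes from the description above.

```python
def ctb_grid(width, height, ctb_size=64):
    """Return (columns, rows) of CTBs needed to cover a width x height image.

    Partial CTBs at the right and bottom borders count as full grid cells.
    """
    cols = (width + ctb_size - 1) // ctb_size   # ceiling division
    rows = (height + ctb_size - 1) // ctb_size
    return cols, rows


# Hypothetical 1920x1080 image covered with 64x64 CTBs.
cols, rows = ctb_grid(1920, 1080)
print(cols, rows, cols * rows)
```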
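The hierarchical breakdown of a Coding Tree Block may be pictured with the following minimal sketch, which assumes a quadtree (each block is either kept or split into four equal squares), as used in HEVC. The function `split_ctb`, the callback `should_split` and the minimum size of 8 are hypothetical names and values chosen for the example; a real encoder would base the split decision on a rate-distortion criterion.

```python
def split_ctb(x, y, size, should_split, min_size=8):
    """Recursively split a square block; return the leaf CUs as (x, y, size)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += split_ctb(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]


# Hypothetical decision rule: keep splitting only the block at the top-left corner.
cus = split_ctb(0, 0, 64, lambda x, y, size: x == 0 and y == 0)
for cu in cus:
    print(cu)
```

Whatever the decision rule, the leaf Coding Units tile the Coding Tree Block exactly, so their areas always sum to the area of the CTB.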
The encoding of a particular Coding Unit is typically predictive. This means that a predictor block is first determined. Next, the difference between the predictor block and the Coding Unit is calculated. This difference is called the residue or residual block. Next, this residue is compressed. Usually, the compression of the residue includes a DCT transform followed by a quantization. The Range Extension of HEVC provides other tools, for instance implicit RDPCM, explicit RDPCM, Residual Rotation, Transform Skip, Transform and Quantization bypass, Rice Parameter Adaptation, Cross-Component Decorrelation and Adaptive Residual Colour Transform.
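The predictive scheme described above may be sketched as follows, purely as an illustration. The sketch assumes that the residue is taken as the Coding Unit minus the predictor block, and it replaces the transform and quantization stage by a plain uniform quantizer (no DCT is applied). All function names and sample values are hypothetical.

```python
def compute_residue(block, predictor):
    """Sample-wise difference between the Coding Unit and its predictor block."""
    return [[b - p for b, p in zip(b_row, p_row)]
            for b_row, p_row in zip(block, predictor)]


def quantize(residue, step):
    """Uniform quantization of the residue (lossy when step > 1)."""
    return [[round(v / step) for v in row] for row in residue]


def dequantize(levels, step):
    """Inverse of quantize, up to the quantization error."""
    return [[v * step for v in row] for row in levels]


def reconstruct(predictor, residue):
    """Decoder side: add the (decoded) residue back to the predictor block."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predictor, residue)]


block = [[52, 55], [61, 59]]        # hypothetical 2x2 Coding Unit
predictor = [[50, 50], [60, 60]]    # hypothetical predictor block
res = compute_residue(block, predictor)
decoded = reconstruct(predictor, dequantize(quantize(res, 2), 2))
print(res, decoded)
```

Without quantization the reconstruction is exact; with a step greater than 1, the decoded block only approximates the original Coding Unit.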
In practice, the prediction is performed on one or more Prediction Units (PUs) that split the Coding Unit. Note that the Coding Unit is the basic unit for which a prediction mode is selected or defined. This means that the PU or PUs forming the CU are all predicted using the prediction mode selected for the whole CU.
The actual encoded information of the Coding Unit consists of information indicating how to determine the predictor block, together with the compressed residue. The best predictor blocks are those as similar as possible to the PUs, so that the residue is small and can be compressed efficiently.
The coding mode is defined based on the method used to determine the predictor block for the predictive encoding method of a Coding Unit.
A first main prediction-based coding mode is referred to as INTRA mode. According to INTRA mode, the predictor block is built based on the value of pixels immediately surrounding the Coding Unit within the current image. It is worth noting that the predictor block is not a block of the current image but a construction. A direction is used to determine which pixels of the border are actually used to build the predictor block and how they are used. The idea behind INTRA mode is that, due to the general coherence of natural images, the pixels immediately surrounding the Coding Unit are likely to be similar to pixels of the current Coding Unit. Therefore, it is possible to get a good prediction of the value of pixels of the Coding Unit using a predictor block based on these surrounding pixels.
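As an illustration of how a predictor block may be constructed from surrounding pixels, the following sketch builds a DC-style predictor, in which every sample is set to the rounded average of the neighbouring samples above and to the left of the block. It is a simplified sketch: the function name `dc_predictor` and the sample values are hypothetical, a square block is assumed, and the boundary smoothing that an actual codec may apply is omitted.

```python
def dc_predictor(top, left):
    """Build an n x n predictor filled with the rounded mean of the border pixels.

    top  : the n reconstructed pixels immediately above the block
    left : the n reconstructed pixels immediately to the left of the block
    """
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)   # integer mean with rounding
    return [[dc] * n for _ in range(n)]


# Hypothetical border pixels of a 4x4 block.
pred = dc_predictor(top=[100, 102, 104, 106], left=[98, 100, 102, 104])
print(pred[0])
```

The predictor is thus a construction derived from the border pixels rather than a block copied from the image. The directional modes differ only in which border pixels are used and how they are propagated into the block.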
Conventional INTRA coding defines a plurality of modes: planar mode, DC mode and 33 directional modes (including a horizontal mode and a vertical mode).
Variations of the INTRA coding have been progressively introduced in HEVC. For instance, Intra Block Copy (IBC) coding uses a block predictor taken from the causal area of the current image being reconstructed. Also, the Palette mode has been defined, which does not require a residue to be transmitted to the decoder.
A second main prediction-based coding mode is referred to as INTER mode. According to INTER mode, the predictor block is a block of another image. The idea behind the INTER mode is that successive images in a sequence are generally very similar. The main difference typically comes from motion between these images, due to the scrolling of the camera or to moving objects in the scene. The predictor block is determined by a vector giving its location in a reference image relative to the location of the Coding Unit within the current image. This vector is referred to as a motion vector. The encoding of a Coding Unit using this mode thus comprises motion information, which includes the motion vector, and the compressed residue.
Variations of the INTER coding have been introduced in HEVC. In particular, the Merge mode consists in predicting the whole motion information in order to reduce the transmitted data. In Merge mode, a single predictor index is transmitted in addition to the compressed residue, and the decoder is able to reconstruct the motion information from this predictor index. Another variation of INTER coding is the Skip Merge mode, in which no residue is transmitted in the bitstream.
To find the best coding mode for a current Coding Unit being encoded, each coding mode is evaluated, often many times, since a plurality of “options” may be activated or not, in particular the tools provided by the Range Extension of HEVC.
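The role of the motion vector may be illustrated by the following minimal sketch, which fetches the predictor block from a reference image. It assumes integer-pixel motion vectors and a predictor lying entirely inside the reference image; sub-pixel interpolation and border handling are left out. The function name `inter_predictor` and the sample data are hypothetical.

```python
def inter_predictor(reference, x, y, size, mv):
    """Return the size x size block of `reference` pointed to by the motion vector.

    (x, y) : top-left position of the Coding Unit in the current image
    mv     : (mvx, mvy) displacement, in whole pixels, relative to (x, y)
    """
    mvx, mvy = mv
    rx, ry = x + mvx, y + mvy
    return [row[rx:rx + size] for row in reference[ry:ry + size]]


# Hypothetical 4x4 reference image whose sample at (row, col) is 4 * row + col.
reference = [[4 * r + c for c in range(4)] for r in range(4)]
pred = inter_predictor(reference, x=1, y=1, size=2, mv=(1, -1))
print(pred)
```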
As introduced above, the Adaptive Residual Colour Transform is one of these tools. In short, the Residual Colour Transform (RCT) consists in converting the colour pixel components of the residue from one colour space to another colour space. The current version of HEVC for “screen contents” provides an RGB-to-YCoCg colour transform for the residues. This tool is very efficient at decorrelating the RGB signal, thus yielding Co and Cg residues with very few non-zero values (thus improving the coding rate).
Note that “screen content” stands in contrast to natural content in video sequences. “Screen content” video sequences are particular video sequences having a very specific content, such as that captured from a personal computer or any other device, containing for example text, PowerPoint presentations, Graphical User Interfaces or tables (e.g. screen shots). These particular video sequences have quite different statistics compared to natural video sequences. In video coding, the performance of conventional video coding tools, including HEVC, sometimes proves to be underwhelming when processing such “screen content”.
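For illustration, a colour transform of this kind can be sketched with the widely known lifting-based, reversible YCoCg-R formulation. The choice of this particular variant is an assumption made for the example; the description above only states that an RGB-to-YCoCg transform is applied to the residues. The function names are hypothetical.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward reversible YCoCg-R transform of one residual sample triple."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def ycocg_r_to_rgb(y, co, cg):
    """Inverse transform; undoes the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


# A residual sample with identical R, G and B errors concentrates in Y only.
print(rgb_to_ycocg_r(5, 5, 5))
```

When the three colour components of the residue are strongly correlated, as in this example, the Co and Cg components become zero and only the Y component carries information.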
The Residual Colour Transform is said to be “Adaptive” because the decision to apply it or not to the Coding Units is taken at Coding Unit level. It means that a corresponding flag, known as cu_residual_act_flag in the current version of HEVC, is provided in the bitstream for each Coding Unit, when the video sequence enables RCT.
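A Coding Unit level decision of this kind may be sketched as follows. This is only an illustrative sketch: the cost measure (sum of absolute residual values) is a hypothetical stand-in for the rate-distortion cost an actual encoder would compute, the reversible YCoCg-R variant of the colour transform is assumed, and the function names are invented for the example. Only the flag name cu_residual_act_flag comes from the description above.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward reversible YCoCg-R transform (assumed variant of the RCT)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def abs_cost(samples):
    """Hypothetical cost proxy: sum of absolute values of all residual components."""
    return sum(abs(c) for px in samples for c in px)


def decide_cu_residual_act_flag(rgb_residue):
    """Return 1 if coding the residue in YCoCg looks cheaper than in RGB, else 0."""
    cost_rgb = abs_cost(rgb_residue)
    cost_act = abs_cost([rgb_to_ycocg_r(*px) for px in rgb_residue])
    return 1 if cost_act < cost_rgb else 0


# Hypothetical residues of two Coding Units, as lists of (R, G, B) triples.
correlated_cu = [(4, 4, 4), (-2, -2, -2)]
red_only_cu = [(6, 0, 0)]
print(decide_cu_residual_act_flag(correlated_cu),
      decide_cu_residual_act_flag(red_only_cu))
```

Note that the residue has to be evaluated in both colour spaces before the flag can be set, for every Coding Unit considered.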
The inventors have noticed that the Adaptive Residual Colour Transform is not useful for all the prediction-based coding modes. In addition, it has been observed that determining the value of the cu_residual_act_flag is quite costly at the encoder side. Also, at the decoder side, the presence of the Adaptive RCT increases the decoding complexity since it forms one of the numerous cascaded tools implemented for residual decoding.
The present invention seeks to overcome one or more of the foregoing drawbacks.