A palette in this document is defined as a lookup table whose entries each associate an index with a pixel value. Typically, but not necessarily, the value of a pixel consists of the value of each colour component associated with the pixel, resulting in a colour palette. Alternatively, the value of a pixel may consist of a single pixel component, resulting in a monochrome palette.
This mode of encoding a block of pixels is generally referred to as Palette coding mode. It is contemplated to adopt this mode, for example, in the Range Extension of the High Efficiency Video Coding (HEVC: ISO/IEC 23008-2 MPEG-H Part 2/ITU-T H.265) international standard.
When encoding an image in a video sequence, the image is first divided into coding entities of pixels of equal size referred to as Coding Tree Blocks (CTBs). The size of a Coding Tree Block is typically 64 by 64 pixels. Each Coding Tree Block may then be broken down into a hierarchical tree of smaller blocks, whose size may vary and which are the actual blocks of pixels to encode. These smaller blocks to encode are referred to as Coding Units (CUs).
The encoding of a particular Coding Unit is typically predictive. This means that a predictor block is first determined. Next, the difference between the predictor block and the Coding Unit is calculated. This difference is called the residue. Next, this residue is compressed. The encoded information of the Coding Unit thus consists of the compressed residue together with information indicating how the predictor block is determined. The best predictor blocks are those as similar as possible to the Coding Unit, so that the residue is small and can be compressed efficiently.
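The predictive scheme described above can be sketched as follows. This is only an illustration under stated assumptions: blocks are plain lists of rows of integer sample values, the residue is taken as the Coding Unit minus the predictor (the sign convention is an assumption), and the function names `compute_residue` and `reconstruct` are hypothetical. The compression of the residue is left out.

```python
def compute_residue(coding_unit, predictor):
    """Per-pixel difference between a Coding Unit and its predictor block.

    Assumption: residue = coding_unit - predictor.
    """
    return [
        [cu_px - pred_px for cu_px, pred_px in zip(cu_row, pred_row)]
        for cu_row, pred_row in zip(coding_unit, predictor)
    ]


def reconstruct(predictor, residue):
    """Decoder side: add the residue back onto the predictor block."""
    return [
        [pred_px + res_px for pred_px, res_px in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predictor, residue)
    ]


coding_unit = [[10, 12], [11, 13]]
predictor = [[10, 10], [10, 10]]
residue = compute_residue(coding_unit, predictor)
```

The closer the predictor is to the Coding Unit, the closer the residue values are to zero, which is what makes the residue cheap to compress.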
The coding mode is defined based on the method used to determine the predictor block for the predictive encoding method of a Coding Unit.
A first coding mode is referred to as INTRA mode. According to INTRA mode, the predictor block is built based on the value of pixels immediately surrounding the Coding Unit within the current image. It is worth noting that the predictor block is not a block of the current image but a construction. A direction is used to determine which pixels of the border are actually used to build the predictor block and how they are used. The idea behind INTRA mode is that, due to the general coherence of natural images, the pixels immediately surrounding the Coding Unit are likely to be similar to pixels of the current Coding Unit. Therefore, it is possible to get a good prediction of the value of pixels of the Coding Unit using a predictor block based on these surrounding pixels.
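As a rough illustration of how a direction selects the border pixels used to build the predictor, the sketch below handles only two directions, purely vertical and purely horizontal. It is a simplification, not the prediction process of any standard: the function name `intra_predictor`, the string-valued `direction` parameter, and the absence of any filtering or angular interpolation are all assumptions made for the example.

```python
def intra_predictor(above, left, direction):
    """Build a predictor block from the bordering pixels.

    above: pixels of the row just above the Coding Unit (one per column).
    left:  pixels of the column just left of the Coding Unit (one per row).
    direction: "vertical" or "horizontal" (hypothetical, simplified choice).
    """
    if direction == "vertical":
        # Each column repeats the pixel located above it.
        return [list(above) for _ in left]
    if direction == "horizontal":
        # Each row repeats the pixel located to its left.
        return [[left_px] * len(above) for left_px in left]
    raise ValueError("unsupported direction: " + direction)


vertical = intra_predictor([1, 2, 3], [7, 8], "vertical")
horizontal = intra_predictor([1, 2, 3], [7, 8], "horizontal")
```

In both cases the result is a constructed block rather than a copy of an existing block of the image, as noted above.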
A second coding mode is referred to as INTER mode. According to INTER mode, the predictor block is a block of another image. The idea behind the INTER mode is that successive images in a sequence are generally very similar. The main difference typically comes from motion between these images, due to movement of the camera or to moving objects in the scene. The predictor block is determined by a vector giving its location in a reference image relative to the location of the Coding Unit within the current image. This vector is referred to as a motion vector. According to this mode, the encoding of such a Coding Unit comprises motion information, including the motion vector, and the compressed residue.
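The role of the motion vector can be sketched as below. The example assumes integer-pixel motion vectors and a displaced block that lies entirely inside the reference image; the function name `inter_predictor` and its parameters are hypothetical, and sub-pixel interpolation and border handling are not covered.

```python
def inter_predictor(reference, x, y, width, height, motion_vector):
    """Fetch the predictor block from a reference image.

    (x, y) is the top-left position of the Coding Unit in the current image.
    motion_vector = (dx, dy) is the displacement, in whole pixels, of the
    predictor block relative to that position (assumption: integer-pel only).
    """
    dx, dy = motion_vector
    top, left = y + dy, x + dx
    return [row[left:left + width] for row in reference[top:top + height]]


# 4x4 reference image whose sample at (row, col) is row * 4 + col.
reference = [[row * 4 + col for col in range(4)] for row in range(4)]
block = inter_predictor(reference, x=1, y=1, width=2, height=2,
                        motion_vector=(1, -1))
```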
We focus in this document on a third coding mode called Palette mode. According to a first variant of the Palette mode, it is possible to define a predictor block for a given Coding Unit as a block of indexes from a palette: for each pixel location in the predictor block, the predictor block contains the index of the palette entry whose pixel value is the closest to the value of the pixel having the same location (i.e. colocated) in the Coding Unit. A residue representing the difference between the predictor block and the Coding Unit is then calculated and encoded. Entry indexes in the palette are also known as “levels”.
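A minimal sketch of this first variant is given below. It assumes that pixels are tuples of colour components and that "closest" means smallest squared Euclidean distance; the text does not specify the distance measure, so that choice is an assumption, as are the function names `nearest_level`, `build_level_block` and `palette_residue`.

```python
def nearest_level(pixel, palette):
    """Index (level) of the palette entry closest to the pixel.

    Assumption: closeness is measured by squared Euclidean distance.
    """
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(pixel, entry))
    return min(range(len(palette)), key=lambda i: distance(palette[i]))


def build_level_block(coding_unit, palette):
    """Predictor block of levels, one per colocated pixel of the Coding Unit."""
    return [[nearest_level(px, palette) for px in row] for row in coding_unit]


def palette_residue(coding_unit, levels, palette):
    """Difference between each pixel and the palette value its level points to."""
    return [
        [tuple(a - b for a, b in zip(px, palette[level]))
         for px, level in zip(px_row, level_row)]
        for px_row, level_row in zip(coding_unit, levels)
    ]


palette = [(0, 0, 0), (255, 255, 255), (200, 0, 0)]
coding_unit = [[(1, 2, 0), (250, 250, 255)],
               [(190, 5, 5), (0, 0, 0)]]
levels = build_level_block(coding_unit, palette)
residue = palette_residue(coding_unit, levels, palette)
```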
When using the Palette mode according to this first variant, the predictor block of indexes has to be transmitted in the bitstream. For this transmission, the predictor block of indexes is binary encoded using three syntax elements. A first syntax element, called “Pred mode”, allows distinguishing between two encoding modes. In a first mode, corresponding to a Pred mode having the value 0 (also known as “copy left mode”, “left prediction mode” or “index mode”), the value of the level to be encoded has to be transmitted in the bitstream. In a second mode, corresponding to a Pred mode having the value 1, the value of the level to be encoded is obtained from the level of the pixel located immediately above in the predictor block, so the level does not have to be transmitted.
According to a second variant of the Palette mode, it is also possible to define an index block for predicting a given Coding Unit from a palette: for each pixel location in the Coding Unit, the index block contains the index of the palette entry whose pixel value is representative of the value of the pixel having the same location (i.e. colocated) in the Coding Unit. Additional index values named “Escape values” are also generated if a pixel value cannot be associated with an index value from the palette. This “Escape value” indicates that the corresponding pixel value is directly encoded.
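The second variant can be sketched as follows for a monochrome palette. The text does not state when a pixel "cannot be associated" with a palette entry, so the example assumes a hypothetical tolerance `max_error`: a pixel farther than this from every entry is escaped. Using the index just past the last palette entry as the escape marker is likewise an assumption, as is the function name `build_index_block`.

```python
def build_index_block(coding_unit, palette, max_error):
    """Return (index_block, escaped_pixels) for a monochrome Coding Unit.

    Assumptions: a pixel is escaped when no palette entry lies within
    max_error of it, and the escape marker is the index len(palette).
    """
    escape_index = len(palette)
    index_block, escaped_pixels = [], []
    for row in coding_unit:
        index_row = []
        for pixel in row:
            best = min(range(len(palette)),
                       key=lambda i: abs(pixel - palette[i]))
            if abs(pixel - palette[best]) > max_error:
                index_row.append(escape_index)
                escaped_pixels.append(pixel)  # value is coded directly
            else:
                index_row.append(best)
        index_block.append(index_row)
    return index_block, escaped_pixels


palette = [10, 50, 200]
coding_unit = [[11, 52], [120, 199]]
index_block, escaped_pixels = build_index_block(coding_unit, palette,
                                                max_error=4)
```

Here the pixel 120 is not close to any entry, so it receives the escape marker and its value is kept for direct coding.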
According to this second variant, the index block and the escape values are transmitted in the bitstream with the palette. The same syntax elements as mentioned for the first variant are used.
It is worth noting that while the block of indexes is not strictly speaking a part of an image, the word “pixel” is used to refer to an element of this block of levels by analogy.
A second syntax element, called “Level”, is defined for the transmission of the value of a level in the first mode. The third syntax element, called “Run”, is used to encode a repetition value. Considering that the predictor block is scanned from the top left corner to the bottom right corner, row by row from left to right and top to bottom (i.e. the raster scan order), the Run syntax element gives the number of successive pixels in the predictor block having the same encoding. If the Pred mode is 0, this is the number of successive pixels of the predictor block having the same level value. If the Pred mode is 1, this is the number of successive pixels of the predictor block having a level value corresponding to the level value of the above pixel.
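The interplay of the three syntax elements can be sketched as below. Several choices are assumptions made for the example and are not taken from the text: the block of levels is given as a flat list in raster scan order, Run counts every pixel covered by the element including the first one, the encoder greedily picks whichever mode yields the longer run, and each element is represented as a tuple (Pred mode, Level, Run) rather than as binary codes. The function names `encode_levels` and `decode_levels` are hypothetical.

```python
def encode_levels(levels, width):
    """Turn a raster-scanned block of levels into (pred_mode, level, run) tuples."""
    elements = []
    i, n = 0, len(levels)
    while i < n:
        # Pred mode 1: count pixels equal to the pixel directly above.
        above_run = 0
        while (i + above_run < n and i + above_run >= width
               and levels[i + above_run] == levels[i + above_run - width]):
            above_run += 1
        # Pred mode 0: count pixels sharing the level of the current pixel.
        index_run = 1
        while i + index_run < n and levels[i + index_run] == levels[i]:
            index_run += 1
        if above_run > index_run:
            elements.append((1, None, above_run))  # no Level transmitted
            i += above_run
        else:
            elements.append((0, levels[i], index_run))
            i += index_run
    return elements


def decode_levels(elements, width):
    """Rebuild the raster-scanned block of levels from the syntax elements."""
    levels = []
    for pred_mode, level, run in elements:
        for _ in range(run):
            if pred_mode == 0:
                levels.append(level)
            else:
                levels.append(levels[len(levels) - width])
    return levels


# 4-pixel-wide block, three rows, in raster scan order.
block = [1, 1, 1, 2,
         1, 1, 1, 2,
         3, 3, 3, 3]
elements = encode_levels(block, width=4)
```

With this block, the second row is covered by a single Pred mode 1 element because every pixel in it matches the pixel above, so no Level needs to be sent for that row.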