Technologies for coding non-camera-captured content videos, or screen content videos, have attracted great interest recently due to the rapid growth of application areas such as wireless displays, remote computer desktop access, real-time screen sharing for videoconferencing, and cloud gaming. Compared to camera-captured content videos, which contain rich colors and complex patterns, screen content videos contain a significant portion of computer-rendered graphics and text, with fewer colors and repeated textual patterns.
For example, in a screen content image with text, a coding block typically contains only the foreground text color and the background color. The random patterns of text characters and letters can make it difficult for the current coding block to find a matching block in the same picture or in previously coded pictures. It may also be difficult to use directional local intra prediction for efficient compression in this circumstance. Traditional intra and inter video coding tools were designed primarily for camera-captured content videos, and screen content videos have significantly different characteristics from camera-captured content videos, so these traditional tools are less effective for screen content videos. This creates an urgent need for efficient coding of screen content videos.
In response to market demand, the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) have jointly launched a new standardization project, namely the High Efficiency Video Coding (HEVC) extensions on screen content coding (SCC). Several new video coding tools, including palette coding, have been developed and adopted into the HEVC SCC draft standard to efficiently encode and decode screen content videos.
Palette coding is a major-color-based prediction method. Unlike traditional intra and inter prediction, which mainly remove redundancy between different coding units, palette coding targets the redundancy of repetitive pixel values and patterns within a coding unit. To reduce the overhead of transmitting the original values of the major colors, palette prediction was introduced in palette coding. In the current palette coding mode, all pixels of a coding block are analyzed and classified into a list of major colors, except for rarely used pixels that cannot be classified into any of the major colors; these are classified as escape colors. Each major color is a representative color that occurs with high frequency in the coding block. For each palette-coded coding unit (CU), a color index table, i.e., a palette, is formed, with each index entry associated with one major color. All the pixels in the CU are converted into corresponding indices, except the escape pixels with the escape colors. FIG. 1 illustrates a simplified version of the palette coding process.
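The classification described above can be illustrated with a minimal Python sketch. It is not the procedure of the HEVC SCC draft: it assumes exact (lossless) color matching, and the function name build_palette and the parameters max_palette_size and min_count are hypothetical, chosen only for illustration.

```python
from collections import Counter


def build_palette(pixels, max_palette_size=4, min_count=2):
    """Classify the pixels of one CU into major colors and escape pixels.

    pixels: list of color tuples, e.g. (Y, Cb, Cr), in scan order.
    max_palette_size, min_count: hypothetical tuning parameters.
    Returns (palette, indices). indices[i] is the palette index of pixel i,
    or len(palette) if the pixel is an escape pixel.
    """
    counts = Counter(pixels)
    # Major colors: the most frequent colors that occur at least min_count times.
    palette = [color for color, n in counts.most_common(max_palette_size)
               if n >= min_count]
    escape_index = len(palette)  # assumption: one index past the last entry
    lookup = {color: i for i, color in enumerate(palette)}
    indices = [lookup.get(p, escape_index) for p in pixels]
    return palette, indices
```

For a text-like block with a frequent background color, a less frequent foreground color, and a single stray pixel, the sketch yields a two-entry palette and marks the stray pixel as an escape pixel.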
The encoder then checks whether each of the index entries representing the major colors in the current CU matches any of the major colors in the current palette predictor. For each entry in the palette predictor, a flag (1: used; 0: not used) is sent to signal whether or not the entry is used in the current palette. If it is used, the entry is placed at the front of the current palette. The flags corresponding to the entries of the current palette predictor are thus sent to signal which of the major colors in the current palette predictor are used in the current CU. For the entries that are in the current palette but not in the palette predictor, their number and their pixel values (e.g., Y/Cb/Cr or R/G/B) are signaled, and these signaled new entries are placed at the end of the current palette. An example of the palette coding mode at the encoder side is illustrated in FIG. 2. The current palette size is then calculated as the number of reused palette entries plus the number of signaled new palette entries.
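The encoder-side steps above can be sketched as follows. This is an illustrative sketch only: the function name encode_palette is hypothetical, the entries are assumed to be unique color tuples, and entropy coding of the flags and values is omitted.

```python
def encode_palette(major_colors, predictor):
    """Derive what the encoder signals for the palette of one CU.

    major_colors: major colors of the current CU (unique color tuples).
    predictor: current palette predictor (unique color tuples, in order).
    Returns (reuse_flags, new_entries, current_palette).
    """
    wanted = set(major_colors)
    # One flag per predictor entry: 1 = used in the current palette, 0 = not used.
    reuse_flags = [1 if entry in wanted else 0 for entry in predictor]
    # Reused entries go to the front of the current palette, in predictor order.
    reused = [entry for entry, flag in zip(predictor, reuse_flags) if flag]
    # Entries not found in the predictor are signaled explicitly and appended.
    in_predictor = set(predictor)
    new_entries = [c for c in major_colors if c not in in_predictor]
    current_palette = reused + new_entries
    return reuse_flags, new_entries, current_palette
```

With a predictor of three entries of which two are reused, plus one new color, the resulting palette has 2 + 1 = 3 entries, matching the size calculation described above.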
At the decoder side, the decoder receives the flags indicating which of the major colors in the current palette predictor are used in the current CU. The decoder walks through the current palette predictor in order, using the flags to determine which of its major colors are used in the current CU. The decoder also receives the pixel values (e.g., Y/Cb/Cr or R/G/B) of the new palette entries that are not in the current palette predictor. The decoder then generates a received palette for the CU in which the index entries corresponding to the used major colors (those with the flag set to 1) in the current palette predictor come first, followed by the index entries corresponding to the new major colors not in the current palette predictor. An example of the palette coding mode at the decoder side is illustrated in FIG. 3.
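The decoder-side reconstruction can be sketched in the same way. The function name decode_palette is hypothetical, and the flags and new entries are assumed to have already been parsed from the bitstream.

```python
def decode_palette(reuse_flags, new_entries, predictor):
    """Rebuild the received palette of one CU from the signaled information.

    reuse_flags: one flag per predictor entry (1 = used, 0 = not used).
    new_entries: explicitly signaled colors that are not in the predictor.
    predictor: current palette predictor, identical to the encoder's copy.
    """
    if len(reuse_flags) != len(predictor):
        raise ValueError("expected exactly one reuse flag per predictor entry")
    # Walk the predictor in order and keep the entries whose flag is 1.
    reused = [entry for entry, flag in zip(predictor, reuse_flags) if flag == 1]
    # New entries follow the reused entries.
    return reused + list(new_entries)
```

Because the decoder applies the flags to the same predictor in the same order, it arrives at the same palette, with the same index assignment, as the encoder.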
After the current CU is palette coded, or the current palette-coded CU is decoded, the palette predictor is updated for the next CU or palette-coded CU using the information of the current palette. The entries (including the new entries) of the current or received palette are placed at the front of the new palette predictor, followed by the unused entries from the previous palette predictor. The new palette predictor size is then calculated as the size of the current palette plus the number of unused palette entries. An example of the update of the palette predictor is illustrated in FIG. 4.
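The update rule above can be sketched as follows. The function name update_predictor is hypothetical, and entries are again assumed to be unique color tuples.

```python
def update_predictor(current_palette, previous_predictor):
    """Build the palette predictor for the next CU.

    Entries of the current palette come first, followed by the entries of
    the previous predictor that the current palette did not use.
    """
    in_palette = set(current_palette)
    unused = [entry for entry in previous_predictor if entry not in in_palette]
    # No size cap is applied in this sketch, as in the design described here.
    return list(current_palette) + unused
```

For a current palette of three entries and a previous predictor with one unused entry, the new predictor has 3 + 1 = 4 entries.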
However, in the current design, the maximum size of the palette predictor can be set to an arbitrary positive number. The lack of a limit on the palette predictor maximum size causes problems for implementation and for the update process of the palette predictor, because a hardware implementation of palette prediction may require the decoder to provide a buffer of unbounded size, which is infeasible with current technology. In addition, the current palette coding mode allows only one fixed maximum size of the palette predictor, signaled at the sequence parameter set (SPS) level, regardless of the complexity of the video or the coding quality. A fixed palette predictor maximum size makes palette coding inefficient and ineffective, because it may not suit all coding conditions and coding quality requirements.