High Efficiency Video Coding (HEVC) is a new coding standard that has been developed in recent years. In the HEVC system, the fixed-size macroblock of H.264/AVC is replaced by a flexible block, named coding unit (CU). Pixels in the CU share the same coding parameters to improve coding efficiency. A CU may begin with a largest CU (LCU), which is also referred to as a coding tree unit (CTU) in HEVC. In addition to the concept of the coding unit, the concept of the prediction unit (PU) is also introduced in HEVC. Once the splitting of the CU hierarchical tree is done, each leaf CU is further split into one or more prediction units (PUs) according to prediction type and PU partition.
Along with the development of the HEVC standard, the development of HEVC extensions has also started. The HEVC extensions include range extensions (RExt), which target non-4:2:0 color formats, such as 4:2:2 and 4:4:4, and higher bit-depth video, such as 12, 14 and 16 bits per sample. One of the likely applications utilizing RExt is screen sharing over a wired or wireless connection. Due to the specific characteristics of screen content, coding tools have been developed that demonstrate significant gains in coding efficiency. Among them, palette coding (a.k.a. major color based coding) techniques represent a block of pixels using indices into a palette (major colors), and encode the palette and the indices by exploiting spatial redundancy. While the total number of possible color combinations is huge, the number of colors in an area of a picture is usually very limited for typical screen content. Therefore, palette coding becomes very effective for screen content materials.
During the early development of HEVC range extensions (RExt), several proposals were disclosed to address palette-based coding. For example, a palette prediction and sharing technique is disclosed in JCTVC-N0247 (Guo et al., “RCE3: Results of Test 3.1 on Palette Mode for Screen Content Coding”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 14th Meeting: Vienna, AT, 25 Jul.-2 Aug. 2013, Document: JCTVC-N0247). In JCTVC-N0247, the palette of each color component is constructed and transmitted. The palette can be predicted (or shared) from the left neighboring CU to reduce the bitrate.
Palette Coding
An improved palette prediction and sharing technique is disclosed in JCTVC-O0218 (Guo et al., “Evaluation of Palette Mode Coding on HM-12.0+RExt-4.1”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 15th Meeting: Geneva, CH, 23 Oct.-1 Nov. 2013, Document: JCTVC-O0218). In JCTVC-O0218, the encoding process is shown as follows.
    1. Transmission of the palette: the palette size (number of colors in the palette) is first transmitted, followed by the palette elements (the color values).
    2. Transmission of pixel palette index values (indices pointing to the colors in the palette): the index values for the pixels in the CU are encoded in a raster scan order. For each position, a flag is first transmitted to indicate whether the “run mode” or “copy above mode” is being used.
        2.1 “Run mode”: In “run mode”, a palette index is first signaled followed by “palette_run” (e.g., M). No further information needs to be transmitted for the current position and the following M positions as they have the same palette index as signaled. The palette index (e.g., i) is shared by all three color components, which means that the reconstructed pixel values are (Y, U, V)=(paletteY[i], paletteU[i], paletteV[i]) (assuming the color space is YUV).
        2.2 “Copy above mode”: In “copy above mode”, a value “copy_run” (e.g., N) is transmitted to indicate that for the following N positions (including the current one), the palette indices are equal to the palette indices of the ones that are at the same positions in the row above.
    3. Transmission of residue: the palette indices transmitted in Stage 2 are converted back to color values and used as the predictor. Residue information is transmitted using HEVC residue coding and is added to the prediction for the reconstruction.
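The index-map stage (Stage 2) and the conversion of indices to predictor values (Stage 3) can be illustrated with a minimal decoder-side sketch. It assumes the flags and run values have already been parsed from the bitstream into a list of tuples; the tuple layout, the function names and the sample palettes are hypothetical and are not taken from JCTVC-O0218.

```python
def decode_palette_indices(elements, width, height):
    """Rebuild a CU's palette index map from already-parsed run elements.

    elements: ("run", index, M) or ("copy_above", N) tuples in raster-scan
    order (a hypothetical in-memory form of the signaled syntax).
    """
    total = width * height
    indices = []
    for element in elements:
        if element[0] == "run":
            _, index, m = element
            # Run mode: the current position and the following M positions
            # all take the signaled palette index.
            indices.extend([index] * (m + 1))
        elif element[0] == "copy_above":
            _, n = element
            # Copy above mode: N positions, including the current one,
            # copy the index found at the same column in the row above.
            for _ in range(n):
                pos = len(indices)
                if pos < width:
                    raise ValueError("no row above the first row to copy from")
                indices.append(indices[pos - width])
        else:
            raise ValueError("unknown mode: %r" % (element[0],))
        if len(indices) > total:
            raise ValueError("runs exceed the CU size")
    if len(indices) != total:
        raise ValueError("runs do not cover the whole CU")
    return [indices[r * width:(r + 1) * width] for r in range(height)]


def indices_to_predictor(index_map, palette_y, palette_u, palette_v):
    """Map each shared index i to (paletteY[i], paletteU[i], paletteV[i])."""
    return [[(palette_y[i], palette_u[i], palette_v[i]) for i in row]
            for row in index_map]


# Example: a 4x2 CU whose second row repeats the first row.
elements = [("run", 0, 1), ("run", 2, 1), ("copy_above", 4)]
index_map = decode_palette_indices(elements, width=4, height=2)
predictor = indices_to_predictor(
    index_map, [16, 80, 200], [128, 90, 60], [128, 240, 30])
```

The residue decoded in Stage 3 would then be added to `predictor` sample by sample; that step is omitted here because it relies on regular HEVC residue coding.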
Major-Color-Based (or Palette) Coding
Another palette coding technique is disclosed in JCTVC-O0182 (Guo et al., “AHG8: Major-color-based screen content coding”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 15th Meeting: Geneva, CH, 23 Oct.-1 Nov. 2013, Document: JCTVC-O0182). In JCTVC-O0182, instead of predicting the entire palette from the left CU, each individual palette color entry can be predicted from the exactly corresponding palette color entry in the above CU or left CU. In other words, JCTVC-O0182 discloses an element-by-element palette prediction. Three types of line modes are used for predicting each index line, i.e. horizontal mode, vertical mode and normal mode. In the horizontal mode, all the indices in the same line have the same value. If the value is the same as the first pixel of the above pixel line, only the line mode signaling bits are transmitted. Otherwise, the index value is also transmitted. In the vertical mode, the current index line is the same as the above index line. Therefore, only the line mode signaling bits are transmitted. In the normal mode, indices in a line are predicted individually. For each index position, the left or above neighbor is used as the predictor, and the prediction symbol is transmitted to the decoder.
Furthermore, JCTVC-O0182 discloses a technique that classifies pixels into major color pixels (with palette indices pointing to the palette colors) and escape pixels. For a major color pixel, the decoder reconstructs the pixel value according to the major color index (also referred to as the palette index) and the palette. For an escape pixel, the encoder further sends the pixel value.
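As a rough illustration of the horizontal and vertical line modes, the sketch below rebuilds one index line from the index line above it. The function name and arguments are hypothetical. The normal mode is deliberately left out, because the description above does not specify the form of the prediction symbols.

```python
def decode_index_line(mode, above_line, explicit_index=None):
    """Rebuild one index line from its line mode (illustrative only).

    above_line: indices of the line directly above the current one.
    explicit_index: the index sent in horizontal mode when it differs from
    the first index of the above line; None when only mode bits were sent.
    """
    if mode == "vertical":
        # The current line is identical to the above line.
        return list(above_line)
    if mode == "horizontal":
        # All indices in the line share one value.
        value = above_line[0] if explicit_index is None else explicit_index
        return [value] * len(above_line)
    raise NotImplementedError("normal mode is not covered by this sketch")


above = [1, 2, 3, 3]
same_as_above = decode_index_line("vertical", above)
flat_inferred = decode_index_line("horizontal", above)
flat_explicit = decode_index_line("horizontal", above, explicit_index=4)
```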
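The split between major color pixels and escape pixels can be sketched as follows. The use of a sentinel value `ESCAPE` to mark escape pixels, and the delivery of escape values as a separate list in scan order, are assumptions made only for this illustration. The source does not state how escape pixels are flagged.

```python
ESCAPE = -1  # hypothetical marker for an escape pixel in the index list


def reconstruct_pixels(indices, palette, escape_values):
    """Rebuild pixel values from palette indices and explicit escape values."""
    escapes = iter(escape_values)
    pixels = []
    for idx in indices:
        if idx == ESCAPE:
            # Escape pixel: the value itself was sent by the encoder.
            pixels.append(next(escapes))
        else:
            # Major color pixel: look the value up in the palette.
            pixels.append(palette[idx])
    return pixels


palette = [(16, 128, 128), (235, 128, 128)]
pixels = reconstruct_pixels([0, 1, ESCAPE, 0], palette, [(90, 54, 201)])
```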
Signaling of Palette Table
In the reference software of the screen content coding (SCC) standard, SCM-2.0, an improved palette scheme disclosed in JCTVC-R0348 (Onno, et al., “Suggested combined software and text for run-based palette mode”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 18th Meeting: Sapporo, JP, July 2014, Document No.: JCTVC-R0348) is integrated. The palette table of the previous palette-coded CU is used as a predictor for current palette table coding. In palette table coding, the current palette table is signaled by choosing which palette colors in the previously coded palette table (palette predictor) are reused, or by transmitting new palette colors. The size of the current palette is set as the size of the predicted palette (i.e., numPredPreviousPalette) plus the size of the transmitted palette (i.e., num_signalled_palette_entries). The predicted palette is a palette derived from the previously reconstructed palette-coded CUs. When coding the current CU in palette mode, those palette colors that are not predicted using the predicted palette are directly transmitted in the bitstream (i.e., signaled entries).
An example of palette updating is shown as follows. In this example, the current CU is coded as palette mode with a palette size equal to six. Three of the six major colors are predicted from the palette predictor (numPredPreviousPalette=3) and three are directly transmitted through the bitstream. The transmitted three colors can be signaled using the exemplary syntax shown below.
    num_signalled_palette_entries=3
    for(cIdx=0; cIdx<3; cIdx++) //signal colors for different components
        for(i=0; i<num_signalled_palette_entries; i++)
            palette_entries[cIdx][numPredPreviousPalette+i]
Since the palette size is six in this example, the palette indices from 0 to 5 are used to indicate the major color entries in the palette color table. The three predicted palette colors are represented with indices 0 to 2. Accordingly, three new palette entries are transmitted for indices 3 through 5.
In SCM-2.0, if wavefront parallel processing (WPP) is not applied, the palette predictor table is initialized (reset) at the beginning of each slice or at the beginning of each tile. If WPP is applied, the last coded palette table is initialized (reset) not only at the beginning of each slice or each tile, but also at the beginning of each CTU row.
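The six-entry example can be sketched on the decoder side as follows. The predictor contents, the reuse flags and the color values are hypothetical. The signaled values are assumed to arrive component by component, mirroring the loop order of the exemplary syntax, and the function name is illustrative rather than part of SCM-2.0.

```python
def build_current_palette(predictor, reuse_flags, signalled_values,
                          num_components=3):
    """Assemble palette_entries[cIdx][n] from reused and signaled colors.

    predictor: list of color tuples from the palette predictor.
    reuse_flags: one flag per predictor entry; True means the entry is reused.
    signalled_values: flat list of new component values, all values of
    component 0 first, then component 1, and so on.
    """
    predicted = [color for color, reused in zip(predictor, reuse_flags)
                 if reused]
    num_pred = len(predicted)  # plays the role of numPredPreviousPalette
    if len(signalled_values) % num_components:
        raise ValueError("incomplete signaled palette entries")
    num_signalled = len(signalled_values) // num_components

    # Reused colors occupy indices 0 .. num_pred - 1.
    palette_entries = [[color[c] for color in predicted] + [None] * num_signalled
                       for c in range(num_components)]
    # New colors follow at indices num_pred .. num_pred + num_signalled - 1.
    pos = 0
    for c_idx in range(num_components):
        for i in range(num_signalled):
            palette_entries[c_idx][num_pred + i] = signalled_values[pos]
            pos += 1
    return palette_entries


predictor = [(10, 20, 30), (40, 50, 60), (70, 80, 90), (100, 110, 120)]
reuse_flags = [True, False, True, True]       # three colors are predicted
signalled = [1, 2, 3, 4, 5, 6, 7, 8, 9]       # three new colors, per component
palette_entries = build_current_palette(predictor, reuse_flags, signalled)
```

With these inputs each component array holds six values: indices 0 to 2 carry the reused predictor colors and indices 3 to 5 carry the newly signaled ones.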
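The reset rule above reduces to a small predicate. The function and parameter names below are hypothetical and only restate the rule as described; they are not taken from the SCM-2.0 source code.

```python
def must_reset_palette_predictor(starts_slice, starts_tile, starts_ctu_row,
                                 wpp_enabled):
    """Return True when the palette predictor is initialized (reset)."""
    if starts_slice or starts_tile:
        # Reset at slice and tile boundaries, with or without WPP.
        return True
    # With WPP, every CTU row additionally starts from a reset predictor.
    return wpp_enabled and starts_ctu_row


reset_without_wpp = must_reset_palette_predictor(False, False, True, False)
reset_with_wpp = must_reset_palette_predictor(False, False, True, True)
```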
Wavefront Parallel Processing (WPP)
In HEVC, WPP is supported, where each row of Coding Tree Units (CTUs) can be processed in parallel as sub-streams by multiple encoding or decoding threads. In order to limit the degradation of coding efficiency, a wavefront pattern of processing order ensures that dependencies on spatial neighbors are not changed. On the other hand, at the start of each CTU row, the CABAC states are initialized based on the CABAC states of the synchronization point in the upper CTU row. For example, the synchronization point can be the last CU of the second CTU of the upper CTU row as shown in FIG. 1, where the parallel processing is applied to CTU rows. Furthermore, it is assumed in this example that the palette coding of each current CTU (marked as “X” in FIG. 1) depends on its left, above-left, above and above-right CTUs. For the top CTU row, the palette processing is dependent on the left CTU only. Moreover, the CABAC engine is flushed at the end of each CTU row and byte alignment is enforced at the end of each sub-stream. The entry points of WPP sub-streams are signaled as byte offsets in the slice header of the slice that contains the wavefront.
In FIG. 1, each block stands for one CTU and there are four CTU rows in a picture. Each CTU row forms a wavefront sub-stream that can be processed independently by an encoding or a decoding thread. The “X” symbols represent the current CTUs under processing by the multiple threads. Since a current CTU has a dependency on the above-right CTU, the processing of the current CTU has to wait for the completion of the above-right CTU. Therefore, there must be a delay of two CTUs between the processing threads of neighboring CTU rows so that the data dependency (e.g. spatial pixels and MVs) can be preserved. In addition, the CABAC states of the first CTU of each CTU row are initialized with the states obtained after the second CTU of the upper CTU row is processed. For example, the first CU (indicated by “p1”) of the first CTU in the second CTU row is initialized after the last CU (indicated by “p2”) in the second CTU of the above CTU row is processed. The dependency is indicated by a curved arrow line pointing from “p1” to “p2”. Similar dependencies for the first CTU of each CTU row are indicated by the curved arrows. This allows for a quicker learning of the probabilities along the first column of CTUs than using the slice initialization states for each CTU row. Since the second CTU of the upper CTU row is always available to the current CTU row, parallel processing can be achieved using this wavefront structure. For each current CTU, the processing depends on the left CTU. Therefore, it has to wait until the last CU of the left CTU is processed. As shown in FIG. 1, a first CU (indicated by “p3”) in a current CTU has to wait for the last CU (indicated by “p4”) of the left CTU to finish. Again, the dependency is indicated by a curved arrow line pointing from “p3” to “p4”. Similar dependencies on the left CTU are indicated by curved arrows for the CTUs being processed (indicated by “X”).
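The two-CTU delay between neighboring rows can be checked with a small scheduling sketch. It assumes, purely for illustration, that every CTU takes one time unit, that one thread is available per CTU row, and that the last CTU of a row (which has no above-right neighbor) waits for the CTU directly above it. The function name is hypothetical.

```python
def wavefront_start_times(num_rows, num_cols):
    """Earliest start time of each CTU under the wavefront dependencies."""
    start = [[0] * num_cols for _ in range(num_rows)]
    for r in range(num_rows):
        for c in range(num_cols):
            ready = 0
            if c > 0:
                # Wait for the left CTU in the same row to finish.
                ready = max(ready, start[r][c - 1] + 1)
            if r > 0:
                # Wait for the above-right CTU (clamped at the row end).
                dep_col = min(c + 1, num_cols - 1)
                ready = max(ready, start[r - 1][dep_col] + 1)
            start[r][c] = ready
    return start


# Four CTU rows as in FIG. 1; the row width of six CTUs is an assumption.
schedule = wavefront_start_times(num_rows=4, num_cols=6)
```

Under these assumptions each row starts two time units after the row above it, so the CTU at row r and column c starts at time c + 2r.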
Palette Stuffing
In the reference software of the screen content coding (SCC) standard, SCM-2.0, the palette information is predictively coded. The palette predictor of the current CU is generated by stuffing the palette predictor of the previous CU into the palette of the previous CU. FIG. 2 illustrates an exemplary palette prediction. The maximum palette predictor size (i.e., the number of colors in the palette predictor) is 7. FIG. 2 shows that the palette predictor, the reuse flags (indicating which colors from the palette predictor are reused for the current CU's palette), and the palette of the current CU on the left side are used to generate the palette predictor for the next CU on the right side, by stuffing the palette predictor of the current CU into the palette of the current CU. The current palette consists of 3 entries corresponding to C3, C5 and C8. The unused entries of the palette predictor are stuffed one by one after the last entry in the current palette, until the maximum palette predictor size is reached, to form the next palette predictor.
According to the current practice, the palette updating process is performed for every palette-coded CU. It is desirable to develop methods for reducing the complexity or memory associated with palette coding without noticeable performance impact.
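The stuffing step can be sketched as follows. Since FIG. 2 is not reproduced here, the contents of the current palette predictor (C1 to C7) and the reuse flags are assumed for illustration; only the current palette (C3, C5, C8) and the maximum predictor size of 7 come from the description above. The function name is hypothetical.

```python
def next_palette_predictor(current_palette, predictor, reuse_flags, max_size):
    """Form the predictor for the next CU by palette stuffing."""
    # The current palette comes first, in its existing order.
    stuffed = list(current_palette)
    # Unused predictor entries are appended one by one until the limit.
    for color, reused in zip(predictor, reuse_flags):
        if len(stuffed) >= max_size:
            break
        if not reused:
            stuffed.append(color)
    return stuffed


predictor = ["C1", "C2", "C3", "C4", "C5", "C6", "C7"]       # assumed contents
reuse_flags = [False, False, True, False, True, False, False]  # C3, C5 reused
current_palette = ["C3", "C5", "C8"]                          # C8 is newly sent
stuffed = next_palette_predictor(current_palette, predictor, reuse_flags, 7)
```

With these assumed inputs the unused entry C7 no longer fits once the limit of 7 is reached, which shows how older colors drop out of the predictor over time.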