High Efficiency Video Coding (HEVC) is a new coding standard that has been developed in recent years. In HEVC, the fixed-size macroblock of H.264/AVC is replaced by a flexible block, named coding unit (CU). Pixels in the CU share the same coding parameters to improve coding efficiency. A CU may begin with a largest CU (LCU), which is also referred to as a coding tree unit (CTU) in HEVC. In addition to the concept of coding unit, the concept of prediction unit (PU) is also introduced in HEVC. Once the splitting of the CU hierarchical tree is done, each leaf CU is further split into one or more prediction units (PUs) according to prediction type and PU partition. Several coding tools for screen content coding have been developed. The tools related to the present invention are briefly reviewed as follows.
Palette Coding
During the development of HEVC range extensions (RExt), several proposals have been disclosed to address palette-based coding. For example, a palette prediction and sharing technique is disclosed in JCTVC-N0247 (Guo et al., “RCE3: Results of Test 3.1 on Palette Mode for Screen Content Coding”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 14th Meeting: Vienna, AT, 25 Jul.-2 Aug. 2013 Document: JCTVC-N0247) and JCTVC-O0218 (Guo et al., “Evaluation of Palette Mode Coding on HM-12.0+RExt-4.1”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 15th Meeting: Geneva, CH, 23 Oct.-1 Nov. 2013, Document: JCTVC-O0218). In JCTVC-N0247 and JCTVC-O0218, the palette of each color component is constructed and transmitted. The palette can be predicted (or shared) from its left neighboring CU to reduce the bitrate. All pixels within the given block are then coded using their palette indices. An example of the encoding process according to JCTVC-N0247 is shown as follows.
1. Transmission of the palette: the color index table (also called palette table) size is first transmitted, followed by the palette elements (i.e., color values).
2. Transmission of pixel values: the pixels in the CU are encoded in a raster scan order. For each group of one or more pixels, a flag for a run-based mode is first transmitted to indicate whether the “copy index mode” or “copy above mode” is being used.
2.1 “Copy index mode”: In the copy index mode, a palette index is first signaled, followed by “palette_run” (e.g., M) representing the run value. The term palette_run may also be referred to as pixel_run in this disclosure. The run value indicates that a total of M samples are all coded using copy index mode. No further information needs to be transmitted for the current position and the following M positions since they have the same palette index as that signaled in the bitstream. The palette index (e.g., i) may also be shared by all three color components, which means that the reconstructed pixel values are (Y, U, V)=(paletteY[i], paletteU[i], paletteV[i]) for the case of YUV color space.
2.2 “Copy above mode”: In the copy above mode, a value “copy_run” (e.g., N) is transmitted to indicate that for the following N positions (including the current one), the palette index is the same as the corresponding palette index in the row above.
3. Transmission of residue: the palette indices transmitted in Stage 2 are converted back to pixel values and used as the prediction. Residue information is transmitted using HEVC residual coding and is added to the prediction for the reconstruction.
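The run-based index coding of Stage 2 can be illustrated with a minimal decoder-side sketch. It is not the JCTVC-N0247 reference implementation: the function names, the (mode, index, run) triple format and the convention that a run counts the total number of samples covered (including the current one) are assumptions made for illustration only.

```python
COPY_INDEX = 0  # hypothetical mode identifiers, not syntax element values
COPY_ABOVE = 1

def decode_index_map(runs, width, height):
    """Rebuild a palette index map from (mode, index, run) triples in raster order.

    Assumption: `run` is the total number of samples covered by the triple.
    """
    indices = []
    for mode, index, run in runs:
        for _ in range(run):
            pos = len(indices)
            if mode == COPY_INDEX:
                # Every sample in the run reuses the signaled palette index.
                indices.append(index)
            else:
                # Copy above: take the index at the same column of the row above.
                if pos < width:
                    raise ValueError("copy above mode needs a row above")
                indices.append(indices[pos - width])
    if len(indices) != width * height:
        raise ValueError("runs do not cover the block exactly")
    return [indices[r * width:(r + 1) * width] for r in range(height)]

def indices_to_pixels(index_map, palette_y, palette_u, palette_v):
    """Map each shared index i to (paletteY[i], paletteU[i], paletteV[i])."""
    return [[(palette_y[i], palette_u[i], palette_v[i]) for i in row]
            for row in index_map]
```

For a 4×2 block, the triples (COPY_INDEX, 2, 4), (COPY_ABOVE, None, 3), (COPY_INDEX, 0, 1) describe a first row filled with index 2, three samples copied from that row, and a final sample with index 0.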
Both “copy index mode” and “copy above mode” are referred to as copy modes for palette index coding in this disclosure. In addition, the palette mode is also referred to as palette coding mode in the following descriptions.
In JCTVC-N0247, the palette of each component is constructed and transmitted. The palette can be predicted (shared) from its left neighboring CU to reduce the bitrate. In JCTVC-O0218, each element in the palette is a triplet, which represents a specific combination of the three color components. Furthermore, the predictive coding of the palette across CUs is removed.
Another palette coding technique similar to JCTVC-O0218 has also been disclosed. Instead of predicting the entire palette table from the left CU, each individual palette color entry in a palette is predicted from the exact corresponding palette color entry in the above CU or left CU.
For transmission of pixel palette index values, a predictive coding method is applied on the indices as disclosed in JCTVC-O0182 (Guo et al., “AHG8. Major-color-based screen content coding”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 15th Meeting: Geneva, CH, 23 Oct.-1 Nov. 2013, Document: JCTVC-O0182). Three types of line modes, i.e., horizontal mode, vertical mode and normal mode, are used for coding each index line. In the horizontal mode, all the indices in the same line have the same value. If the value is the same as the first pixel of the above pixel line, only line mode signaling bits are transmitted. Otherwise, the index value is also transmitted. The vertical mode indicates that the current index line is the same as the above index line. Therefore, only line mode signaling bits are transmitted. In normal mode, indices in a line are predicted individually. For each index position, the left or above neighbor is used as a predictor, and the prediction symbol is transmitted to the decoder.
Furthermore, pixels are classified into major color pixels (with palette indices pointing to the palette colors) and escape pixels according to JCTVC-O0182. For major color pixels, the pixel value is reconstructed according to the major color index (i.e., palette index) and the palette table at the decoder side. For escape pixels, the pixel value is further signaled in the bitstream.
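The distinction between major color pixels and escape pixels can be sketched as follows. The convention that an index equal to the palette size marks an escape pixel, and the function and parameter names, are assumptions for illustration; the text above does not state how JCTVC-O0182 marks escape pixels.

```python
def reconstruct_pixels(indices, palette, escape_values):
    """Rebuild pixel values from palette indices.

    Assumption (hypothetical convention): an index equal to len(palette)
    marks an escape pixel whose value is read from `escape_values`, which
    stands in for the values signaled explicitly in the bitstream.
    """
    escapes = iter(escape_values)
    pixels = []
    for idx in indices:
        if 0 <= idx < len(palette):
            # Major color pixel: look the color up in the palette table.
            pixels.append(palette[idx])
        elif idx == len(palette):
            # Escape pixel: take the next explicitly signaled value.
            pixels.append(next(escapes))
        else:
            raise ValueError("palette index out of range")
    return pixels
```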
Palette Table Signaling
In the reference software of the screen content coding (SCC) standard, SCM-2.0 (Joshi et al., Screen content coding test model 2 (SCM 2), Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 18th Meeting: Sapporo, JP, July 2014, Document No.: JCTVC-R1014), an improved palette scheme is integrated in JCTVC-R0348 (Onno, et al., Suggested combined software and text for run-based palette mode, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 18th Meeting: Sapporo, JP, July 2014, Document No.: JCTVC-R0348). The palette table of the previous palette-coded CU is used as a predictor for current palette table coding. In palette table coding, the current palette table is signaled by choosing which palette colors in the previously coded palette table (palette predictor) are reused, or by transmitting new palette colors. The size of the current palette is set as the size of the predicted palette (i.e., numPredPreviousPalette) plus the size of the transmitted palette (i.e., num_signalled_palette_entries). The predicted palette is a palette derived from the previously reconstructed palette-coded CUs. When the current CU is coded in palette mode, those palette colors that are not predicted using the predicted palette are directly transmitted in the bitstream (i.e., signaled entries).
An example of palette updating is shown as follows. In this example, the current CU is coded in palette mode with a palette size equal to six. Three of the six major colors are predicted from the palette predictor (numPredPreviousPalette=3) and three are directly transmitted through the bitstream. The three transmitted colors can be signaled using the exemplary syntax shown below.
num_signalled_palette_entries = 3
for( cIdx = 0; cIdx < 3; cIdx++ ) // signal colors for different components
    for( i = 0; i < num_signalled_palette_entries; i++ )
        palette_entries[ cIdx ][ numPredPreviousPalette + i ]
Since the palette size is six in this example, the palette indices from 0 to 5 are used to indicate the major color entries in the palette color table. The three predicted palette colors are represented with indices 0 to 2. Accordingly, three new palette entries are transmitted for indices 3 through 5.
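The palette update in this example can be sketched as below. The per-entry `reuse_flags` list is a hypothetical stand-in for whatever mechanism selects the reused predictor colors, and the function name is likewise an assumption; only numPredPreviousPalette and num_signalled_palette_entries correspond to quantities named in the text.

```python
def build_current_palette(predictor, reuse_flags, signalled_entries):
    """Form the current palette from reused predictor colors plus new colors.

    Assumption: `reuse_flags[k]` is truthy when predictor entry k is reused.
    Reused colors take the lowest indices; signaled colors follow them.
    """
    predicted = [color for color, reused in zip(predictor, reuse_flags) if reused]
    num_pred_previous_palette = len(predicted)
    palette = predicted + list(signalled_entries)
    return palette, num_pred_previous_palette

# Six-entry palette: three colors reused from a five-entry predictor, three new.
predictor = ["P0", "P1", "P2", "P3", "P4"]
palette, num_pred = build_current_palette(
    predictor, [1, 0, 1, 1, 0], ["N0", "N1", "N2"])
```

With these inputs the predicted colors occupy indices 0 to 2 and the newly signaled colors occupy indices 3 to 5, matching the example above.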
In SCM-2.0, if the wavefront parallel processing (WPP) is not applied, the palette predictor table is initialized (reset) at the beginning of each slice or at the beginning of each tile. If the WPP is applied, the last coded palette table is not only initialized (reset) at the beginning of each slice or at the beginning of each tile, but also initialized (reset) at the beginning of each CTU row.
Wavefront Parallel Processing (WPP)
In HEVC, WPP is supported, where each row of Coding Tree Units (CTUs) can be processed in parallel as sub-streams by multiple encoding or decoding threads. In order to limit the degradation of coding efficiency, a wavefront pattern of processing order ensures that dependencies on spatial neighbors are not changed. On the other hand, at the start of each CTU row, the CABAC states are initialized based on the CABAC states of the synchronization point in the upper CTU row. For example, the synchronization point can be the last CU of the second CTU from the upper CTU row as shown in FIG. 1, where the parallel processing is applied to CTU rows. Furthermore, it is assumed in this example that the palette coding of each current CTU (marked as “X” in FIG. 1) depends on its left, above-left, above and above-right CTUs. For the top CTU row, the palette processing is dependent on the left CTU only. Moreover, the CABAC engine is flushed at the end of each CTU row and byte alignment is enforced at the end of each sub-stream. The entry points of WPP sub-streams are signaled as byte offsets in the slice header of the slice that contains the wavefront.
In FIG. 1, each block stands for one CTU and there are four CTU rows in a picture. Each CTU row forms a wavefront sub-stream that can be processed independently by an encoding or a decoding thread. The “X” symbols represent the current CTU under processing for the multiple threads. Since a current CTU has a dependency on the above-right CTU, the processing of the current CTU has to wait for the completion of the above-right CTU. Therefore, there must be a delay of two CTUs between two processing threads of neighboring CTU rows so that the data dependency (e.g., spatial pixels and motion vectors (MVs)) can be preserved. In addition, the CABAC states of the first CTU of each CTU row are initialized with the states obtained after the second CTU of the upper CTU row is processed. For example, the first CU (indicated by “p1”) of the first CTU in the second CTU row is initialized after the last CU (indicated by “p2”) in the second CTU of the above CTU row is processed. The dependency is indicated by a curved arrow line pointing from “p1” to “p2”. Similar dependency for the first CTU of each CTU row is indicated by the curved arrows. This allows for a quicker learning of the probabilities along the first column of CTUs than using the slice initialization states for each CTU row. Since the second CTU of the upper CTU row is always available to the current CTU row, parallel processing can be achieved using this wavefront structure. For each current CTU, the processing depends on the left CTU. Therefore, it has to wait until the last CU of the left CTU is processed. As shown in FIG. 1, a first CU (indicated by “p3”) in a current CTU has to wait for the last CU (indicated by “p4”) of the left CTU to finish. Again, the dependency is indicated by a curved arrow line pointing from “p3” to “p4”. Similar dependency on the left CTU is indicated by curved arrows for the CTU being processed (indicated by “X”).
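The two-CTU delay between neighboring CTU rows can be checked with a small scheduling sketch. It assumes, purely for illustration, that every CTU takes one time unit, that one thread is available per CTU row, and that the last CTU of a row (which has no above-right neighbor) waits for the CTU directly above it; the function name is hypothetical.

```python
def wavefront_start_times(rows, cols):
    """Earliest start time of each CTU under the wavefront dependencies.

    Assumptions: unit processing time per CTU; a CTU waits for its left CTU
    and for the above-right CTU (clamped to the last column of the row above).
    """
    start = [[0] * cols for _ in range(rows)]
    finish = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = 0
            if c > 0:
                ready = max(ready, finish[r][c - 1])  # left CTU
            if r > 0:
                # above-right CTU, which is also the CABAC synchronization
                # point (second CTU of the upper row) when c == 0
                ready = max(ready, finish[r - 1][min(c + 1, cols - 1)])
            start[r][c] = ready
            finish[r][c] = ready + 1
    return start

start = wavefront_start_times(4, 6)
```

Under these assumptions the CTU in row r and column c starts at time 2r + c, so each row trails the row above it by two CTUs.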
Intra Block Copy
A new Intra coding mode, named Intra-block copy (IntraBC), has been used. The IntraBC technique was originally proposed by Budagavi in AHG8. Video coding using Intra motion compensation, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 13th Meeting: Incheon, KR, 18-26 Apr. 2013, Document: JCTVC-M0350 (hereinafter JCTVC-M0350). An example according to JCTVC-M0350 is shown in FIG. 2, where a current coding unit (CU, 210) is coded using Intra MC (motion compensation). The prediction block (220) is located using the position of the current CU and a displacement vector (212). In this example, the search area is limited to the current CTU (coding tree unit), the left CTU and the left-left CTU. The prediction block is obtained from the already reconstructed region. Then, the displacement vector (i.e., MV) and the residual for the current CU are coded. It is well known that HEVC adopts the CTU and CU block structure as basic units for coding video data. Each picture is divided into CTUs and each CTU is recursively divided into CUs. During the prediction phase, each CU may be divided into multiple blocks, named prediction units (PUs), for performing the prediction process. After the prediction residue is formed for each CU, the residue associated with each CU is divided into multiple blocks, named transform units (TUs), to apply a transform (such as the discrete cosine transform (DCT)).
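The way a prediction block is located from the current CU position and a displacement vector can be sketched as follows. The picture is modeled as a list of sample rows; the function name and parameters are assumptions, and the sketch omits sample clipping, search area restrictions and any check that the reference samples are already reconstructed.

```python
def intra_bc_reconstruct(recon, x, y, bv, residual):
    """Reconstruct a block at (x, y) from a displaced block of the same picture.

    `recon` is a list of sample rows, `bv` is (bv_x, bv_y) and `residual` is a
    list of rows with the block dimensions. Assumption: the reference block is
    already reconstructed and does not overlap the current block.
    """
    height, width = len(residual), len(residual[0])
    ref_x, ref_y = x + bv[0], y + bv[1]
    if (ref_x < 0 or ref_y < 0 or
            ref_y + height > len(recon) or ref_x + width > len(recon[0])):
        raise ValueError("reference block lies outside the picture")
    # Copy the prediction first so that later writes cannot affect it.
    pred = [recon[ref_y + j][ref_x:ref_x + width] for j in range(height)]
    for j in range(height):
        for i in range(width):
            recon[y + j][x + i] = pred[j][i] + residual[j][i]
    return pred
```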
In JCTVC-M0350, the Intra MC is different from the motion compensation used for Inter prediction in at least the following areas:
MVs are restricted to be 1-D for Intra MC (i.e., either horizontal or vertical) while Inter prediction uses 2-D motion estimation. The MVs are also referred to as block vectors (BVs) for Intra copy prediction.
Binarization is fixed length for Intra MC while Inter prediction uses exponential-Golomb.
Intra MC introduces a new syntax element to signal whether the MV is horizontal or vertical.
Based on JCTVC-M0350, some modifications are disclosed by Pang, et al. in Non-RCE3: Intra Motion Compensation with 2-D MVs, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 14th Meeting: Vienna, AT, 25 July-2 Aug. 2013, Document: JCTVC-N0256 (hereinafter JCTVC-N0256). Firstly, the Intra MC is extended to support 2-D MVs, so that both MV components can be non-zero at the same time. This provides more flexibility to Intra MC than the original approach, where the MV is restricted to be strictly horizontal or vertical.
In JCTVC-N0256, two MV coding methods were disclosed:
Method 1: Motion vector prediction. The left or above MV is selected as the MV predictor and the resulting motion vector difference (MVD) is coded. A flag is used to indicate whether the MVD is zero. When the MVD is not zero, exponential-Golomb codes of the 3rd order are used to code the remaining absolute level of the MVD. Another flag is used to code the sign.
Method 2: No motion vector prediction. The MV is coded using the exponential-Golomb codes that are used for MVD in HEVC.
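The binarization described for Method 1 can be illustrated with the textbook order-k exponential-Golomb construction. The flag polarities, the bin order and the choice of coding |MVD| - 1 as the remaining absolute level are assumptions made for this sketch; the exact bin conventions of a given codec may differ.

```python
def exp_golomb(value, k):
    """Textbook order-k exponential-Golomb code of a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    x = value + (1 << k)
    # Leading zeros announce how many bits follow beyond the k + 1 minimum.
    prefix_len = x.bit_length() - k - 1
    return "0" * prefix_len + format(x, "b")

def binarize_mvd_component(mvd):
    """Hypothetical Method 1 binarization of one MVD component.

    Assumed layout: zero flag ("1" means zero), then the order-3 code of
    |mvd| - 1, then a sign flag ("1" means negative).
    """
    if mvd == 0:
        return "1"
    sign = "1" if mvd < 0 else "0"
    return "0" + exp_golomb(abs(mvd) - 1, 3) + sign
```

Small magnitudes stay short under order 3: any remaining level from 0 to 7 costs four bits, and the code length grows by two bits each time the range doubles.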
Another difference disclosed in JCTVC-N0256 is that the 2-D Intra MC is further combined with the pipeline-friendly approach:
1. No interpolation filters are used,
2. MV search area is restricted. Two cases are disclosed:
a. Search area is the current CTU and the left CTU, or
b. Search area is the current CTU and the rightmost 4 column samples of the left CTU.
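The two search area cases can be expressed as a simple range check on the reference block. This is a sketch under stated assumptions: the function name and parameters are hypothetical, the block is required to lie entirely inside the allowed area, and no check is made that the referenced samples are already reconstructed.

```python
def bv_in_search_area(x, y, width, height, bv, ctu_size, left_columns=None):
    """Check that the reference block stays inside the allowed search area.

    left_columns=None models case a (current CTU plus the whole left CTU);
    left_columns=4 models case b (current CTU plus the rightmost 4 sample
    columns of the left CTU).
    """
    ctu_x0 = (x // ctu_size) * ctu_size  # top-left corner of the current CTU
    ctu_y0 = (y // ctu_size) * ctu_size
    span = ctu_size if left_columns is None else left_columns
    min_x = max(0, ctu_x0 - span)        # leftmost allowed sample column
    max_x = ctu_x0 + ctu_size            # exclusive right bound
    ref_x, ref_y = x + bv[0], y + bv[1]
    return (min_x <= ref_x and ref_x + width <= max_x and
            ctu_y0 <= ref_y and ref_y + height <= ctu_y0 + ctu_size)
```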
Among the proposed methods in JCTVC-N0256, the 2-D Intra MC, the removal of interpolation filters, and the search area constraint to the current CTU and the left CTU have been adopted in a new version of the draft standard. The CU level syntax corresponding to JCTVC-N0256 has been incorporated in High Efficiency Video Coding (HEVC) Range Extension text specification: Draft 4 (RExt Draft 4) (Flynn, et al., Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 14th Meeting: Vienna, AT, 25 Jul.-2 Aug. 2013, Document: JCTVC-N1005).
Furthermore, full-frame IntraBC has been disclosed in JCTVC-Q0031 (Draft text of screen content coding technology proposal by Qualcomm, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 17th Meeting: Valencia, ES, 27 Mar.-4 Apr. 2014, Document: JCTVC-Q0031) and JCTVC-Q0035 (Description of screen content coding technology proposal by Microsoft, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 17th Meeting: Valencia, ES, 27 Mar.-4 Apr. 2014, Document: JCTVC-Q0035). Full-frame IntraBC removes the search area constraints to further improve the coding efficiency of IntraBC. Therefore, all of the reconstructed blocks can be referenced by the current CU, which introduces a data dependency between the current CU and all previously coded CUs. While full-frame IntraBC outperforms the original IntraBC, the data dependency prevents the use of parallel processing during the decoding process, especially when tile processing or wavefront parallel processing (WPP) is enabled in HEVC.
Palette Index Map Scan Order
In SCM-2.0 palette mode coding, the traverse scan is used for index map coding as shown in FIG. 3. FIG. 3 shows a traverse scan for an 8×8 block. In traverse scan, the scan for even rows is from left to right, and the scan for odd rows is from right to left when the scanning order is horizontal. The traverse scan can also be applied in the vertical direction, where the scan is from top to bottom for even columns and from bottom to top for odd columns. The traverse scan is applied for all block sizes in palette mode.
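The traverse scan order can be generated with a short sketch. The function name and the (row, column) coordinate convention are assumptions for illustration; rows and columns are numbered from zero, so row 0 and column 0 count as even.

```python
def traverse_scan(width, height, vertical=False):
    """Return the (row, col) visiting order of a traverse scan.

    Horizontal: even rows run left to right, odd rows right to left.
    Vertical: even columns run top to bottom, odd columns bottom to top.
    """
    order = []
    if not vertical:
        for row in range(height):
            cols = range(width) if row % 2 == 0 else range(width - 1, -1, -1)
            order.extend((row, col) for col in cols)
    else:
        for col in range(width):
            rows = range(height) if col % 2 == 0 else range(height - 1, -1, -1)
            order.extend((row, col) for row in rows)
    return order
```

Consecutive positions in this order are always spatial neighbors, unlike a raster scan, which jumps from the end of one row to the start of the next.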
It is desirable to develop methods for further improving the coding efficiency or lowering the complexity for syntax elements generated in the palette mode.