In High Efficiency Video Coding (HEVC), a merge mode for inter-picture prediction is introduced. A merge candidate list of candidate motion parameters from neighboring blocks is constructed, and an index is then signaled that identifies the candidate to be used. Merge mode also allows for temporal prediction by including in the list a candidate obtained from previously coded pictures. In HEVC, the merge candidate list is constructed from the following candidates: up to four spatial merge candidates derived from five spatial neighboring blocks, one temporal merge candidate obtained from a previously coded picture, and additional merge candidates including combined bi-predictive candidates and zero motion vector candidates.
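The list construction described above can be sketched as follows. This is a minimal illustration, not the normative HEVC derivation: the `Candidate` tuple, the function name `build_merge_list`, and the default list size of 5 are assumptions made for the example, duplicates are removed by a full comparison rather than the limited pairwise checks a real codec uses, and combined bi-predictive candidates are omitted.

```python
from typing import List, Optional, Tuple

# Hypothetical candidate model: (mv_x, mv_y, ref_idx). Real motion data also
# carries the prediction direction and a second vector for bi-prediction.
Candidate = Tuple[int, int, int]


def build_merge_list(spatial: List[Optional[Candidate]],
                     temporal: Optional[Candidate],
                     max_candidates: int = 5) -> List[Candidate]:
    """Build a simplified merge candidate list.

    `spatial` holds the five neighboring positions in checking order;
    None marks a neighbor that is unavailable or intra coded.
    """
    merge_list: List[Candidate] = []

    # Up to four spatial candidates are taken from the five neighbors.
    for cand in spatial:
        if len(merge_list) == 4:
            break
        if cand is None or cand in merge_list:
            continue  # skip unavailable neighbors and duplicates
        merge_list.append(cand)

    # The temporal candidate comes from a previously coded picture.
    if temporal is not None and len(merge_list) < max_candidates:
        merge_list.append(temporal)

    # Pad with zero motion vector candidates (simplified: the reference
    # index just counts upward).
    ref_idx = 0
    while len(merge_list) < max_candidates:
        merge_list.append((0, 0, ref_idx))
        ref_idx += 1

    return merge_list


def select_candidate(merge_list: List[Candidate], merge_idx: int) -> Candidate:
    """The signaled merge index simply picks one entry of the list."""
    return merge_list[merge_idx]
```

Because the encoder and the decoder build the same list from already decoded data, only the index has to be transmitted.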
In HEVC, a skip mode is used to indicate for a block that the motion data is inferred instead of explicitly signaled and that the prediction residual is zero, i.e., no transform coefficients are transmitted. In HEVC, at the beginning of each CU in an inter-picture prediction slice, a skip_flag is signaled that implies the following: the CU contains only one PU (2N×2N), the merge mode is used to derive the motion data, and no residual data is present in the bitstream.
In the current VVC (Versatile Video Coding) development, some new merge candidates were introduced. The sub-CU modes are enabled as additional merge candidates, and no additional syntax element is required to signal the modes. Two additional merge candidates are added to the merge candidate list of each CU to represent the ATMVP mode and the STMVP mode. Up to seven merge candidates are used if the sequence parameter set indicates that ATMVP and STMVP are enabled. The encoding logic of the additional merge candidates is the same as for the merge candidates in HEVC, which means that, for each CU in a P or B slice, two more RD checks are needed for the two additional merge candidates. In JEM, the order of the inserted merge candidates is A, B, C, D, ATMVP, STMVP, E (when there are fewer than 6 merge candidates in the list), TMVP, combined bi-predictive candidates, and zero motion vector candidates.
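The JEM insertion order can be illustrated with a short sketch. It works on candidate labels only; the function name `jem_merge_order`, the `available` set, and the `"ZERO"` padding label are hypothetical, and the combined bi-predictive stage is left out for brevity.

```python
from typing import List, Set


def jem_merge_order(available: Set[str], max_candidates: int = 7) -> List[str]:
    """Return candidate labels in the insertion order described above."""
    order = ["A", "B", "C", "D", "ATMVP", "STMVP", "E", "TMVP"]
    merge_list: List[str] = []

    for name in order:
        if len(merge_list) >= max_candidates:
            break
        if name not in available:
            continue
        # E is inserted only when the list holds fewer than 6 candidates.
        if name == "E" and len(merge_list) >= 6:
            continue
        merge_list.append(name)

    # Fill the remaining slots (here with zero motion vector candidates only).
    while len(merge_list) < max_candidates:
        merge_list.append("ZERO")

    return merge_list
```

For example, when A, B, C, D, ATMVP, and STMVP are all available, the list already holds six entries when E is reached, so E is skipped and TMVP takes the seventh slot.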
In the current VVC, all bins of the merge index are context coded by CABAC, whereas in HEVC only the first bin is context coded and the remaining bins are bypass coded. In the current VVC, the maximum number of merge candidates is 7.
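The difference between the two bin-coding schemes can be sketched as below. The sketch assumes a truncated unary binarization of the merge index, which the text above does not state, and it only labels each bin with its coding mode; no arithmetic coding is performed. The function name and the scheme labels `"hevc"` and `"vvc_draft"` are hypothetical.

```python
from typing import List, Tuple


def merge_index_bins(merge_idx: int, max_candidates: int,
                     scheme: str) -> List[Tuple[int, str]]:
    """Return (bin value, coding mode) pairs for a merge index.

    Assumes truncated unary binarization: `merge_idx` ones followed by a
    terminating zero, which is dropped for the largest possible index.
    """
    if scheme not in ("hevc", "vvc_draft"):
        raise ValueError("unknown scheme: " + scheme)
    c_max = max_candidates - 1
    if not 0 <= merge_idx <= c_max:
        raise ValueError("merge index out of range")

    values = [1] * merge_idx
    if merge_idx < c_max:
        values.append(0)

    bins = []
    for position, value in enumerate(values):
        if scheme == "vvc_draft" or position == 0:
            mode = "context"   # coded with an adaptive probability model
        else:
            mode = "bypass"    # coded with a fixed probability of 0.5
        bins.append((value, mode))
    return bins
```

With seven candidates, the largest index produces six bins, so context coding every bin, as described above, means up to six context-coded bins per merge index instead of one.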
In some techniques, a scheme searches for candidate motion vectors among previously coded blocks, with a step size of one 8×8 block. It defines the nearest spatial neighbors, i.e., the immediate top row, left column, and top-right corner, as category 1. The outer regions (at most three 8×8 blocks away from the current block boundary) and the collocated blocks in the previously coded frame are classified as category 2. Neighboring blocks that are predicted from different reference frames or are intra coded are pruned from the list. Each remaining reference block is then assigned a weight that is related to its distance to the current block.
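The classification, pruning, and weighting steps can be sketched as follows. The text does not give the weight formula, so the inverse-distance weight used here is purely a placeholder. The `RefBlock` fields, the function names, the reading of "different reference frames" as "different from the current block's reference frame", and the treatment of a collocated block as distance 0 are likewise assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RefBlock:
    name: str
    distance: int             # gap to the current block boundary, in 8x8 blocks (0 = adjacent)
    ref_frame: Optional[int]  # None marks an intra-coded block
    collocated: bool = False  # True for a block from the previously coded frame


def category(block: RefBlock) -> int:
    """Category 1: nearest spatial neighbors; category 2: outer and collocated blocks."""
    if not block.collocated and block.distance == 0:
        return 1
    return 2


def prune_and_weight(blocks: List[RefBlock], current_ref_frame: int,
                     max_distance: int = 3) -> List[Tuple[str, int, float]]:
    """Drop unusable blocks and attach a category and a distance-based weight."""
    result = []
    for block in blocks:
        # Intra-coded blocks and blocks using another reference frame are pruned.
        if block.ref_frame is None or block.ref_frame != current_ref_frame:
            continue
        # Blocks beyond the search range (three 8x8 blocks) are not considered.
        if block.distance > max_distance:
            continue
        # Placeholder weight: closer blocks count more. Not from the source.
        weight = 1.0 / (1 + block.distance)
        result.append((block.name, category(block), weight))
    return result
```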
The additional merge candidates are a direct extension of the NEXT merge candidates. The left, above, bottom-left, above-right, and top-left candidates that are not immediately adjacent to the current block are checked.
As an example, the top-left corner of a reference block has an offset of (−96, −96) relative to the current block. Each candidate Bi or Ci is offset by the block height in the vertical direction relative to the previous B or C candidate. Each candidate Ai or Di is offset by the block width in the horizontal direction relative to the previous A or D candidate. Each candidate Ei is offset by the block width in the horizontal direction and by the block height in the vertical direction relative to the previous E candidate. The candidates are checked from the inside to the outside, in the order Ai, Bi, Ci, Di, and Ei.
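The candidate positions described above can be generated with a short sketch. Several points are assumptions rather than statements from the text: the adjacent base positions follow the usual HEVC neighbor layout (A left, B above, C above right, D bottom left, E top left), the (−96, −96) offset is treated as the limit of the search range, and the checking order is read as ring by ring, with A, B, C, D, E inside each ring. The function name `extended_positions` is hypothetical.

```python
from typing import List, Tuple


def extended_positions(x: int, y: int, w: int, h: int,
                       max_offset: int = 96) -> List[Tuple[str, Tuple[int, int]]]:
    """List non-adjacent candidate sample positions, innermost ring first.

    (x, y) is the top-left sample of the current block; w and h are its size.
    """
    if w <= 0 or h <= 0:
        raise ValueError("block width and height must be positive")

    # Assumed positions of the adjacent candidates (ring 0).
    base = {
        "A": (x - 1, y + h - 1),  # left
        "B": (x + w - 1, y - 1),  # above
        "C": (x + w, y - 1),      # above right
        "D": (x - 1, y + h),      # bottom left
        "E": (x - 1, y - 1),      # top left
    }
    # Per-ring step: A and D move left, B and C move up, E moves diagonally.
    step = {"A": (w, 0), "B": (0, h), "C": (0, h), "D": (w, 0), "E": (w, h)}

    positions = []
    ring = 1  # ring 0 holds the adjacent candidates, which are not repeated
    while True:
        found = []
        for name in "ABCDE":
            px = base[name][0] - ring * step[name][0]
            py = base[name][1] - ring * step[name][1]
            # Keep only positions inside the assumed search range.
            if px >= x - max_offset and py >= y - max_offset:
                found.append((name + str(ring), (px, py)))
        if not found:
            break
        positions.extend(found)
        ring += 1
    return positions
```

A caller would still need to check that each position lies inside the picture and belongs to an already coded, inter-predicted block before using its motion data.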