In High Efficiency Video Coding (HEVC), a merge mode for inter-picture prediction is introduced. A merge candidate list of candidate motion parameters from neighboring blocks is generated. Then, an index is signaled which identifies the candidate to be used. Merge mode also allows for temporal prediction by including in the list a candidate obtained from previously coded pictures. Referring to FIG. 1, in HEVC, a merge candidate list for a current block (100) is generated based on one or more spatial merge candidates (101), (102), (103), (104), and/or (105), one temporal merge candidate derived from two temporal co-located blocks, and/or additional merge candidates including combined bi-predictive candidates and zero motion vector candidates.
In HEVC, a skip mode is used to indicate for a block that the motion data is inferred instead of explicitly signaled and that the prediction residual is zero, i.e., no transform coefficients are transmitted. In HEVC, at the beginning of each coding unit (CU) in an inter-picture prediction slice, a skip_flag is signaled that implies the following: the CU only contains one prediction unit (PU) (e.g., 2N×2N), the merge mode is used to derive the motion data, and/or no residual data is present in the bitstream.
In Joint Exploration Model 7 (JEM 7), which is the test model software studied by the Joint Video Exploration Team (JVET), some new merge candidates are introduced. The sub-CU modes are enabled as additional merge candidates, and no additional syntax element is required to signal the modes. Two additional merge candidates are added to the merge candidate list of each CU to represent the alternative-temporal motion vector prediction (ATMVP) mode and the spatial-temporal motion vector prediction (STMVP) mode.
Up to seven merge candidates are used if the sequence parameter set indicates that ATMVP mode and STMVP mode are enabled. The encoding logic of the additional merge candidates is the same as for the merge candidates in HEVC, which means that, for each CU in a predicted (P) or bi-directional predicted (B) slice, two more rate-distortion (RD) checks are needed for the two additional merge candidates. In JEM 7, the order of the inserted merge candidates is A, B, C, D, ATMVP, STMVP, E (when the list contains fewer than 6 merge candidates), temporal motion vector prediction (TMVP), combined bi-predictive candidates, and zero motion vector candidates.
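The insertion order described above can be sketched as follows. This is a minimal illustration and not the JEM 7 reference software: the function name, the dictionary-based input, and the (mvx, mvy) tuple representation are all hypothetical, pruning of duplicate candidates is not modeled, and combined bi-predictive candidates are omitted, so the list is padded directly with zero motion vector candidates.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical motion representation: a single (mvx, mvy) pair.
MotionVector = Tuple[int, int]


def build_jem7_merge_list(
    available: Dict[str, Optional[MotionVector]],
    max_candidates: int = 7,
) -> List[Tuple[str, MotionVector]]:
    """Assemble a merge list in the order A, B, C, D, ATMVP, STMVP,
    E (only if fewer than 6 entries so far), TMVP, then zero candidates.

    `available` maps a candidate name to its motion vector, or to None
    (or omits the name) when that candidate is unavailable.
    """
    merge_list: List[Tuple[str, MotionVector]] = []

    def try_add(name: str) -> None:
        mv = available.get(name)
        if mv is not None and len(merge_list) < max_candidates:
            merge_list.append((name, mv))

    for name in ("A", "B", "C", "D", "ATMVP", "STMVP"):
        try_add(name)

    # E is considered only when the list holds fewer than 6 candidates.
    if len(merge_list) < 6:
        try_add("E")

    try_add("TMVP")

    # Combined bi-predictive candidates are omitted in this sketch;
    # the remaining slots are filled with zero motion vector candidates.
    while len(merge_list) < max_candidates:
        merge_list.append(("ZERO", (0, 0)))

    return merge_list
```

When every named candidate is available, the six entries A through STMVP fill the list far enough that E is skipped and TMVP takes the seventh slot. When only a few candidates are available, E is admitted and zero candidates pad the remainder.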
In JEM 7, all bins of the merge index are context coded by context-adaptive binary arithmetic coding (CABAC), whereas in HEVC only the first bin is context coded and the remaining bins are bypass coded. In JEM 7, the maximum number of merge candidates is 7.
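The difference in bin coding can be illustrated with a short sketch. It assumes that the merge index is binarized with a truncated unary code, which the text above does not state, and the function names and the "HEVC"/"JEM7" scheme labels are hypothetical. The sketch only labels each bin as context coded or bypass coded; it does not perform arithmetic coding.

```python
from typing import List


def binarize_merge_index(merge_idx: int, max_candidates: int) -> List[int]:
    """Truncated unary binarization (an assumption of this sketch):
    `merge_idx` ones followed by a terminating zero, where the zero is
    dropped for the largest possible index."""
    c_max = max_candidates - 1
    if not 0 <= merge_idx <= c_max:
        raise ValueError("merge_idx out of range")
    bins = [1] * merge_idx
    if merge_idx < c_max:
        bins.append(0)
    return bins


def bin_coding_modes(num_bins: int, scheme: str) -> List[str]:
    """Label each bin as 'context' or 'bypass' for the given scheme."""
    if scheme == "HEVC":
        # Only the first bin is context coded; the rest are bypass coded.
        return ["context" if i == 0 else "bypass" for i in range(num_bins)]
    if scheme == "JEM7":
        # All bins are context coded.
        return ["context"] * num_bins
    raise ValueError("unknown scheme: " + scheme)


# Example: index 2 with a maximum of 7 candidates.
example_bins = binarize_merge_index(2, 7)
hevc_modes = bin_coding_modes(len(example_bins), "HEVC")
jem_modes = bin_coding_modes(len(example_bins), "JEM7")
```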
FIG. 2 illustrates an example of the merge candidate list generation. For example, the scheme searches the candidate motion vectors from previously coded blocks, with a step size of 8×8 blocks. It defines the nearest spatial neighbors of a current block (200), i.e., immediate top row (201), left column (202), and top-right corner (203), as category 1. Other neighbors (204), (205), such as the outer regions (at most three 8×8 blocks away from the current block boundary) and the co-located blocks in the previously coded frame, are classified as category 2. The neighboring blocks that are predicted from different reference frames or are intra coded are pruned from the list. The remaining reference blocks are then each assigned a weight. The weight is related to the distance to the current block.
In an extended merge mode, the additional merge candidates will be a direct extension of the NEXT merge candidates. The left, above, left bottom, above right, and top left candidates that are not immediately next to the current block are checked. The detailed positions that are checked are shown in FIG. 1. A maximum number of merge candidates might be 10, as an example.
FIG. 3 illustrates a merge candidate from an outer region. For example, as shown in FIG. 3, the top left corner of the reference block has an offset of (−96, −96) to the current block. As shown by candidates (301), (302), and (303), each candidate B (i, j) or C (i, j) has an offset of 16 in the vertical direction compared to its previous B or C candidates. As shown by candidates (304), (305), and (306), each candidate A (i, j) or D (i, j) has an offset of 16 in the horizontal direction compared to its previous A or D candidates. As shown by candidates (307), (308), and (309), each candidate E (i, j) has an offset of 16 in both the horizontal direction and the vertical direction as compared to its previous E candidates. The candidates are checked from inside to the outside, and the order of the candidates is A (i, j), B (i, j), C (i, j), D (i, j), and E (i, j).
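The checking order and per-step offsets described above can be sketched as follows. Several points are assumptions made for illustration only: the coordinate convention (negative x is to the left, negative y is upward), the direction in which each candidate type moves, the notion of a "ring" index counting steps outward, and the function name. The offsets are displacements relative to the corresponding candidate position immediately next to the current block, not absolute sample positions.

```python
from typing import List, Tuple

STEP = 16  # offset between successive candidates of the same type

# Assumed direction of travel per candidate type: A and D move
# horizontally, B and C move vertically, E moves in both directions.
DIRECTION = {
    "A": (-1, 0),
    "B": (0, -1),
    "C": (0, -1),
    "D": (-1, 0),
    "E": (-1, -1),
}


def outer_candidate_offsets(
    num_rings: int, step: int = STEP
) -> List[Tuple[str, int, Tuple[int, int]]]:
    """List (label, ring, (dx, dy)) in checking order: from the inside
    to the outside, and within each ring in the order A, B, C, D, E."""
    order = []
    for ring in range(1, num_rings + 1):
        for label in ("A", "B", "C", "D", "E"):
            dx, dy = DIRECTION[label]
            order.append((label, ring, (dx * step * ring, dy * step * ring)))
    return order


offsets = outer_candidate_offsets(6)
```

Under these assumptions, six steps of 16 place the outermost E candidate at a displacement of (−96, −96), which is consistent with the offset of the reference block's top left corner given above.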