In video coding systems, spatial and temporal redundancy is exploited using spatial and temporal prediction to reduce the information to be transmitted. The spatial and temporal prediction utilizes decoded pixels from the same picture and reference pictures respectively to form prediction for current pixels to be coded. In a conventional coding system, side information associated with spatial and temporal prediction may have to be transmitted, which will take up some bandwidth of the compressed video data. The transmission of motion vectors for temporal prediction may require a noticeable portion of the compressed video data, particularly in low-bitrate applications. Accordingly, motion vector prediction has been widely used in the field to reduce bitrate corresponding to the motion vector coding.
High-Efficiency Video Coding (HEVC) is a new international video coding standard that is being developed by the Joint Collaborative Team on Video Coding (JCT-VC). HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit for compression, termed Coding Unit (CU), is a 2N×2N square block, and each CU can be recursively split into four smaller CUs until a predefined minimum size is reached. Each CU contains one or multiple Prediction Units (PUs). The PU sizes can be 2N×2N, 2N×N, 2N×nU, 2N×nD, N×2N, nL×2N, nR×2N, or N×N, where 2N×N, 2N×nU, 2N×nD and N×2N, nL×2N, nR×2N correspond to horizontal and vertical partition of a 2N×2N PU with symmetric or asymmetric PU size division respectively.
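The eight PU shapes listed above can be tabulated as width and height pairs. The following is a minimal sketch, assuming the asymmetric modes (2N×nU, 2N×nD, nL×2N, nR×2N) split the CU at one quarter of its height or width; the function name `pu_partitions` and the mode strings are hypothetical labels chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical mode labels mirroring the PU sizes named in the text.
PU_MODES = ("2Nx2N", "2NxN", "2NxnU", "2NxnD", "Nx2N", "nLx2N", "nRx2N", "NxN")


def pu_partitions(cu_size: int, mode: str) -> List[Tuple[int, int]]:
    """Return the (width, height) of each PU for a 2Nx2N CU of side cu_size."""
    s = cu_size          # 2N
    half = cu_size // 2  # N
    quarter = cu_size // 4  # assumed asymmetric split point
    table = {
        "2Nx2N": [(s, s)],
        "2NxN": [(s, half), (s, half)],                  # symmetric horizontal split
        "2NxnU": [(s, quarter), (s, s - quarter)],       # small PU on top
        "2NxnD": [(s, s - quarter), (s, quarter)],       # small PU at bottom
        "Nx2N": [(half, s), (half, s)],                  # symmetric vertical split
        "nLx2N": [(quarter, s), (s - quarter, s)],       # small PU on the left
        "nRx2N": [(s - quarter, s), (quarter, s)],       # small PU on the right
        "NxN": [(half, half)] * 4,
    }
    return table[mode]
```

For a 32×32 CU, `pu_partitions(32, "2NxnU")` yields a 32×8 PU above a 32×24 PU; in every mode the PU areas sum to the CU area.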
To further increase the coding efficiency of motion vector coding in HEVC, the motion vector competition (MVC) based scheme is applied to select one motion vector predictor (MVP) among a given MVP candidate set which includes spatial and temporal MVPs. There are three inter-prediction modes including Inter, Skip, and Merge in the HEVC test model version 3.0 (HM-3.0). The Inter mode performs motion-compensated prediction with transmitted Motion Vector Differences (MVDs) that can be used together with MVPs for deriving motion vectors (MVs).
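The relationship between MVP, MVD, and MV in the Inter mode can be sketched as follows. This is an illustrative sketch only; the function name `reconstruct_mv` and the tuple representation of a motion vector are assumptions, not part of HM-3.0.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) components, assumed representation


def reconstruct_mv(candidates: List[MV], mvp_index: int, mvd: MV) -> MV:
    """Decoder-side reconstruction: MV = MVP + MVD.

    candidates: the MVP candidate set, built identically at encoder and decoder.
    mvp_index:  the signaled index of the selected predictor.
    mvd:        the transmitted motion vector difference.
    """
    mvp = candidates[mvp_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

A good predictor keeps the MVD small, so fewer bits are spent on motion vector coding.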
The Skip and Merge modes utilize motion inference methods (MV=MVP+MVD where MVD is zero) to obtain the motion information from spatial neighboring blocks (spatial candidates) or temporal blocks (temporal candidates) located in a co-located picture. The co-located picture is the first reference picture in list 0 or list 1, which is signaled in the slice header.
When a PU is coded in either Skip or Merge mode, no motion information is transmitted except for the index of the selected candidate. In the case of a Skip PU, the residual signal is also omitted. For the Inter mode in HM-3.0, the Advanced Motion Vector Prediction (AMVP) scheme is used to select a motion vector predictor among an AMVP candidate set including two spatial MVPs and one temporal MVP. For the Merge and Skip modes in HM-3.0, the Merge scheme is used to select a motion vector predictor among a Merge candidate set containing four spatial MVPs and one temporal MVP.
For the Inter mode, the reference picture index is explicitly transmitted to the decoder. The MVP is then selected among the candidate set for a given reference picture index. FIG. 1 illustrates the MVP candidate set for the Inter mode according to HM-3.0, where the MVP candidate set includes two spatial MVPs and one temporal MVP:
1. Left predictor (the first available MV from A0 and A1),
2. Top predictor (the first available MV from B0, B1, and Bn+1), and
3. Temporal predictor (the first available MV from TBR and TCTR).
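The "first available MV" selection for the three predictors can be sketched as follows. This is a simplified illustration that models availability only as the presence of an MV at a named position; the dictionary layout and the function names are assumptions, and any reference-picture checks performed by an actual codec are omitted.

```python
from typing import Dict, List, Optional, Sequence, Tuple

MV = Tuple[int, int]

# Search order per predictor, taken from the list above.
INTER_GROUPS = (("A0", "A1"), ("B0", "B1", "Bn+1"), ("TBR", "TCTR"))


def first_available(neighbors: Dict[str, Optional[MV]],
                    order: Sequence[str]) -> Optional[MV]:
    """Return the MV of the first position in `order` that has one, else None."""
    for name in order:
        mv = neighbors.get(name)
        if mv is not None:
            return mv
    return None


def inter_candidates(neighbors: Dict[str, Optional[MV]]) -> List[MV]:
    """Collect the left, top, and temporal predictors that are available."""
    found = (first_available(neighbors, group) for group in INTER_GROUPS)
    return [mv for mv in found if mv is not None]
```

With A0 unavailable and A1 holding an MV, the left predictor falls back to A1; a predictor whose positions are all unavailable contributes no candidate.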
A temporal predictor is derived from a block (TBR or TCTR) in a co-located picture, where the co-located picture is the first reference picture in list 0 or list 1. The block associated with the temporal MVP may have two MVs: one MV from list 0 and one MV from list 1. The temporal MVP is derived from the MV from list 0 or list 1 according to the following rule:
1. The MV that crosses the current picture is chosen first, and
2. If both MVs cross the current picture or both do not cross, the MV with the same reference list as the current list will be chosen.
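The two-step selection rule above can be sketched as follows. The sketch assumes that an MV "crosses the current picture" when the current picture lies, in picture order count (POC), strictly between the co-located picture and the picture that the MV refers to; this interpretation, the `ColMV` type, and the handling of a block with only one MV are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ColMV:
    mv: Tuple[int, int]
    ref_poc: int  # POC of the picture this MV points to (assumed field)


def crosses(cur_poc: int, col_poc: int, ref_poc: int) -> bool:
    """True if the current picture lies strictly between col and ref pictures."""
    return (col_poc - cur_poc) * (ref_poc - cur_poc) < 0


def select_temporal_mv(mv_l0: Optional[ColMV], mv_l1: Optional[ColMV],
                       cur_poc: int, col_poc: int,
                       cur_list: int) -> Optional[Tuple[int, int]]:
    """Pick the list-0 or list-1 MV of the co-located block as temporal MVP."""
    if mv_l0 is None and mv_l1 is None:
        return None
    if mv_l0 is None or mv_l1 is None:
        # Assumption: with a single MV there is nothing to choose between.
        return (mv_l0 or mv_l1).mv
    c0 = crosses(cur_poc, col_poc, mv_l0.ref_poc)
    c1 = crosses(cur_poc, col_poc, mv_l1.ref_poc)
    if c0 != c1:
        # Rule 1: exactly one MV crosses the current picture.
        return (mv_l0 if c0 else mv_l1).mv
    # Rule 2: tie, so follow the current reference list.
    return (mv_l0 if cur_list == 0 else mv_l1).mv
```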
In HM-3.0, if a particular block is encoded in the Merge mode, an MVP index is signaled to indicate which MVP among the MVP candidate set is used for this block to be merged. To follow the essence of motion information sharing, each merged PU reuses the MV, prediction direction, and reference picture index of the selected candidate. It is noted that if the selected MVP is a temporal MVP, the reference picture index is always set to the first reference picture. FIG. 2 illustrates the MVP candidate set for the Merge mode according to HM-3.0, where the MVP candidate set includes four spatial MVPs and one temporal MVP:
1. Left predictor (Am),
2. Top predictor (Bn),
3. Temporal predictor (the first available MV from TBR or TCTR),
4. Above-right predictor (B0), and
5. Below-left predictor (A0).
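The Merge candidate ordering and the inheritance of motion information can be sketched as follows. The `MotionInfo` type, the key `"T"` standing for the first available of TBR and TCTR, and the encoding of "first reference picture" as index 0 are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MotionInfo:
    mv: Tuple[int, int]
    pred_dir: str   # e.g. "L0", "L1", or "BI" (hypothetical labels)
    ref_idx: int


# Candidate order from the list above; "T" is the temporal predictor.
MERGE_ORDER = ("Am", "Bn", "T", "B0", "A0")


def merge_candidates(available: Dict[str, MotionInfo]) -> List[MotionInfo]:
    """Build the Merge candidate set in order, skipping unavailable positions."""
    out = []
    for name in MERGE_ORDER:
        info = available.get(name)
        if info is None:
            continue
        if name == "T":
            # A temporal candidate always refers to the first reference picture.
            info = replace(info, ref_idx=0)
        out.append(info)
    return out


def merged_motion(available: Dict[str, MotionInfo], index: int) -> MotionInfo:
    """A merged PU reuses MV, prediction direction, and reference index as-is."""
    return merge_candidates(available)[index]
```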
In HM-3.0, a process is utilized in both Inter and Merge modes to avoid an empty candidate set. The process adds a candidate with a zero MV to the candidate set when no candidate can be inferred in the Inter or Merge mode.
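The empty-set safeguard amounts to a single fallback step, sketched below with a hypothetical function name and a tuple representation of the MV.

```python
from typing import List, Sequence, Tuple

MV = Tuple[int, int]


def finalize_candidates(candidates: Sequence[MV]) -> List[MV]:
    """Return the candidate list, or a single zero MV if no candidate exists."""
    return list(candidates) if candidates else [(0, 0)]
```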
Based on the Rate-Distortion Optimization (RDO) decision, the encoder selects one final MVP within a given MVP candidate set for the Inter, Skip, or Merge mode and transmits the index of the selected MVP to the decoder after removing the redundant candidates. In the AMVP scheme, a temporal motion predictor is included in the candidate set of motion vector predictors (MVPs) to improve the coding efficiency. However, the use of temporal motion prediction also has a drawback: any parsing error associated with the temporal motion predictor may cause severe error propagation. When a motion vector of a previous picture cannot be decoded correctly, a mismatch between the candidate set on the encoder side and that on the decoder side may occur. This mismatch may result in a parsing error of the index of the best MVP candidate and cause the rest of the current picture to be parsed or decoded erroneously. Furthermore, this parsing error can affect subsequent inter pictures that allow temporal MVP candidates.
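The parsing hazard can be illustrated with a small sketch. It assumes that the index is binarized with a truncated unary code whose length depends on the number of candidates remaining after redundancy removal; the binarization and all function names are assumptions for illustration, since the text above does not specify them.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def prune(candidates: List[MV]) -> List[MV]:
    """Remove redundant (duplicate) candidates, keeping first occurrences."""
    seen, out = set(), []
    for mv in candidates:
        if mv not in seen:
            seen.add(mv)
            out.append(mv)
    return out


def encode_index(idx: int, n: int) -> str:
    """Truncated unary: idx ones, then a zero unless idx is the last index."""
    return "1" * idx + ("0" if idx < n - 1 else "")


def decode_index(bits: str, n: int) -> Tuple[int, int]:
    """Return (index, number of bits consumed) for a set of n candidates."""
    idx = pos = 0
    while idx < n - 1 and bits[pos] == "1":
        idx += 1
        pos += 1
    if idx < n - 1:
        pos += 1  # consume the terminating zero
    return idx, pos


# Encoder: the temporal MVP duplicates a spatial one and is pruned away.
enc_set = prune([(1, 0), (2, 0), (1, 0)])   # 2 candidates remain
# Decoder: the temporal MV was corrupted, so nothing is pruned.
dec_set = prune([(1, 0), (2, 0), (5, 5)])   # 3 candidates remain

stream = encode_index(1, len(enc_set)) + "0110"  # index bits, then later syntax
good = decode_index(stream, len(enc_set))
bad = decode_index(stream, len(dec_set))
```

In this example the mismatched decoder consumes one bit more than the encoder wrote for the index, so every syntax element that follows is read from the wrong bit position.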
In HEVC, a process is developed to compress the memory associated with MV information in a coded picture for temporal MVPs. The process of memory compression for MV information is termed Motion Data Storage Reduction (MDSR) in HEVC. In this method, the MV data of one block in an MDSR unit is used as the representative MV data for the entire MDSR unit, and the MV data of all other blocks in the MDSR unit are discarded.
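The storage reduction can be sketched as a decimation of the MV field. The sketch assumes that the top-left block of each MDSR unit serves as the representative and that the unit is a square of `unit` by `unit` blocks; both choices, and the function names, are assumptions for illustration, since the text above does not say which block is kept.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def compress_mv_field(mv_field: List[List[MV]], unit: int) -> List[List[MV]]:
    """Keep one representative MV (assumed: top-left block) per MDSR unit."""
    rows, cols = len(mv_field), len(mv_field[0])
    return [[mv_field[r][c] for c in range(0, cols, unit)]
            for r in range(0, rows, unit)]


def lookup_mv(compressed: List[List[MV]], row: int, col: int, unit: int) -> MV:
    """Fetch the representative MV for the block at (row, col)."""
    return compressed[row // unit][col // unit]
```

With a unit of 4 by 4 blocks, the stored MV data shrinks to one sixteenth of the original, and every block in a unit maps to the same representative MV when it later serves as a temporal MVP.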
The AMVP scheme derives MVP candidates from neighboring blocks in the same picture as well as co-located blocks from reference pictures. With the availability of these MVP candidates, a better prediction, i.e., smaller MVD, for an underlying MV may be achieved. However, during the AMVP process, an MVP may be redundant under certain circumstances. It is desirable to remove the redundancy in motion vector prediction to reduce complexity and/or to improve performance.