Video compression systems employ block processing for most of the compression operations. A block is a group of neighboring pixels and may be treated as one coding unit in terms of the compression operations. Theoretically, a larger coding unit is preferred to take advantage of correlation among immediately neighboring pixels. Various video compression standards, e.g., Moving Picture Experts Group (MPEG)-1, MPEG-2, and MPEG-4, use block sizes of 4×4, 8×8, and 16×16, with the 16×16 block referred to as a macroblock (MB).
High efficiency video coding (HEVC) is also a block-based hybrid spatial and temporal predictive coding scheme. HEVC partitions an input picture into square blocks referred to as coding tree units (CTUs) as shown in FIG. 1. Unlike prior coding standards, the CTU can be as large as 128×128 pixels. Each CTU can be partitioned into smaller square blocks called coding units (CUs). FIG. 2 shows an example of a CTU partitioned into CUs. A CTU 100 is first partitioned into four CUs 102. Each CU 102 may in turn be split into four smaller CUs 102, each a quarter of the size of the CU 102 being split. This partitioning process can be repeated subject to certain criteria; for example, a limit may be imposed on the number of times a CU can be partitioned. As shown, CUs 102-1, 102-3, and 102-4 are a quarter of the size of CTU 100. Further, a CU 102-2 has been split into four CUs 102-5, 102-6, 102-7, and 102-8.
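The recursive quartering described above can be sketched as a small quadtree routine. This is only an illustration under stated assumptions, not the partitioning logic of any particular encoder: the names `split_ctu`, `should_split`, and `fig2_rule` are hypothetical, the 64×64 CTU size is chosen for readability, and placing CU 102-2 in the top-right quadrant is a guess, since the position is not stated in the text.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block, in pixels


def split_ctu(
    x: int,
    y: int,
    size: int,
    depth: int,
    max_depth: int,
    should_split: Callable[[int, int, int, int], bool],
) -> List[Block]:
    """Return the leaf CUs of a quadtree rooted at the block (x, y, size).

    max_depth models the limit on how many times a CU may be split;
    should_split stands in for whatever criterion an encoder applies.
    """
    if depth >= max_depth or not should_split(x, y, size, depth):
        return [(x, y, size)]
    half = size // 2
    leaves: List[Block] = []
    # Visit the four quadrants in raster order: top-left, top-right,
    # bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(
                split_ctu(x + dx, y + dy, half, depth + 1, max_depth, should_split)
            )
    return leaves


def fig2_rule(x: int, y: int, size: int, depth: int) -> bool:
    """Hypothetical rule mimicking the FIG. 2 layout: split the CTU once,
    then split only one of the resulting CUs (assumed here to be top-right)."""
    if depth == 0:
        return True
    return depth == 1 and (x, y) == (32, 0)


leaves = split_ctu(0, 0, 64, 0, 2, fig2_rule)
```

With these assumptions, `leaves` holds three CUs of size 32 and four CUs of size 16, matching the seven leaf CUs (102-1, 102-3, 102-4, and 102-5 through 102-8) in the example.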
Each CU 102 may include one or more blocks, which may be referred to as prediction units (PUs). FIG. 3 shows an example of a CU partitioned into PUs. The PUs may be used to perform spatial prediction or temporal prediction. A CU can be either spatially or temporally predictively coded. If a CU is coded in intra mode, each PU of the CU can have its own spatial prediction direction. If a CU is coded in inter mode, each PU of the CU can have its own motion vector(s) and associated reference picture(s).
In HEVC, motion vectors (MVs) are predictively coded in a temporal prediction process. For a current PU having one current motion vector and an associated reference index, a motion vector predictor (MVP) is derived from motion vectors of a group of candidates including spatially neighboring or temporally collocated PUs of the current PU. The difference between the current motion vector and the MVP is then determined and coded. This reduces overhead because only the difference is sent instead of the full current motion vector. Also, when in merge mode, a single motion vector may be applied to a group of spatially neighboring or temporally collocated PUs. Using the same motion vector for a group of PUs also saves overhead. However, the encoder still needs to encode information to indicate to the decoder which temporally collocated PU was selected.
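The difference coding of motion vectors can be illustrated with a short sketch. It is not the HEVC candidate derivation: the function names `encode_mv` and `decode_mv` are hypothetical, the candidate list is supplied by hand, and choosing the predictor by smallest sum of absolute differences is an assumption made only to keep the example concrete.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) motion vector components


def encode_mv(current: MV, candidates: List[MV]) -> Tuple[int, MV]:
    """Pick an MVP from the candidates and return (candidate index, MV difference).

    Assumption: the predictor is the candidate with the smallest sum of
    absolute component differences from the current motion vector.
    """
    def cost(c: MV) -> int:
        return abs(current[0] - c[0]) + abs(current[1] - c[1])

    idx = min(range(len(candidates)), key=lambda i: cost(candidates[i]))
    mvp = candidates[idx]
    mvd = (current[0] - mvp[0], current[1] - mvp[1])
    return idx, mvd


def decode_mv(idx: int, mvd: MV, candidates: List[MV]) -> MV:
    """Rebuild the motion vector from the signaled index and difference."""
    mvp = candidates[idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Hypothetical candidates from spatially neighboring and temporally
# collocated PUs of the current PU.
candidates = [(4, -2), (10, 3), (0, 0)]
current = (9, 4)

idx, mvd = encode_mv(current, candidates)
reconstructed = decode_mv(idx, mvd, candidates)
```

Here the encoder signals index 1 and the difference (-1, 1) rather than the full vector (9, 4), and the decoder recovers the original vector from the same candidate list. The index is the kind of side information the encoder still has to send so that the decoder knows which candidate was selected.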