High Efficiency Video Coding (HEVC) is a video compression standard. In HEVC, the basic processing unit is called a coding tree unit (CTU), and it can be as large as 64×64 luma samples. A CTU can be split into multiple coding units (CUs) in a quad-tree fashion; these CUs can have sizes varying from 8×8 to 64×64. Each CU can be coded either as an intra-picture prediction (INTRA) CU or as an inter-picture prediction (INTER) CU. Thus, the CU is the basic unit at which the prediction mode is decided.
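The quad-tree splitting described above can be sketched as a short recursive routine. This is only an illustration: the function name `split_ctu` and the `should_split` callback are hypothetical, and a real encoder would base the split decision on a rate-distortion cost rather than on an arbitrary predicate.

```python
def split_ctu(x, y, size, should_split, min_size=8):
    """Return the leaf CUs of a quad-tree as (x, y, size) tuples.

    should_split is a hypothetical callback (x, y, size) -> bool that stands
    in for the encoder's split decision.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for oy in (0, half):
            for ox in (0, half):
                leaves += split_ctu(x + ox, y + oy, half, should_split, min_size)
        return leaves
    # Either the minimum CU size is reached or the encoder chose not to split.
    return [(x, y, size)]


# Example: keep splitting only the quadrant that touches the CTU origin.
cus = split_ctu(0, 0, 64, lambda x, y, size: x == 0 and y == 0)
```

In this example the CTU ends up with three 32×32 CUs, three 16×16 CUs and four 8×8 CUs, which together still cover all 64×64 samples.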
When a CU is determined to be INTER, it can either stay as a single unit or be sub-divided into prediction units (PUs). There are several PU division types: the CU can be divided horizontally into two rectangular PUs, divided vertically into two rectangular PUs, or divided into four equal-sized square PUs.
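As a rough illustration of these division types, the sketch below lists the PU rectangles for a square CU as (x, y, width, height) tuples relative to the CU origin. The function name and the dictionary keys are hypothetical labels chosen for this example; which of the two rectangular layouts corresponds to "horizontal" and which to "vertical" division depends on the naming convention in use, so the keys describe the geometry instead.

```python
def pu_partitions(cu_size):
    """Return the PU layouts of a cu_size x cu_size CU.

    Each PU is an (x, y, width, height) tuple relative to the CU origin.
    The key names are illustrative, not taken from any standard.
    """
    n = cu_size // 2
    return {
        # The CU stays as a single PU.
        "single": [(0, 0, cu_size, cu_size)],
        # Two rectangular PUs, one above the other.
        "two_stacked": [(0, 0, cu_size, n), (0, n, cu_size, n)],
        # Two rectangular PUs, side by side.
        "two_side_by_side": [(0, 0, n, cu_size), (n, 0, n, cu_size)],
        # Four equal-sized square PUs.
        "four_square": [(ox, oy, n, n) for oy in (0, n) for ox in (0, n)],
    }


layouts = pu_partitions(32)
```

Every layout covers the same CU area; only the number and shape of the blocks that each need their own motion vector changes.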
A motion vector is associated with each PU. The motion vector for each PU is usually determined using a motion estimation technique. Hierarchical Motion Estimation is a motion estimation (ME) algorithm that uses a block matching approach to find a good motion vector for each block (PU) at lower complexity than a Full Search (FS) algorithm, which evaluates every candidate by searching all the points in the search area. Generally, fast algorithms with reduced computational complexity show degraded performance compared to an FS. Thus, there is a need for a fast algorithm that reduces computational complexity while still achieving high compression efficiency and maintaining sufficient quality.
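For reference, a Full Search over a square search area can be sketched as follows. This is a minimal illustration under several assumptions not stated in the text: the matching cost is the sum of absolute differences (SAD), frames are 2-D NumPy arrays of luma samples, candidates that fall outside the reference frame are skipped, and the names `sad` and `full_search` are hypothetical.

```python
import numpy as np


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())


def full_search(cur, ref, x, y, size, search_range):
    """Test every integer displacement within +/- search_range.

    Returns ((dx, dy), cost) for the best match of the size x size block
    whose top-left corner is (x, y) in the current frame.
    """
    block = cur[y:y + size, x:x + size]
    height, width = ref.shape
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates that are not fully inside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(block, ref[ry:ry + size, rx:rx + size])
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The number of SAD evaluations grows with the square of the search range, which is the cost that fast algorithms such as Hierarchical ME try to avoid.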
The main idea of Hierarchical Motion Estimation is to start the motion estimation search at a lower image resolution than the target resolution and then to increase the accuracy of the motion vector in successive steps at progressively higher resolutions. The lower-resolution image represents the main characteristics of the image under analysis.
To better understand Hierarchical ME, consider an example of searching for a motion vector for a 32×32 block. For a block size of 32×32, the initial motion estimation is performed after down-sampling the original 32×32 image block by a factor of 4. This creates an 8×8 block, which is searched for in the down-sampled reference frame. Because of the 4× down-sampling, the resulting motion vector has four-pixel accuracy.
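The relationship between the down-sampled search and the full-resolution motion vector can be sketched as below. The text does not say which down-sampling filter is used, so block averaging is assumed here purely for illustration, and the names `downsample` and `scale_mv` are hypothetical.

```python
import numpy as np


def downsample(img, factor):
    """Down-sample a 2-D image by averaging factor x factor blocks.

    Block averaging is an assumption; other low-pass filters could be used.
    """
    height, width = img.shape
    height -= height % factor  # drop rows that do not fill a whole block
    width -= width % factor    # drop columns that do not fill a whole block
    trimmed = img[:height, :width]
    blocks = trimmed.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))


def scale_mv(mv, factor):
    """Map a motion vector found at reduced resolution to full resolution."""
    return (mv[0] * factor, mv[1] * factor)


block = np.zeros((32, 32))
coarse_block = downsample(block, 4)    # 8x8 block used for the coarse search
full_res_mv = scale_mv((3, 2), 4)      # coarse (3, 2) maps to (12, 8)
```

A displacement of one sample in the down-sampled frame corresponds to four samples in the original frame, which is why the coarse search can only resolve motion in four-pixel steps.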
Assume that the motion vector from this step is found to be (12, 8), where 12 is the horizontal displacement and 8 is the vertical displacement. After this step, the 4-pixel-accurate motion vector can undergo a refinement search step in which candidate 2-pixel-accurate motion vectors around the initial 4-pixel-accurate motion vector are evaluated. More specifically, the following motion vectors are searched in the above example: (10,6), (12,6), (14,6), (10,8), (12,8), (14,8), (10,10), (12,10), (14,10). At this refinement step, the resolution of the block being searched is increased to 16×16; this block is obtained by applying a 2× down-sampling process to the original 32×32 image block. Similarly, the accuracy is then increased to full-pixel, half-pixel and quarter-pixel, and the final motion vector for the block is found with quarter-pixel accuracy.
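The refinement pattern in this example can be sketched in a few lines. The candidate generator reproduces the nine motion vectors listed above; the `refine` loop is a simplified illustration that assumes a caller-supplied `cost` function (hypothetical) which would, in a real encoder, evaluate the block match at the resolution or interpolation level appropriate to each step.

```python
def refinement_candidates(mv, step):
    """Return the 3x3 grid of candidates spaced `step` apart around mv."""
    mx, my = mv
    return [(mx + dx, my + dy)
            for dy in (-step, 0, step)
            for dx in (-step, 0, step)]


def refine(mv, steps, cost):
    """Refine mv by picking the cheapest candidate at each step size.

    cost is a hypothetical callable mapping a motion vector to a matching cost.
    """
    for step in steps:
        mv = min(refinement_candidates(mv, step), key=cost)
    return mv


# The nine 2-pixel-accurate candidates around the 4-pixel-accurate result.
candidates = refinement_candidates((12, 8), 2)

# Toy cost for demonstration only: distance to an assumed true motion (13.25, 7.5).
toy_cost = lambda mv: abs(mv[0] - 13.25) + abs(mv[1] - 7.5)
final_mv = refine((12, 8), [2, 1, 0.5, 0.25], toy_cost)
```

Each step evaluates only nine candidates, so going from four-pixel to quarter-pixel accuracy takes four small searches instead of one exhaustive search at quarter-pixel resolution.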
In order to determine motion information for each PU within the CTU, the motion estimation process needs to be done for different block sizes. If motion estimation is done exhaustively for all block sizes, the additional complexity becomes very significant.
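A back-of-the-envelope count gives a feel for this cost. The sketch below is only an estimate under stated assumptions: it considers square blocks from 64×64 down to 8×8, an exhaustive integer-pixel search with an assumed range of ±16 samples, and one sample comparison per pixel per candidate. The function name and parameters are hypothetical.

```python
def full_search_sample_ops(ctu_size=64, min_block=8, search_range=16):
    """Estimate sample comparisons per CTU for each square block size.

    Returns a dict mapping block size to the number of per-sample
    comparisons needed to full-search every block of that size in one CTU.
    """
    candidates = (2 * search_range + 1) ** 2  # displacements tested per block
    ops = {}
    size = ctu_size
    while size >= min_block:
        blocks = (ctu_size // size) ** 2      # blocks of this size in the CTU
        ops[size] = blocks * candidates * size * size
        size //= 2
    return ops


per_size = full_search_sample_ops()
total = sum(per_size.values())
```

Because the blocks of each size tile the whole CTU, every block size costs the same amount under these assumptions, so the total grows in proportion to the number of block sizes searched. Rectangular PU shapes, which this estimate ignores, would add further searches on top.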
Therefore, novel methods need to be developed where motion information for different block sizes of the CTU is determined with high coding efficiency but with low complexity.