Video encoders often apply motion-compensated prediction to reduce the amount of data to encode by exploiting the temporal correlation between successive video frames. Motion-compensated prediction describes a current video frame in terms of a transformation of a reference video frame. It relies on the fact that the only difference between one video frame and another is often the result of camera motion or of an object moving within the frame. Consequently, much of the information that represents one frame is the same as the information that represents the next frame.
Motion-compensated prediction consists of finding, for each block in the current frame, the “best possible” match within a reference frame. However, searching the entire reference frame is prohibitively expensive in terms of computational complexity and memory bandwidth. Accordingly, practical software and hardware video encoders search only a selected area, i.e., a search range, that lies within the reference frame around a predicted motion vector computed from previously encoded blocks.
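The search described above can be illustrated with a minimal sketch. It assumes an exhaustive integer-pel search, a sum-of-absolute-differences (SAD) matching cost, and frames stored as lists of rows. None of these choices is stated in the text, and the names `sad`, `search_block`, `pred_mv`, and `search_range` are hypothetical; practical encoders typically use faster search patterns and rate-aware costs.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def search_block(cur, ref, bx, by, n, pred_mv, search_range):
    """Return (best_mv, best_cost) for the current block at (bx, by).

    Candidates are all integer displacements within +/- search_range of
    the predicted motion vector pred_mv = (dx, dy). Candidates whose
    reference block would fall outside the reference frame are skipped.
    (Assumption: SAD is the matching criterion.)
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(pred_mv[1] - search_range, pred_mv[1] + search_range + 1):
        for dx in range(pred_mv[0] - search_range, pred_mv[0] + search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # reference block not fully inside the frame
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The cost of this sketch grows with the square of `search_range`, which is why the range is kept small and centered on a predicted motion vector rather than covering the whole reference frame.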
Moreover, in many cases, the encoder does not contain enough memory to store the entire reference frame. Accordingly, in practice, a video encoder typically stores only a subset of the reference frame, i.e., a search window. This search window is typically centered on the correspondingly positioned block in the reference frame, i.e., the collocated block. The predicted motion vector is then restricted to stay inside this search window. The searched area is the overlapping region between the search window and the search range.
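The overlap between the search window and the search range can be sketched as a rectangle intersection. The sketch below works in displacement coordinates relative to the collocated block and assumes a rectangular window of half-size `window_half = (W, H)` and a square search range of half-size `range_half`; these parameters and the names `Rect`, `clamp`, and `searched_area` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Rectangle of candidate displacements, bounds inclusive."""
    x0: int
    y0: int
    x1: int
    y1: int


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(high, value))


def searched_area(pred_mv: Tuple[int, int],
                  window_half: Tuple[int, int],
                  range_half: int) -> Rect:
    """Displacements that lie in both the search window and the search range.

    The window is centered on the collocated block (displacement (0, 0)).
    The predicted motion vector is first restricted to the window, and
    the search range is then centered on the restricted vector.
    """
    w, h = window_half
    px = clamp(pred_mv[0], -w, w)
    py = clamp(pred_mv[1], -h, h)
    # Because (px, py) lies inside the window, the intersection is never empty.
    return Rect(
        max(-w, px - range_half),
        max(-h, py - range_half),
        min(w, px + range_half),
        min(h, py + range_half),
    )
```

For instance, with a window of half-size (16, 8), a search range of half-size 4, and a predicted motion vector of (14, -20), the vector is restricted to (14, -8) and only part of the search range around it remains searchable.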
The larger the search window, the more expensive the video encoding process becomes in terms of computational complexity and memory bandwidth. Specifically, the larger the search window, the larger the required memory footprint, and the higher the cost of a hardware implementation of a video encoder.
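As a rough illustration of how the memory footprint scales, the following sketch counts the samples needed to hold a search window for a single block. It assumes a square block of `block_size` samples, a window that allows displacements of up to `half_w` horizontally and `half_h` vertically, and one byte per sample for a single color plane; the function name and all parameters are hypothetical, and real encoders may buffer the window differently.

```python
def window_bytes(block_size: int, half_w: int, half_h: int,
                 bytes_per_sample: int = 1) -> int:
    """Bytes needed to store every sample a block of block_size x block_size
    can reach with displacements in [-half_w, half_w] x [-half_h, half_h].
    (Assumption: one plane only, no padding or alignment overhead.)"""
    width = block_size + 2 * half_w
    height = block_size + 2 * half_h
    return width * height * bytes_per_sample


small = window_bytes(16, 64, 32)    # 144 x 80 samples
large = window_bytes(16, 128, 64)   # 272 x 144 samples
```

Under these assumptions, doubling both half-sizes more than triples the storage, from 11520 to 39168 bytes.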
On the other hand, for fast-motion frames, a smaller search window may cause the motion-compensated prediction procedure to fail to capture the motion efficiently, because the object is very likely to move outside the search window. In practice, this would result in encoding the current block as an intra-predicted block or as an inter-predicted block with high-energy residuals. In both cases, the Rate-Distortion (R-D) performance of the encoder will be severely affected. As a result, a higher bit-rate would be required to encode the video frames.