The prevalence of ever more portable and higher resolution cameras has led to a growing desire to send and receive video. At the same time, portable and connected media players are a part of many devices, including smart phones and watches. Home viewers and mobile users are being provided with high quality video on demand, but this consumes a significant part of the transmission capacity of wireless and wired data services. As a result, there is pressure for more efficient and higher quality video compression and decompression techniques.
Newer coding standards allow perceptually better video to be transmitted using less data. Earlier MPEG (Moving Picture Experts Group) standards are being replaced with newer techniques such as H.265/HEVC (High Efficiency Video Coding) from the Joint Collaborative Team on Video Coding (JCT-VC), VP9 of Google/YouTube, and AV1 (AOMedia Video 1) of AOMedia (Alliance for Open Media). These and other codecs present different characteristics for different markets but also use many of the same principles.
The VP9 standard, among others, allows multiple different reference frames to be used when coding a particular frame. Frames are encoded in blocks that represent groups of pixels. In order to reduce the information required for a block, the block may be described with respect to how it differs from a nearby block of the same frame (spatial) or with respect to how it differs from a corresponding block in a previous frame (temporal). In some encoders, the description of this difference is referred to as a motion vector. The motion vector expresses how a block has changed or moved from the reference block. Typically, the motion vector is a two-dimensional vector used for inter prediction that relates the current frame to the reference frame. The motion vector value provides the coordinate offsets from a location in the current frame to a location in the reference frame. With any motion vector there is an express or implied reference index for each block to indicate which previous reference frame from a reference frame list is to be used for predicting the current block.
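The relationship between a motion vector, a reference index, and the predicted block can be illustrated with a minimal sketch. It assumes whole-pixel offsets, square blocks, frames stored as lists of rows, and edge clamping for offsets that point outside the reference frame; real codecs such as VP9 use sub-pixel precision and interpolation filters. The names `MotionVector`, `predict_block`, and `residual` are hypothetical and are not taken from any codec specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionVector:
    dx: int       # horizontal offset in whole pixels (assumption: no sub-pixel precision)
    dy: int       # vertical offset in whole pixels
    ref_idx: int  # index into the reference frame list


def predict_block(ref_frames, mv, x, y, size):
    """Fetch the size x size block that the motion vector points to.

    (x, y) is the top-left corner of the current block. Coordinates that fall
    outside the reference frame are clamped to the nearest edge pixel.
    """
    ref = ref_frames[mv.ref_idx]
    height, width = len(ref), len(ref[0])
    block = []
    for row in range(size):
        ry = min(max(y + mv.dy + row, 0), height - 1)
        line = []
        for col in range(size):
            rx = min(max(x + mv.dx + col, 0), width - 1)
            line.append(ref[ry][rx])
        block.append(line)
    return block


def residual(current_block, predicted_block):
    """Per-pixel difference left over after motion-compensated prediction."""
    return [
        [c - p for c, p in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current_block, predicted_block)
    ]


# A 4x4 reference frame whose pixel value encodes its position (row * 4 + col).
reference = [[r * 4 + c for c in range(4)] for r in range(4)]
mv = MotionVector(dx=1, dy=1, ref_idx=0)
print(predict_block([reference], mv, x=0, y=0, size=2))  # [[5, 6], [9, 10]]
```

In this sketch only the motion vector, the reference index, and the residual would need to be transmitted for the block, rather than its full pixel data.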
Temporal scalability is another characteristic of many video codecs, such as VP9 or HEVC. Each frame is assigned to a particular temporal layer and cannot refer to a reference frame from a higher temporal layer than the one to which it is assigned. The restriction against using frames from higher temporal layers allows higher temporal layer frames to be discarded by a decoder or a Media Aware Network Element (MANE) without impacting the decoding of lower temporal layers. Discarding frames reduces the network bandwidth required to deliver the bitstream, the computational resources required to decode the frames in those layers, and the buffer space required to hold frames. Extra frames can require significant computational and memory resources for small devices or for high resolution video.
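The temporal layer restriction and its effect on frame dropping can be sketched as follows. This is an illustrative model only: the `Frame` record, the function names, and the example layer assignment are assumptions made for the sketch, not structures defined by VP9 or HEVC.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    index: int                  # position of the frame in the stream
    temporal_layer: int         # 0 is the base layer
    ref_indices: list = field(default_factory=list)  # frames this frame predicts from


def references_are_valid(frames):
    """True if no frame refers to a frame in a higher temporal layer."""
    by_index = {f.index: f for f in frames}
    return all(
        by_index[r].temporal_layer <= f.temporal_layer
        for f in frames
        for r in f.ref_indices
    )


def drop_above(frames, max_layer):
    """Discard every frame above max_layer, as a decoder or MANE might."""
    return [f for f in frames if f.temporal_layer <= max_layer]


def is_decodable(frames):
    """True if every reference used by the remaining frames is still present."""
    present = {f.index for f in frames}
    return all(r in present for f in frames for r in f.ref_indices)


# Hypothetical three-layer pattern: layer 0 at frames 0 and 4, layer 1 at
# frame 2, layer 2 at frames 1 and 3.
stream = [
    Frame(0, 0),
    Frame(1, 2, [0]),
    Frame(2, 1, [0]),
    Frame(3, 2, [2]),
    Frame(4, 0, [0]),
]
kept = drop_above(stream, max_layer=1)
print([f.index for f in kept], is_decodable(kept))  # [0, 2, 4] True
```

Because the stream satisfies the layer restriction, removing the top layer leaves a sequence in which every remaining frame still has all of its references.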
Temporal motion vector prediction is a technique used in a decoder to predict a motion vector (MV) in a current frame by using data about the corresponding motion vector from the previous frame. Motion vectors are used extensively in many video codecs including AV1, VP9, and HEVC. When coding the motion vector for a particular block, the motion vector can be predicted from a coded MV in a previously coded frame. On the decoder, however, the referenced previous frame must be available so that the decoder can find the previously coded MV and use it to decode the current frame.
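A minimal sketch of this dependency is shown below. It assumes that the predictor is simply the motion vector of the co-located block in the previous frame, that motion vectors are `(dx, dy)` tuples keyed by block position, and that a block with no stored vector predicts from `(0, 0)`; actual codecs build candidate lists and scale vectors in ways not modeled here. All function names are hypothetical.

```python
def temporal_predictor(prev_frame_mvs, block_pos):
    """Return the co-located motion vector from the previous frame.

    prev_frame_mvs maps block positions to (dx, dy) tuples, or is None when
    the previous frame is unavailable (for example, because it was dropped).
    Falling back to (0, 0) for a block without a stored vector is an
    assumption of this sketch.
    """
    if prev_frame_mvs is None:
        raise LookupError("previous frame motion vectors are not available")
    return prev_frame_mvs.get(block_pos, (0, 0))


def encode_mv(mv, predictor):
    """Encoder side: code only the difference from the predicted vector."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


def decode_mv(delta, predictor):
    """Decoder side: rebuild the vector from the same predictor."""
    return (delta[0] + predictor[0], delta[1] + predictor[1])


prev_mvs = {(0, 0): (4, -2)}   # motion vectors stored for the previous frame
current_mv = (5, -2)           # actual motion of the current block

pred = temporal_predictor(prev_mvs, (0, 0))
delta = encode_mv(current_mv, pred)
print(delta, decode_mv(delta, pred))  # (1, 0) (5, -2)
```

The coded difference is small when motion is consistent between frames, but the decoder can only invert it if it derives the same predictor as the encoder, which is why the previous frame's motion vectors must still be available.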