1. Field
The embodiments described herein relate generally to digital video compression and, more particularly, to a method and apparatus for Encoder Assisted-Frame Rate Up Conversion (EA-FRUC) for video compression.
2. Background
Video formats supporting various frame rates exist today. The following formats are currently the most prevalent, listed in order by their supported frames per second (fps): 24 (film native), 25 (PAL), 30 (typically interlaced video), and 60 (High Definition (HD) e.g. 720p). Although these frame rates are suitable for most applications, to reach the low bandwidth required for mobile handset video communications, frame rates are sometimes dropped to rates as low as 15, 10, 7.5, or 3 fps. Although these low rates allow low end devices with lower computational capabilities to display some video, the resulting video quality suffers from “jerkiness” (i.e., having a slide show effect), rather than being smooth in motion. Also, the frames dropped often do not correctly track the amount of motion in the video. For example, fewer frames should be dropped during “high motion” video content portions such as those occurring in sporting events, while more frames may be dropped during “low-motion” video content segments such as those occurring in talk shows. Video compression needs to be content dependent, and it would be desirable to be able to analyze and incorporate motion and texture characteristics in the sequence to be coded so as to improve video compression efficiency.
Frame Rate Up Conversion (FRUC) is a process of using video interpolation at the video decoder to increase the frame rate of the reconstructed video. In FRUC, interpolated frames are created using received frames as references. Current systems implementing FRUC frame interpolation include approaches based on motion compensated interpolation and the processing of transmitted motion vectors. FRUC is also used in converting between various video formats. For example, Telecine and Inverse Telecine applications use a film-to-videotape transfer technique that rectifies the respective color frame rate differences between film and video; in Telecine, progressive video (24 frames/second) is converted to NTSC interlaced video (29.97 frames/second).
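As a rough illustration of the Telecine case, the following Python sketch applies the common 3:2 field cadence, in which film frames alternately contribute two and three fields to the interlaced output. The function name `pulldown_3_2` and the use of letters to stand in for frames are assumptions made for this example only. The cadence yields a nominal 30 frames/second from 24; the NTSC rate of 29.97 frames/second additionally involves a 1000/1001 slow-down that this sketch does not model.

```python
def pulldown_3_2(film_frames):
    """Map progressive film frames to interlaced frames with a 3:2 field cadence.

    Each output frame is a (first_field_source, second_field_source) pair
    naming the film frame that supplies each field.
    """
    fields = []
    for i, frame in enumerate(film_frames):
        # Alternate between emitting 2 and 3 fields per film frame.
        repeat = 2 if i % 2 == 0 else 3
        fields.extend([frame] * repeat)
    # Pair consecutive fields into interlaced frames; a trailing unpaired
    # field (possible for odd-length inputs) is dropped.
    return [(fields[j], fields[j + 1]) for j in range(0, len(fields) - 1, 2)]


# Four film frames become five interlaced frames.
video = pulldown_3_2(["A", "B", "C", "D"])
print(video)
```

Every four film frames produce five video frames, two of which mix fields from different film frames. Inverse Telecine has to detect and undo exactly this mixing.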
Another FRUC approach uses weighted-adaptive motion compensated interpolation (WAMCI) to reduce the block artifacts caused by the deficiencies of motion estimation and block-based processing. This approach is based on an interpolation by the weighted sum of multiple motion compensated interpolation (MCI) images. The block artifacts on the block boundaries are also reduced in this method by applying a technique similar to overlapped block motion compensation (OBMC). Specifically, to reduce blurring during the processing of overlapped areas, the method uses motion analysis to determine the type of block motion and applies OBMC adaptively. Experimental results indicate that this approach achieves improved results, with significantly reduced block artifacts.
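The weighted-sum step can be sketched as follows. This is a minimal Python illustration, not the WAMCI method itself: the function name `weighted_mci`, the representation of images as nested lists of luma values, and the use of a single weight per candidate image are all assumptions, and the adaptive weight selection and OBMC-like boundary handling are not modeled.

```python
def weighted_mci(candidates, weights):
    """Blend several motion compensated interpolation (MCI) images pixel by pixel.

    candidates: list of equally sized 2-D lists of pixel values.
    weights: one non-negative weight per candidate (hypothetical, fixed per image).
    """
    if not candidates or len(candidates) != len(weights):
        raise ValueError("need one weight per candidate image")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    height, width = len(candidates[0]), len(candidates[0][0])
    blended = []
    for y in range(height):
        row = []
        for x in range(width):
            # Normalized weighted sum of the co-located pixels.
            acc = sum(w * img[y][x] for img, w in zip(candidates, weights))
            row.append(acc / total)
        blended.append(row)
    return blended


mci_a = [[100, 100], [100, 100]]
mci_b = [[200, 200], [200, 200]]
print(weighted_mci([mci_a, mci_b], [3, 1]))
```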
Yet another FRUC approach uses vector reliability analysis to reduce artifacts caused by the use of any motion vectors that are inaccurately transmitted from the encoder. In this approach, motion estimation is used to construct motion vectors that are compared to transmitted motion vectors so as to determine the most desired approach for frame interpolation. In conventional up-conversion algorithms using motion estimation, the estimation process is performed using two adjacent decoded frames to construct the motion vectors that will allow a frame to be interpolated. However, these algorithms attempt to improve utilization of transmission bandwidth without regard for the amount of calculation required for the motion estimation operation. In comparison, in up-conversion algorithms using transmitted motion vectors, the quality of the interpolated frames depends largely on the motion vectors that are derived by the encoder. Using a combination of the two approaches, the transmitted motion vectors are first analyzed to decide whether they are usable for constructing interpolation frames. The method used for interpolation is then adaptively selected from three methods: local motion-compensated interpolation, global motion-compensated interpolation and frame-repeated interpolation.
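One way such an adaptive selection could look is sketched below in Python. The decision rule is purely hypothetical: the text above does not state how reliability is measured or how the three methods are chosen, so the agreement test, the `tolerance` parameter, and the `local_min` and `global_min` thresholds are all assumptions introduced for illustration.

```python
def select_interpolation(transmitted, estimated, tolerance=1,
                         local_min=0.8, global_min=0.4):
    """Pick an interpolation method from per-block motion vector agreement.

    transmitted, estimated: lists of (dx, dy) vectors, one per block.
    A transmitted vector counts as reliable when its L1 distance to the
    decoder-estimated vector is at most `tolerance` (hypothetical rule).
    """
    if not transmitted or len(transmitted) != len(estimated):
        raise ValueError("need matching, non-empty vector lists")
    reliable = sum(
        1
        for (tx, ty), (ex, ey) in zip(transmitted, estimated)
        if abs(tx - ex) + abs(ty - ey) <= tolerance
    )
    ratio = reliable / len(transmitted)
    if ratio >= local_min:
        return "local_mci"      # most vectors usable block by block
    if ratio >= global_min:
        return "global_mci"     # fall back to a single global motion model
    return "frame_repeat"       # vectors too unreliable to interpolate


tx = [(1, 0), (2, 1), (0, 0), (5, 5)]
est = [(1, 0), (2, 2), (0, 0), (-3, 4)]
print(select_interpolation(tx, est))
```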
FRUC techniques are generally implemented as post-processing functions in the video decoder; thus, the video encoder is typically not involved in this operation. However, in an approach referred to as encoder-assisted FRUC (EA-FRUC), the encoder can determine whether transmission of certain information related to motion vectors or reference frames (e.g., residual data) may be eliminated while still allowing the decoder to autonomously regenerate major portions of frames without the eliminated vector or residual data. For example, a bidirectional predictive video coding method has been introduced as an improvement to B-frame coding in MPEG-2. In this method, the use of an error criterion is proposed to enable the application of true motion vectors in motion-compensated predictive coding. The distortion measure is based on the sum of absolute differences (SAD), but this distortion measure is known to be insufficient in providing a true distortion measure, particularly where the amount of motion between two frames in a sequence needs to be quantified. Additionally, the classifications rely on fixed thresholds when, optimally, these thresholds should be variable, as the classifications are preferably content dependent.
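For reference, SAD between two equally sized blocks can be computed as in the following Python sketch. The function name `sad` and the sample blocks are assumptions for this example. The last two cases suggest one reason SAD alone may not quantify motion: a flat region yields a SAD of zero however far it moves, while a fine texture shifted by a single pixel yields a large SAD.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D pixel blocks."""
    if len(block_a) != len(block_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(block_a, block_b)
    ):
        raise ValueError("blocks must have the same dimensions")
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


current = [[10, 12], [14, 16]]
reference = [[11, 12], [10, 20]]
print(sad(current, reference))  # 1 + 0 + 4 + 4 = 9

# A flat region matches itself at any displacement.
flat = [[50, 50], [50, 50]]
print(sad(flat, flat))  # 0

# A checkerboard shifted by one pixel differs at every position.
checker = [[0, 255], [255, 0]]
shifted = [[255, 0], [0, 255]]
print(sad(checker, shifted))  # 4 * 255 = 1020
```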
EA-FRUC is a growing field of study, with increased interest in this area of video compression, particularly for low bit-rate applications like streaming video and video telephony, and especially in scenarios where the sender is at a network node capable of supporting high-complexity applications and the receiver is a handheld device with power and complexity constraints. EA-FRUC also finds application in open systems, where the decoder conforms to any standard or popular video coding technology, and in closed systems, where proprietary decoding techniques can be adopted.
What is desirable is an approach that provides high quality interpolated frames at the decoder while decreasing both the bandwidth needed to transmit the information required for the interpolation and the volume of calculation needed to create these frames, so as to make it well suited to multimedia mobile devices that depend on low-power processing.
Accordingly, there is a need to overcome the issues noted above.