The present invention relates to digital video processing, and in particular to frame rate conversion.
In a number of video applications, it is necessary to change the frame rate of a digital video sequence. This requires some form of interpolation in time between successive frames of the sequence. A standard way to perform frame rate conversion (FRC) includes detecting a structure of the video in the form of local motion vectors or sets of local directions of regularity in image contexts. Depending on the local structure that has been detected, the frame rate converter computes interpolated pixels.
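As an illustration only, the interpolation step for one pixel can be sketched as follows, assuming a local motion vector has already been detected for that pixel. The function name `interpolate_pixel`, the frame representation as nested lists of grey levels, the border clamping and the rounding to the nearest pixel are all assumptions made for this sketch and are not taken from any particular frame rate converter.

```python
def interpolate_pixel(prev_frame, next_frame, x, y, v, alpha):
    """Motion-compensated interpolation of pixel (x, y) at time t + alpha.

    prev_frame, next_frame: 2D lists of grey levels at times t and t + 1.
    v: (vx, vy) displacement in pixels per frame interval (assumed known).
    alpha: temporal position of the interpolated frame, between 0 and 1.
    """
    vx, vy = v
    height, width = len(prev_frame), len(prev_frame[0])

    def fetch(frame, px, py):
        # Clamp coordinates to the frame borders (hypothetical border policy).
        px = min(max(px, 0), width - 1)
        py = min(max(py, 0), height - 1)
        return frame[py][px]

    # Follow the motion vector backward to time t and forward to time t + 1.
    # Rounding to the nearest pixel is a simplification; a real converter
    # would typically use subpixel filtering.
    x0, y0 = round(x - alpha * vx), round(y - alpha * vy)
    x1, y1 = round(x + (1 - alpha) * vx), round(y + (1 - alpha) * vy)

    # Blend the two motion-compensated samples according to temporal distance.
    return (1 - alpha) * fetch(prev_frame, x0, y0) + alpha * fetch(next_frame, x1, y1)
```

With a zero motion vector the sketch reduces to a plain temporal blend of the two input frames, which shows why the detected local structure matters: a moving object is then rendered as two half-intensity copies rather than at its intermediate position.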
A multiscale hierarchical motion estimation method is disclosed in “Hierarchical Model-Based Motion Estimation”, J. R. Bergen, et al., Proceedings of the 2nd European Conference on Computer Vision, May 1992, pages 237-252. Multiscale differential motion estimation methods are disclosed in “Bayesian Multi-Scale Differential Optical Flow”, E. P. Simoncelli, Handbook of Computer Vision and Applications, Vol. 2, chapter 14, Academic Press, San Diego, April 1999, pages 397-422, and in “Robust computation of optical flow in a multi-scale differential framework”, J. Weber and J. Malik, International Journal of Computer Vision, Vol. 2, 1994, pages 5-19.
All these methods make it possible to perform frame rate conversion based on motion compensation using a multiscale estimation method, and to provide a dense motion map at the final pixel or subpixel resolution. However, the accuracy of the motion estimation is not related to the needs of the interpolation process applied to perform frame rate conversion.
A frame rate converter is commonly implemented in an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA). In such components, the internal memory is normally not large enough to store two full-resolution images. The processing is done in an order prescribed by the input and output interfaces of the FRC circuit, usually raster, striped or tiled. At any given time, the chip holds in memory a context of lines for doing the structure detection, and for computing interpolated pixel values.
A hard limitation affects most prior art FRC systems: supporting a range of vertical speeds or displacements [−Vy, Vy] between consecutive frames requires buffers covering more than 2×Vy+1 lines for each input frame. In addition, the size of the logic required to handle a large range of speeds with good visual quality increases sharply with the range.
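To give an order of magnitude for this buffer constraint, the following sketch evaluates the 2×Vy+1 lower bound for a given vertical speed range. The function name `min_line_buffer`, the two-frame default and the example figures (1920-pixel lines, 2 bytes per pixel, Vy = 32) are hypothetical values chosen for illustration, not parameters of any specific circuit.

```python
def min_line_buffer(vy_max, width, bytes_per_pixel=2, frames=2):
    """Lower bound on the line buffers needed for vertical speeds in [-vy_max, vy_max].

    Returns (lines_per_frame, total_bytes). The actual requirement exceeds
    this bound, since the buffers must cover more than 2 * vy_max + 1 lines.
    """
    lines_per_frame = 2 * vy_max + 1
    total_bytes = lines_per_frame * width * bytes_per_pixel * frames
    return lines_per_frame, total_bytes


# Example with assumed figures: Vy = 32 lines per frame, 1920-pixel lines.
lines, total = min_line_buffer(vy_max=32, width=1920)
print(lines, total)  # 65 lines per frame, 499200 bytes for two frames
```

The bound grows linearly with Vy, so doubling the supported vertical speed range roughly doubles the internal memory devoted to line buffers.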
There is a need for an implementation of FRC processing with a good trade-off between quality of the converted video sequence and (i) cost of a hardware implementation in terms of internal memory and logic size of the component or (ii) complexity of a software implementation. Such need is particularly acute in the case of real-time applications of FRC.