It is frequently necessary to interpolate a new image at some position within a sequence of images that does not align with an existing image in the sequence. A very common example is temporal interpolation. Temporal interpolation is performed whenever a representation of a video frame is required corresponding to a time instant that is not present in the input sequence. Examples of applications of temporal interpolation include: conversion of a video sequence to a new frame or field rate without changing the speed of any motion present in the scene, slow-motion effects where additional frames are created for playout at the original frame rate, or any combination of those processes that does not amount to simply playing out the input frames one by one at a different rate.
Another application of interpolation within an image sequence is where the sequence comprises views of a common scene from different positions, and an interpolated image is created that represents the view of the scene from a position between two of the existing viewpoints. In this specification temporal interpolation will be described; however, the skilled person will appreciate that the invention is equally applicable to ordered image sequences in general.
In this specification the term ‘position’ will be used to describe the position of an image in an ordered sequence of images. This may be a position in time or a position in whatever dimension defines the relationship between the images of the sequence; for example it could be a sequence of viewpoints along a path, which may or may not have associated time values. To avoid confusion, the term ‘spatial position’ will be used to indicate position within an image.
Examples of known techniques for temporal interpolation will now be described. FIG. 1 shows one-dimensional sections through frames in a sampled video sequence with time running horizontally in the diagram. Frames 101 and 102 are input frames, and frame 103 represents an output frame interpolated at a time instant 60% of the input frame period after the first input frame. In the remainder of this document, the desired output frame time is specified relative to the times of the two adjacent input frames as “display phase” on a scale from 0 to 1. Typically a regular sequence of output images is required, and, as is well known in the art, the display phase of each interpolated output image will differ from the display phase of the preceding interpolated output image by a phase increment that depends on the difference between the temporal sampling rates of the input and output image sequences.
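By way of illustration only, the progression of display phase through a regular output sequence can be sketched as follows. The sketch assumes that the first output frame is aligned with the first input frame and that both rates are integers; under those assumptions the position of each output frame, measured in input frame periods, advances by the ratio of the input rate to the output rate. The function name `display_phases` and its parameters are hypothetical and do not appear in any cited document.

```python
from fractions import Fraction


def display_phases(input_rate, output_rate, count):
    """Return (input_index, display_phase) for the first `count` output frames.

    Assumption: output frame 0 coincides with input frame 0, and both
    rates are integers (frames or fields per second).
    """
    # Advance per output frame, measured in input frame periods.
    increment = Fraction(input_rate, output_rate)
    position = Fraction(0)
    result = []
    for _ in range(count):
        # The integer part selects the earlier of the two adjacent input
        # frames; the fractional part is the display phase in [0, 1).
        index = position.numerator // position.denominator
        result.append((index, position - index))
        position += increment
    return result
```

For a conversion from 50 to 60 images per second, for example, the second output image falls between input images 0 and 1 at a display phase of 5/6, and the third falls between input images 1 and 2 at a display phase of 2/3.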
In this example, the display phase is 0.6. In “non-motion-compensated” interpolation, a particular sample (pixel) (104) of the output frame may be derived from corresponding samples (105) and (106) in the input frames. Suitably, linear interpolation would be used, in which the value of output sample (104) would be equal to the sum of 40% of input sample (105) and 60% of input sample (106).
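The weighting described above can be sketched in a few lines. This is a minimal illustration only, assuming that each frame is represented as a one-dimensional list of sample values of equal length, as in the sections of FIG. 1; the function name `linear_interpolate` is hypothetical.

```python
def linear_interpolate(prev_frame, next_frame, display_phase):
    """Non-motion-compensated interpolation between two input frames.

    Frames are modelled as equal-length lists of sample values
    (an assumption made for illustration).
    """
    if not 0.0 <= display_phase <= 1.0:
        raise ValueError("display_phase must lie between 0 and 1")
    if len(prev_frame) != len(next_frame):
        raise ValueError("input frames must have the same number of samples")
    # The nearer input frame receives the larger weight: at a display
    # phase of 0.6 the earlier frame contributes 40% and the later 60%.
    weight_prev = 1.0 - display_phase
    weight_next = display_phase
    return [weight_prev * a + weight_next * b
            for a, b in zip(prev_frame, next_frame)]
```

With a display phase of 0.6, co-sited input samples of 100 and 200 give an output sample of 160, being 40% of the first plus 60% of the second.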
Linear interpolation as shown in FIG. 1 gives acceptable results unless there is significant movement of detailed objects in the scene; in this case the output frames become blurred, or a double image becomes apparent, as the result of the relatively-displaced contributions from the two input frames.
Motion compensated interpolation, illustrated in FIG. 2, is a well-known way of overcoming those problems. Referring to FIG. 2, pixel (205) in input frame 201 has associated with it a forward motion vector (207). Similarly, pixel (206) in input frame 202 has associated with it a backward motion vector (208). Both input pixels are ‘projected’ onto the interpolated output frame 203 in the direction of their respective motion vectors, and contribute through a weighted sum or other method to the value of pixel (204) in the output frame.
In the projection of pixels from their respective spatial positions in input frames to their motion compensated spatial positions in output frames, the magnitudes of the respective motion vectors are scaled in proportion to the phase difference between the output frame and the respective contributing input frame. Thus, at the display phase of 0.6, input pixel (205) is shifted by 0.6 of the motion vector (207), and input pixel (206) is shifted by 0.4 of the motion vector (208). Various methods exist to solve the problems that arise when particular output pixel locations either have no motion vectors pointing to them, or have vectors pointing to them from more than one location in an input frame. For example, International Patent Application No. WO 2004/025958 “Improved Video Motion Processing” describes a method of assigning weights to contributing pixels.
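The projection just described can be sketched for the one-dimensional case of FIG. 2. The sketch is a simplification under stated assumptions and is not the method of WO 2004/025958: projected positions are rounded to the nearest pixel, each contribution is weighted by the temporal proximity of its input frame to the output frame, multiple contributions to one location are combined by a normalised weighted sum, and locations that receive no contribution are returned as `None` for later filling. The function name `project_1d` and its parameters are hypothetical.

```python
import math


def project_1d(prev_frame, fwd_vectors, next_frame, bwd_vectors, display_phase):
    """Project two 1-D input frames onto an output frame at `display_phase`.

    fwd_vectors[x] is the forward vector (in pixels) of prev_frame[x];
    bwd_vectors[x] is the backward vector of next_frame[x].
    Returns a list in which unreached locations are None.
    """
    n = len(prev_frame)
    accumulated = [0.0] * n
    weight_sum = [0.0] * n

    def splat(frame, vectors, scale, weight):
        for x, (value, vector) in enumerate(zip(frame, vectors)):
            # Scale the vector by the phase difference between the output
            # frame and this input frame, then round to the nearest pixel.
            target = math.floor(x + scale * vector + 0.5)
            if 0 <= target < n:
                accumulated[target] += weight * value
                weight_sum[target] += weight

    # The earlier frame is `display_phase` away from the output frame;
    # the later frame is `1 - display_phase` away.
    splat(prev_frame, fwd_vectors, display_phase, 1.0 - display_phase)
    splat(next_frame, bwd_vectors, 1.0 - display_phase, display_phase)

    return [a / w if w > 0.0 else None
            for a, w in zip(accumulated, weight_sum)]
```

As a usage example, consider a scene moving uniformly by 5 pixels per frame period, with a bright sample at spatial position 2 in the earlier frame and at spatial position 7 in the later frame. At a display phase of 0.6 the first sample is shifted by 3 pixels and the second by 2 pixels in the opposite direction, so both arrive at spatial position 5 of the output frame and no double image results.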
Occasionally, a frame ‘built’ by motion compensation may suffer from impairments. These can arise, for example: where the speed or complexity of the motion in the scene is too high for the motion estimator; where there is a significant incidence of transparent content in the scene; where there are significant changes in illumination between one frame and the next; or, where the input frames are corrupted by noise. Such impairments may sometimes be more annoying than the blur or double images produced by linear interpolation. For this reason, motion compensated interpolation systems may employ “fallback processing” in which a linearly interpolated value may be switched or mixed into the output in response to a confidence measure.
FIG. 3 illustrates a motion compensated temporal interpolator employing fallback processing. An input video signal (301) is applied to a motion compensated interpolation process (302) to produce a motion compensated output (303). The time of the output frame is determined by the display phase signal (311). A confidence measurement process (304) uses the input signal (301) and information (305) from the motion compensation process (302) to generate a switching signal (306). A linear interpolation process (307) is also carried out on the input signal (301) in accordance with the display phase (311) to produce linearly interpolated fallback frames (308). A mixing unit (309) mixes between the motion compensated interpolated frames (303) and the fallback frames (308) according to the switching signal (306) to produce a final output (310). The confidence measurement may be: pixel based, in which case the switching signal has high bandwidth; region based, in which case the switching or mixing signal has a lower bandwidth; or, frame based, in which case a uniform decision about the degree of fallback processing is made across the whole frame.
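The operation of a mixing unit such as (309) can be sketched as a cross-fade. The sketch assumes, for illustration only, that the switching signal is expressed as a confidence value between 0 and 1, where 1 selects the motion compensated value and 0 selects the linearly interpolated fallback value; the text above does not specify the form of the signal. A single value models a frame-based decision and a list of per-sample values models a pixel-based decision. The function name `mix_with_fallback` is hypothetical.

```python
def mix_with_fallback(mc_frame, fallback_frame, confidence):
    """Mix motion compensated and fallback samples under a confidence signal.

    `confidence` is either one number applied to the whole frame
    (frame-based decision) or a list with one value per sample
    (pixel-based decision). Values are clamped to the range 0 to 1.
    """
    if isinstance(confidence, (int, float)):
        confidence = [confidence] * len(mc_frame)
    if not (len(mc_frame) == len(fallback_frame) == len(confidence)):
        raise ValueError("frames and confidence signal must have equal length")
    output = []
    for mc, fb, c in zip(mc_frame, fallback_frame, confidence):
        c = min(1.0, max(0.0, c))
        # Full confidence passes the motion compensated sample unchanged;
        # zero confidence passes the fallback sample unchanged.
        output.append(c * mc + (1.0 - c) * fb)
    return output
```

A region-based decision could be modelled in the same way by supplying a confidence list that is constant over each region, which corresponds to a control signal of lower bandwidth.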
Fallback processing using linear interpolation can be satisfactory but has several potential drawbacks. If, on the one hand, the control signal has too high a bandwidth, the artefacts introduced by the switching process can sometimes be more disturbing than the original artefacts. If, on the other hand, the control signal has low bandwidth, significant areas of the picture may be processed using the linear fallback mode when they would have benefited from motion compensated processing. Furthermore, linearly interpolated pictures, while sometimes acceptable in real-time display, show obvious impairments if individual frames of the output are viewed in isolation.
An alternative method of fallback processing is taught in International Patent Application WO 2011/073693. This method is applicable where the change in frame rate desired in the converter is small, for example when converting from 24 Hz film to 25 Hz video. The fallback processing, switched in when confidence in the motion estimation is low, consists of playing out input frames synchronized to the output frame rate, a process which can be maintained without dropping or repeating frames, for a limited time depending on the capacity of a frame-store buffer in the system. This method can be very effective, producing an unimpaired fallback signal, but is limited to close-frame-rate conversion and to short-lived dips in the level of confidence.
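The dependence of the sustainable fallback time on buffer capacity can be illustrated with a rough, idealised estimate; it is not taken from WO 2011/073693. The estimate assumes that input frames played out at the output rate drift relative to their arrival by one frame for every frame of difference between the two rates per second, and that the frame store can absorb a drift of `buffer_frames` frames (which, where the output rate exceeds the input rate, presumes that the store already holds that many frames when fallback begins). The function name and parameters are hypothetical.

```python
def max_fallback_seconds(input_rate, output_rate, buffer_frames):
    """Idealised upper bound on how long frame-by-frame playout can last.

    Assumption: the frame store can absorb `buffer_frames` frames of
    drift between frame arrival and frame playout.
    """
    # Frames of drift accumulated per second of playout.
    drift_per_second = abs(output_rate - input_rate)
    if drift_per_second == 0:
        return float("inf")  # identical rates: no drift accumulates
    return buffer_frames / drift_per_second
```

Under these assumptions, a conversion from 24 Hz to 25 Hz accumulates one frame of drift per second, so a store able to absorb 10 frames of drift would support approximately 10 seconds of fallback playout.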