A video sequence comprises a plurality of frames which are to be played out sequentially. The frame rate of a video sequence indicates the rate at which the frames are to be played out in order to correctly play the video sequence. For example, a video sequence may be a film having a frame rate of 24 frames per second. As another example, a video sequence may have a frame rate of 50 or 60 frames per second (e.g. for television broadcast). Other video sequences may have other frame rates. Each frame of the video sequence comprises a plurality of pixels which form an image. For example, a frame of a High Definition video sequence may be an image formed by an array of pixel values, one at each of 1920×1080 possible pixel locations.
In other examples pixel values may exist at some, but not all, of the possible pixel locations in any one frame. For example, in an interlaced system, pixel values may exist for alternate rows of the possible pixel locations, such that a partial image is formed. These partial images may be known as “fields”, and two fields, often sampled at different times, comprise a complete frame. In these other examples, multiple partial images (or “fields”) may be used to determine complete images (or “frames”), e.g. by a process called de-interlacing.
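As an illustrative sketch of the simplest form of de-interlacing mentioned above, a "weave" de-interlacer forms a complete frame by interleaving the rows of a top field and a bottom field. The function name and the representation of a field as a list of rows are assumptions made for this example; practical de-interlacers must also compensate for the fields being sampled at different times:

```python
def weave_deinterlace(top_field, bottom_field):
    """Interleave the rows of a top field and a bottom field to form a
    complete frame (simple 'weave' de-interlacing). This sketch assumes
    the two fields can be combined directly, which is only strictly
    correct when they are sampled at effectively the same time."""
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(top_row)     # rows 0, 2, 4, ... of the frame
        frame.append(bottom_row)  # rows 1, 3, 5, ... of the frame
    return frame
```

With two two-row fields, the result is a four-row frame whose rows alternate between the fields.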
For clarity, the following description describes systems operating on complete frames. All of the methods described may equally be applied to video sequences comprising fields or complete frames, and the use of the term “frame” should be understood to refer to either complete frames or fields as appropriate.
A frame rate converter may be used to alter the frame rate of a video sequence. A process of frame rate conversion applied by a frame rate converter may include adding frames into the video sequence and/or removing frames from the video sequence. In a simple example, a frame rate converter may double the frame rate of a video sequence (e.g. from 24 frames per second to 48 frames per second) by inserting a frame between each pair of existing frames in the video sequence. In one example, each of the frames which are inserted into the video sequence may simply be a copy of one of the existing frames, e.g. such that each frame of the existing video sequence is played out twice in a row, but at twice the speed of the original video sequence. In this example, the perceptual smoothness of the video sequence might not be significantly improved by doubling the frame rate, but this frame rate conversion does allow the video sequence, which originally has one frame rate, to be outputted at a different frame rate (e.g. when a film is broadcast on a television signal).
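The simple doubling scheme described above (each existing frame played out twice in a row) can be sketched as follows. The function name is an assumption for illustration, and frames are treated as opaque objects (e.g. arrays of pixel values):

```python
def double_frame_rate_by_repetition(frames):
    """Return a sequence in which each existing frame appears twice in
    a row. Played out in the same total time, the result has double
    the frame rate of the input, without improving perceptual
    smoothness (each inserted frame is just a copy of an existing one)."""
    doubled = []
    for frame in frames:
        doubled.append(frame)  # existing frame
        doubled.append(frame)  # inserted copy of the same frame
    return doubled
```

For a 24 frames-per-second input, the output sequence played out at 48 frames per second occupies the same duration as the original.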
More complex frame rate converters attempt to determine what a frame would look like at a point in time between two of the existing frames to thereby generate a new frame for inclusion in the video sequence between the two existing frames. For example, motion estimation may be used to track the way that parts of an image move between one frame and the next. A common motion estimator is the block-based type, in which a frame of a video sequence is divided into a number of blocks, and for each block a vector (referred to as a “motion vector”) is found that represents the motion of the pixels in that block. In one example, the determination of the motion vector for a block of a current frame involves searching the previous frame in the video sequence to find the area of image data of the previous frame with contents that are most similar to the contents of the block of the current frame. Other factors may also be involved in the determination of the motion vector for a block. The motion vectors can be used to produce an interpolated frame at an intermediate position (given by a temporal phase, ϕ) between two existing frames in a video sequence. For example, if the interpolated frame is to be included at the mid-point between two adjacent existing frames (i.e. if the temporal phase, ϕ, of the interpolated frame is 0.5) then the motion vectors determined between the two existing frames may be halved (i.e. multiplied by the temporal phase, ϕ) and then used to determine how the image in one of the existing frames should be changed for use in representing the interpolated frame.
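The block-matching search and temporal-phase scaling described above can be sketched roughly as follows. The exhaustive full-search strategy, the sum-of-absolute-differences (SAD) cost, and all names are illustrative assumptions; as noted above, a practical motion estimator would typically consider further factors beyond the best pixel match:

```python
import numpy as np

def find_motion_vector(prev, curr, bx, by, bs, search):
    """Return a (dy, dx) motion vector for the bs-by-bs block of `curr`
    whose top-left corner is (bx, by), by exhaustively searching a
    +/-`search` pixel window of `prev` for the candidate area with the
    minimum sum of absolute differences (SAD). The returned vector
    points from the block's position in the current frame to its best
    match in the previous frame."""
    block = curr[by:by + bs, bx:bx + bs].astype(np.int32)
    h, w = prev.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + bs > h or x + bs > w:
                continue  # candidate area falls outside the previous frame
            cand = prev[y:y + bs, x:x + bs].astype(np.int32)
            sad = np.abs(block - cand).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv

def scale_vector(mv, phi):
    """Scale a motion vector by the temporal phase phi, e.g. phi = 0.5
    positions the block's contribution at the mid-point between the two
    existing frames."""
    return (mv[0] * phi, mv[1] * phi)
```

For example, if an object has moved two pixels to the right between the previous and current frames, the block covering it in the current frame matches an area two pixels to the left in the previous frame, and scaling that vector by ϕ = 0.5 places the block one pixel to the left for a mid-point interpolated frame.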
Problems can occur when an interpolated frame is predicted using motion vectors. In particular, it is often difficult to determine accurate motion vectors for occluded and/or revealed areas of the interpolated frame. Occluded and revealed areas occur where different objects have different magnitudes and/or directions of motion between two successive frames of a video sequence. Inaccurate motion vectors may then result in visible artefacts as pixel data is interpolated into incorrect locations in the interpolated frame. The distribution of these artefacts around the edges of moving objects, or near where there is a change in the motion, has a distinctive appearance that is often referred to as a “halo”. A halo may take many different forms depending upon the exact process used to predict the interpolated frame. For example, the halo may include sharp “rip and tear” artefacts with sharp edges which can give the appearance of blockiness in the rendered image. Perceptually, the distortion to the video sequence may be cumulative, such that if many of the interpolated frames of the video sequence (e.g. there may be one, two, three or more interpolated frames for every one of the existing original frames of the video sequence) include halo artefacts then the video sequence may appear to have more distortion than the apparent distortion of each of the interpolated frames when viewed separately. The halo artefacts may appear to move as the video sequence is played out, and the apparent movement of the halo artefacts may draw a viewer's attention to the distortion caused by the halo artefacts.