This invention relates to a system and method for detecting artefacts in frames generated for inclusion in a video sequence.
A video sequence comprises a plurality of frames which are to be played out sequentially. The frame rate of a video sequence indicates the rate at which the frames are to be played out in order to correctly play the video sequence. For example, a video sequence may be a film having a frame rate of 24 frames per second. As another example, a video sequence may have a frame rate of 50 or 60 frames per second (e.g. for television broadcast). Other video sequences may have other frame rates. Each frame of the video sequence comprises a plurality of pixels which form an image. For example, a frame of a High Definition video sequence may be an image formed by an array of pixel values at each of 1920×1080 possible pixel locations.
In other examples pixel values may exist at some, but not all, of the possible pixel locations. For example, in an interlaced system, pixel values may exist for alternate rows of the possible pixel locations, such that a partial image is formed. These partial images may be known as “fields”, and two fields, often sampled at different times, comprise a complete frame. In these other examples, multiple partial images (or “fields”) may be used to determine complete images (or “frames”), e.g. by a process called de-interlacing.
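As an illustration of how two fields may be combined into a complete frame, the following is a minimal sketch of "weave" de-interlacing, which is only one simple de-interlacing approach and is not necessarily the process contemplated here. The function name `weave_fields` and the assumption that the first field holds the even-numbered rows are hypothetical choices made for the example.

```python
def weave_fields(top_field, bottom_field):
    """Interleave the rows of two fields into one complete frame.

    Assumption for this sketch: top_field holds rows 0, 2, 4, ... and
    bottom_field holds rows 1, 3, 5, ... of the complete frame.
    """
    if len(top_field) != len(bottom_field):
        raise ValueError("both fields must contain the same number of rows")
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(list(top_row))     # even-numbered row of the frame
        frame.append(list(bottom_row))  # odd-numbered row of the frame
    return frame


# Two tiny fields of two rows each, combined into a four-row frame.
top = [[1, 1], [3, 3]]
bottom = [[2, 2], [4, 4]]
frame = weave_fields(top, bottom)
```

Because the two fields are often sampled at different times, a simple weave of this kind can show combing on moving content, which is one reason more elaborate de-interlacing processes exist.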
For clarity, the following description describes systems operating on complete frames. All of the methods described may equally be applied to video sequences comprising fields or complete frames, and the use of the term “frame” should be understood to refer to either complete frames or fields as appropriate.
A frame rate converter may be used to alter the frame rate of a video sequence. A process of frame rate conversion applied by a frame rate converter may include adding frames into the video sequence and/or removing frames from the video sequence. In a simple example, a frame rate converter may double the frame rate of a video sequence (e.g. from 24 frames per second to 48 frames per second) by inserting a frame between each pair of existing frames in the video sequence. In one example, each of the frames which are inserted into the video sequence may simply be a copy of one of the existing frames, e.g. such that each frame of the existing video sequence is played out twice in a row, but at twice the frame rate of the original video sequence. In this example, the perceptual smoothness of the video sequence might not be significantly improved by doubling the frame rate, but this frame rate conversion does allow the video sequence, which originally has one frame rate, to be output at a different frame rate (e.g. when a film is broadcast on a television signal).
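The simple frame-repetition example above can be sketched as follows. This is an illustrative sketch only; the function name `double_frame_rate_by_repetition` is hypothetical, and the frames are represented by placeholder labels rather than pixel data.

```python
def double_frame_rate_by_repetition(frames):
    """Return a new sequence in which every input frame appears twice in a row."""
    doubled = []
    for frame in frames:
        doubled.append(frame)  # the existing frame
        doubled.append(frame)  # the inserted frame, a copy of the existing one
    return doubled


# Placeholder labels stand in for the frames of a 24 frames-per-second sequence.
frames_24fps = ["A", "B", "C"]
frames_48fps = double_frame_rate_by_repetition(frames_24fps)
```

Played out at twice the original frame rate, the doubled sequence has the same duration as the original and shows the same images, which is why the perceptual smoothness is not improved.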
More complex frame rate converters attempt to determine what a frame would look like at a point in time between two of the existing frames of the video sequence to thereby generate a new frame for inclusion in the video sequence between the two existing frames. For example, a frame rate converter may contain a motion estimator that performs a motion estimation stage to track the way that parts of an image move between one frame and the next. A common motion estimator is the block-based type, in which a frame of a video sequence is divided into a number of blocks each containing one or more pixels, and for each block a vector (referred to as a “motion vector”) is found that represents the motion of the pixels in that block between the two existing frames. In one example, the determination of the motion vector for a block of a current frame involves searching the previous frame in the video sequence to find the area of image data of the previous frame with contents that are most similar to the contents of the block of the current frame. Other factors may also be involved in the determination of the motion vector for a block. The motion vectors can be used to produce an interpolated frame at an intermediate position (given by a temporal phase, φ) between two existing frames in a video sequence. For example, if the interpolated frame is to be included at the mid-point between two adjacent existing frames (i.e. if the temporal phase, φ, of the interpolated frame is 0.5) then the motion vectors determined between the two existing frames may be halved (i.e. multiplied by the temporal phase, φ) and then used to determine how the image in one of the existing frames should be changed for use in representing the interpolated frame.
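By way of illustration, the following is a minimal sketch of a block-based motion search and of scaling the resulting motion vector by the temporal phase. It assumes an exhaustive search and a sum of absolute differences (SAD) as the similarity measure, neither of which is required by the description above. All function names, the frame size and the pixel values are hypothetical. In this sketch the vector is expressed as the displacement from the block of the current frame to the matching area of the previous frame.

```python
def sad(cur, prev, bx, by, dx, dy, size):
    """Sum of absolute differences between the block of `cur` at (bx, by)
    and the area of `prev` displaced from it by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
    return total


def estimate_motion_vector(cur, prev, bx, by, size, search):
    """Exhaustive search: return the displacement (dx, dy) into `prev`
    whose area is most similar (lowest SAD) to the block of `cur`."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside_x = 0 <= bx + dx and bx + dx + size <= width
            inside_y = 0 <= by + dy and by + dy + size <= height
            if not (inside_x and inside_y):
                continue  # candidate area falls outside the previous frame
            cost = sad(cur, prev, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def scale_vector(vector, phase):
    """Scale a motion vector by the temporal phase of the interpolated frame."""
    return (vector[0] * phase, vector[1] * phase)


def make_frame(px, py):
    """Hypothetical 8x8 frame: black apart from a 2x2 patch at (px, py)."""
    frame = [[0] * 8 for _ in range(8)]
    frame[py][px], frame[py][px + 1] = 10, 20
    frame[py + 1][px], frame[py + 1][px + 1] = 30, 40
    return frame


prev_frame = make_frame(1, 2)  # patch at x = 1 in the previous frame
cur_frame = make_frame(3, 2)   # patch has moved 2 pixels right in the current frame
mv = estimate_motion_vector(cur_frame, prev_frame, bx=3, by=2, size=2, search=2)
half_mv = scale_vector(mv, 0.5)  # vector for an interpolated frame at phase 0.5
```

Here the search finds the matching area two pixels to the left in the previous frame, and halving that vector gives the displacement used for a frame interpolated at the mid-point.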
Interpolated frames generated by a frame rate converter may contain Frame Rate Conversion (FRC) artefacts arising from different causes. An example of one such type of FRC artefact is where part of an image has been erroneously eliminated such that it does not appear in the interpolated frame. This type of artefact may be associated with image features that are “subtle”, or “delicate”, compared to a more dominant background. An image feature may be considered to be “subtle” compared to the background/foreground if the image feature does not significantly contribute to the character of a block of pixels which is to be considered for motion estimation, e.g. if the image feature is small in at least one dimension compared to the size of the blocks of pixels which are considered for motion estimation. An example of such image features may be features in the form of thin lines, such as telephone lines, rigging of sails on a ship, or wire fences.
To illustrate how this type of artefact may arise, consider the following example in which a motion estimator is used to determine the motion vector between a block of pixels in a current frame and a block of pixels in a previous frame with contents that are most similar to the block of the current frame. If the block of pixels in the current frame contains a subtle image feature contrasted against a dominant background or foreground (for example the image feature is a thin line), then the motion estimator may be able to obtain a good match with a block of pixels in the previous frame that does not contain the image feature (for example because the backgrounds/foregrounds of the two blocks dominate the matching process and provide a sufficiently good match). The subtle image feature and the dominant background/foreground may, for example, have different motion vectors that describe their respective motion between the current and previous frames, but only one motion vector can be associated with the block. If the background/foreground is dominant, then the motion vector of the background/foreground can provide a good match with a block in the previous frame, whereas the motion vector corresponding to the subtle image feature may give rise to a poor match with a block in the previous frame, because it may represent a good match only on the small number of pixels within the area of the subtle image feature. The motion estimator may thus select the motion vector of the dominant background/foreground, which results in an interpolated image being generated without the subtle image feature present. Conversion artefacts of this type have the undesirable effect of causing the subtle feature to flicker when the sequence of frames is displayed to a user.
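The way a dominant background can outweigh a thin line in the matching process can be illustrated with a small numerical sketch. It assumes a sum of absolute differences (SAD) as the matching measure, a hypothetical two-level background texture that pans two pixels to the right between frames, and a static vertical line one pixel wide. All names and values are invented for the example.

```python
SIZE = 16         # hypothetical frame width and height
LINE_X = 8        # column of the thin, static vertical line
LINE_VALUE = 120  # brightness of the line
PAN = 2           # background moves 2 pixels to the right between frames


def texture(x, y):
    """Hypothetical high-contrast background pattern."""
    return 200 if ((x // 2) + (y // 2)) % 2 == 0 else 50


def make_frame(background_shift):
    """Background shifted right by `background_shift`, with the line drawn on top."""
    frame = [[texture(x - background_shift, y) for x in range(SIZE)]
             for y in range(SIZE)]
    for y in range(SIZE):
        frame[y][LINE_X] = LINE_VALUE
    return frame


def block_sad(cur, prev, bx, by, dx, dy, size):
    """SAD between the block of `cur` at (bx, by) and the area of `prev`
    displaced from it by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


prev_frame = make_frame(0)
cur_frame = make_frame(PAN)

# Two candidate vectors for the 8x8 block at (4, 4), which contains the line.
background_vector = (-PAN, 0)  # follows the panning background
line_vector = (0, 0)           # follows the static line

sad_background = block_sad(cur_frame, prev_frame, 4, 4, *background_vector, 8)
sad_line = block_sad(cur_frame, prev_frame, 4, 4, *line_vector, 8)

# Only one vector can be associated with the block; the lower SAD wins.
selected = background_vector if sad_background < sad_line else line_vector
```

In this sketch the line occupies only 8 of the 64 pixels in the block, so the candidate that follows the background mismatches on far fewer pixels than the candidate that follows the line, and the background vector is selected even though it is wrong for the line.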