Video signals may have an interlaced format comprising a sequence of fields, where each field represents a subset of the lines of a frame of video content. The video signal may include video content of different types, e.g. content which has an interlaced content type or content which has a progressive content type. The interlaced video signal includes two complementary types of fields which together represent all of the lines of a frame of the video content, although not necessarily for each frame (i.e. time instance) of the content. For example, a first field may represent the odd lines of a first frame, and then the next field in the video signal may represent the even lines of a frame which, depending on the type of content in the video signal, may be the same frame as the first frame or a different frame.
A de-interlacer is used to generate any missing lines of the video signal at each time instance to thereby recover full frames of the video signal. Compared to including full frames at each time instance in the video signal, interlacing allows fewer lines (and therefore less data) to be included in the video signal without a reduction in the frame rate of the video signal (although interlacing may introduce some extra complications and/or errors). Methods for implementing de-interlacing are known in the art (e.g. using interpolation), but in order for a de-interlacer to operate correctly, it needs to know the native temporal patterns of the video content in the video signal, i.e. the cadence of the video signal, as explained below. Therefore, a video processing unit may implement cadence detection in order to identify the cadence of a video signal. Having detected the cadence of the video signal the de-interlacer can correctly apply any necessary de-interlacing to the fields of the video signal.
Most video broadcast systems transmit video signals in an interlaced format, which includes a sequence of fields at a rate of, for example, 50 or 60 fields per second. As described above, the video signal may include video content of different types, e.g. interlaced or progressive. For example, films are usually shot at 24 or 25 frames per second, while video content for TV may be shot at 50 or 60 frames per second. However, for each of these content types, the video signal is often broadcast in an interlaced format, such that a sequence of fields is included in the video signal at a field rate of e.g. 50 or 60 fields per second, with the fields alternately representing the even and odd lines of the frames.
FIGS. 1a to 1d show how a sequence of frames (A, B, C, D, . . . ) is represented by the fields of the video signal in different examples which have different cadences. FIG. 1a shows a sequence of fields 102 which represent interlaced content for frames A to J. Each of the frames A to J represents a different time instance in the video content. The top line of fields in the sequence 102 represents the odd lines of the frames A, C, E, G and I, and the bottom line of fields in the sequence 102 represents the even lines of the frames B, D, F, H and J. The cadence of the sequence of fields 102 represents interlaced content (i.e. each field represents a different time instance). In order to determine the frames of the video content, a de-interlacer will generate the complementary fields at each time instance to thereby generate a sequence of frames 104. The de-interlacer generates the complementary fields A′ to J′ to represent the lines of the frames which are not included in the sequence 102, such that the resulting sequence of frames 104 includes all of the lines of the frames (both odd and even lines) for each of the frames A to J. It is noted that the complementary fields A′ to J′ include approximations of the original lines of the frames. The frames 104 can then be output (e.g. at 60 frames per second) to thereby output the frames A to J of the video signal. Methods for performing de-interlacing are known in the art.
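By way of illustration only, the generation of a complementary field by interpolation may be sketched as follows. This is a minimal sketch of one simple known approach (averaging the lines above and below each missing line), not a definitive implementation. The function name `deinterlace_field`, its parameters, and the use of 0-based row indices (so that the "odd lines" in the 1-based numbering used above correspond to `parity=0`) are assumptions made for the example.

```python
def deinterlace_field(field_lines, parity, height):
    """Rebuild a full frame from one field by vertical line averaging.

    field_lines: rows carried by the field (each row is a list of numbers).
    parity: 0 if the field carries frame rows 0, 2, 4, ...; 1 for rows 1, 3, 5, ...
            (0-based rows; an assumption of this sketch).
    height: number of rows in the full frame.
    """
    frame = [None] * height
    # Copy the lines that the field actually carries.
    for i, row in enumerate(field_lines):
        frame[parity + 2 * i] = list(row)
    # Approximate each missing line from its nearest carried neighbours.
    for y in range(height):
        if frame[y] is not None:
            continue
        above = frame[y - 1] if y - 1 >= 0 else None
        below = frame[y + 1] if y + 1 < height else None
        if above is not None and below is not None:
            frame[y] = [(a + b) / 2 for a, b in zip(above, below)]
        else:
            # At the top or bottom edge only one neighbour exists; repeat it.
            frame[y] = list(above if above is not None else below)
    return frame


top_field = [[0, 0], [10, 10]]           # rows 0 and 2 of a 4-row frame
frame = deinterlace_field(top_field, parity=0, height=4)
```

The interpolated rows are approximations only, which corresponds to the complementary fields A′ to J′ being approximations of the original lines.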
FIG. 1b shows a sequence of fields 106 which represent progressive content for frames A to E, with a 2:2 cadence (which may simply be written as a 22 cadence). Progressive content often comprises images derived from film. For example, the video content may be a film at 25 frames per second and the video signal may have 50 fields per second such that each frame of the film can be represented over two fields of the interlaced video signal. Therefore, as shown in FIG. 1b, each frame is split into a top field (e.g. including odd lines) and a bottom field (e.g. including even lines). Therefore, the first two fields in the sequence 106 both relate to the same time instance, i.e. to frame A, then the next two fields both relate to the next time instance, i.e. to frame B, and so on. Fields may then be paired together accordingly to thereby generate a sequence of frames 108 which is to be output at 50 frames per second. As can be seen in FIG. 1b, each frame of the progressive video signal is repeated in the sequence of frames 108 such that the frames are output at the correct rate.
FIG. 1c shows another example in which a sequence of fields 110 represents progressive content for frames A to D, but this time with a 32 cadence. For example, the video content may be a film at 24 frames per second and the video signal may have 60 fields per second such that two frames of the film can be represented over five fields of the interlaced video signal. Therefore, as shown in FIG. 1c, a first frame (e.g. frame A) is split into a top field (e.g. including odd lines) and a bottom field (e.g. including even lines) whilst the next frame (e.g. frame B) has three fields relating to it: a top field then a bottom field and then a repeated top field. The sequence repeats as shown in FIG. 1c such that two fields relate to frame C and three fields relate to frame D. Fields may then be paired together accordingly to thereby generate a sequence of frames 112 which is to be output at 60 frames per second. As can be seen in FIG. 1c, frames A and C of the video signal are included twice in the sequence of frames 112 whilst frames B and D of the video signal are included three times in the sequence of frames 112, such that, on average, the frames are output at the correct rate. It is relatively straightforward, when the cadence of the video is known, to discard the repeated frames to recover the original (e.g. 24 frames per second) progressive video signal. In some systems the original progressive video signal may then undergo further processing such as interpolation, to increase the frame rate for display at, for example, 60 frames per second.
Similarly to FIG. 1c, FIG. 1d shows another example in which a sequence of fields 114 represents progressive content for frames A to D, but this time with a 2332 cadence. As with FIG. 1c, the video content may be a film at 24 frames per second and the video signal may have 60 fields per second such that two frames of the film can be represented over five fields of the interlaced video signal. Therefore, as shown in FIG. 1d, a first frame (e.g. frame A) is split into a top field (e.g. including odd lines) and a bottom field (e.g. including even lines) whilst the next frame (e.g. frame B) has three fields relating to it: a top field then a bottom field and then a repeated top field. In contrast to the example shown in FIG. 1c, three fields relate to frame C and two fields relate to frame D. Fields may then be paired together accordingly to thereby generate a sequence of frames 116 which is to be output at 60 frames per second. As can be seen in FIG. 1d, frames A and D of the video signal are included twice in the sequence of frames 116 whilst frames B and C of the video signal are included three times in the sequence of frames 116, such that, on average, the frames are output at the correct rate.
Many other cadences are possible in a video signal for use with different relationships between the temporal characteristics of the video content and the field rate of the video signal. For example, some other possible cadences with progressive video content are 2224, 32322, 55, 64 and 8787, and a person skilled in the art will appreciate that there are other possible cadences also.
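By way of illustration only, the relationship between a cadence, the frame rate of the content and the field rate of the video signal may be sketched as follows. The function names `telecine` and `field_rate` are hypothetical, and the cadence is assumed to be given as a list of per-frame field counts in the order in which the frames appear (e.g. `[2, 3]` for the example of FIG. 1c starting at frame A, and `[2, 3, 3, 2]` for the example of FIG. 1d).

```python
def telecine(frames, counts):
    """Expand progressive frames into a field sequence for a given cadence.

    frames: sequence of frame labels, e.g. "ABCD".
    counts: number of fields generated for successive frames, repeated cyclically.
    Returns a list of (frame_label, parity) tuples; parity alternates
    "top", "bottom", "top", ... across the whole field sequence.
    """
    fields = []
    for i, frame in enumerate(frames):
        for _ in range(counts[i % len(counts)]):
            parity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, parity))
    return fields


def field_rate(frame_rate, counts):
    """Field rate produced when content at frame_rate is carried with this cadence."""
    return frame_rate * sum(counts) / len(counts)


labels_1c = [label for label, _ in telecine("ABCD", [2, 3])]
labels_1d = [label for label, _ in telecine("ABCD", [2, 3, 3, 2])]
```

Under these assumptions, `field_rate(25, [2, 2])` gives 50 fields per second and `field_rate(24, [2, 3])` gives 60 fields per second, matching the examples described for FIGS. 1b and 1c.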
It can be seen that characteristic temporal patterns arise when progressive video sequences are transmitted in an interlaced video format. Therefore, when a video signal is received, cadence detection is applied before de-interlacing in order to identify the type of content in the video signal (e.g. interlaced or progressive) and, when progressive content is included, to identify the cadence of the video signal. When progressive content is included in the video signal, the identified cadence can be used to recover the original progressive frames by re-combining field pairs; when interlaced content is included in the video signal, the de-interlacer can be used to interpolate the missing lines.
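By way of illustration only, the recovery of progressive frames by re-combining field pairs may be sketched as follows, assuming that the cadence has already been identified and that the field sequence starts at the beginning of a cadence cycle. The function names `weave` and `inverse_telecine`, and the representation of a field as a `(parity, lines)` tuple, are assumptions made for the example.

```python
def weave(top, bottom):
    """Interleave the lines of a top field and a bottom field into one frame."""
    frame = []
    for t, b in zip(top, bottom):
        frame.append(t)
        frame.append(b)
    return frame


def inverse_telecine(fields, counts):
    """Recover one progressive frame per group of fields.

    fields: list of (parity, lines) with parity "top" or "bottom".
    counts: number of fields per frame for the identified cadence, repeated
            cyclically; the stream is assumed to start at a cycle boundary.
    """
    frames = []
    pos = 0
    k = 0
    while pos < len(fields):
        n = counts[k % len(counts)]
        group = fields[pos:pos + n]
        tops = [lines for parity, lines in group if parity == "top"]
        bottoms = [lines for parity, lines in group if parity == "bottom"]
        # Repeated fields in a group carry no new lines, so the first of each
        # parity is sufficient; incomplete groups at the end are skipped.
        if tops and bottoms:
            frames.append(weave(tops[0], bottoms[0]))
        pos += n
        k += 1
    return frames


fields = [
    ("top", ["A0", "A2"]), ("bottom", ["A1", "A3"]),                          # frame A: 2 fields
    ("top", ["B0", "B2"]), ("bottom", ["B1", "B3"]), ("top", ["B0", "B2"]),   # frame B: 3 fields
]
recovered = inverse_telecine(fields, [2, 3])
```

In this sketch each original frame is recovered exactly once, so the repeated field for frame B is discarded rather than contributing a further output frame.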
A typical approach to cadence detection is to detect high spatial frequency artefacts (known as “mouse teeth” artefacts) which are present when combining non-paired fields of a video signal that contains motion. For example, if the bottom field for frame A were combined with the top field for frame B then, because the fields are from different frames (i.e. from different time instances), some of the content would not align correctly and artefacts would appear where the content is misaligned. When there is motion in the content between the two frames making up the combined image, the artefacts may occur on every line of the combined image, such that the artefacts (which are the “mouse teeth” artefacts) have a characteristic spatial frequency corresponding to the number of lines in the image. The presence of artefacts with this characteristic spatial frequency can be detected. The presence of “mouse teeth” in a combined image indicates that the field pairing was incorrect and that a different field pairing should be selected. In this way the characteristic temporal patterns in the video signal can be identified. The temporal patterns in the video signal can then be used to identify the cadence of the video signal from a list of known possible cadences. However, this approach to cadence detection has the following limitations: (i) it lacks robustness to unknown cadences because it must be aware of the possible cadences a priori; (ii) it is unlikely to be robust to bad edits in the video content, where the cadence of the video signal is interrupted by editing of the video signal; and (iii) it is not robust to video content with multiple cadences, for example when a video text overlay (in an interlaced form) is added over the top of film (which has a progressive form).
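By way of illustration only, one simple way of measuring “mouse teeth” artefacts in a combined image may be sketched as follows. The measure used here (accumulating, for each pixel, the product of its differences from the pixels directly above and below when that product is positive) is an assumption chosen for the example and is not asserted to be the measure used in any particular known system; the function names `weave` and `comb_score` are likewise hypothetical.

```python
def weave(top, bottom):
    """Interleave the lines of a top field and a bottom field into one frame."""
    frame = []
    for t, b in zip(top, bottom):
        frame.append(t)
        frame.append(b)
    return frame


def comb_score(frame):
    """Accumulate line-alternating ("mouse teeth") energy in a combined image.

    frame: list of rows, each a list of numeric pixel values.
    """
    score = 0
    for y in range(1, len(frame) - 1):
        for above, cur, below in zip(frame[y - 1], frame[y], frame[y + 1]):
            # The product is positive only when the pixel lies on the same side
            # of both vertical neighbours, i.e. the line sticks out from the
            # lines around it. A smooth vertical gradient gives a negative
            # product and contributes nothing.
            d = (cur - above) * (cur - below)
            if d > 0:
                score += d
    return score


# A vertical edge that is static within frame A but has moved by frame B.
top_a = [[0, 0, 10, 10], [0, 0, 10, 10]]
bottom_a = [[0, 0, 10, 10], [0, 0, 10, 10]]
bottom_b = [[0, 10, 10, 10], [0, 10, 10, 10]]

paired = comb_score(weave(top_a, bottom_a))      # fields from the same frame
non_paired = comb_score(weave(top_a, bottom_b))  # fields from different frames
```

Under these assumptions, a correctly paired image yields a score of zero, whereas the non-paired image yields a positive score, which would indicate that a different field pairing should be selected.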