The present invention relates to the technology of video signal processing and more particularly to analysis methods usable to determine the arrangement in time of the fields forming an interlaced video signal.
Such analysis methods are useful for detecting the cadence of the interlaced video signal, which is important for deciding which processing should be applied to the signal for display. From the following discussion, it will be seen that the determination of the time arrangement of the fields in the sequence may also involve other aspects such as detecting certain spurious field inversions.
Current television standards (e.g. PAL, NTSC, 1080i, ...) use interlaced signals, with frames split into two fields, one containing the odd lines of the frame and the other containing the even lines. A deinterlacer using line duplication or interpolation is used when the display or other processing applied to the signal needs to recover or synthesize full frames. Any deinterlacer must know the native temporal frequency of the input, i.e. the cadence of the signal.
Video transmission technologies use different cadences depending on the source of the signal and the characteristics of the transmission channel. For example, films are usually shot at 24 frames per second, while video content for TV is shot at 50 frames per second in Europe and at 60 frames per second in America. The content is mostly broadcast in interlaced form, which means that for each frame only the even lines or only the odd lines are actually transmitted, in alternation. These formats are denoted as 50i (for 50 interlaced fields per second in the PAL world) or 60i (60 interlaced fields per second in the NTSC world). The content is also sometimes broadcast in progressive form. The resulting formats are then denoted 50p or 60p, where “p” is for “progressive frames”. The problem of deinterlacing arises for 50i or 60i content only.
In Europe, the PAL channel (or 1080i50 for HDTV) assumes a frame refresh rate of 50 Hz on the display side. The frame rate of a film is accelerated from 24 to 25 Hz when broadcast in European TV channels or recorded in optical storage media intended for the European market. A sequence of frames A, B, C, D, ... from the source becomes, in the interlaced video signal, a sequence of fields:
A−, B+, C−, D+ (“+” for fields made of even lines, “−” for fields made of odd lines) if the source is “video”. The cadence is then said to be video (each field is shot at a different time);
A−, A+, B−, B+, C−, C+, D−, D+ if the source is “film”. The cadence is then referred to as 2:2 pulldown.
In America and Japan, the NTSC channel (or 1080i60 for HDTV) assumes a frame refresh rate of about 60 Hz on the display side. A sequence of frames A, B, C, D, ... from the source becomes, in the interlaced video signal, a sequence of fields:
A+, B−, C+, D− if the source is “video”. The cadence is then said to be video;
A+, A−, A+, B−, B+, C−, C+, C−, D+, D− if the source is “film”. The cadence is then referred to as 3:2 pulldown.
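The field sequences listed above for the video, 2:2 and 3:2 cadences can be reproduced with a short illustrative sketch. The function name `pulldown_fields` and its parameters are hypothetical and are not part of the original text; the sketch only assumes that each source frame contributes a fixed number of consecutive fields and that field parity alternates strictly. An ASCII "-" stands in for the "−" parity sign.

```python
from itertools import cycle


def pulldown_fields(frames, pattern, first_parity="+"):
    """Return field labels such as 'A+' for the given source frames.

    `pattern` gives how many consecutive fields each source frame
    contributes: (1,) for video, (2,) for 2:2 pulldown, (3, 2) for
    3:2 pulldown. Parity alternates from one field to the next.
    (Hypothetical helper, for illustration only.)
    """
    other = "-" if first_parity == "+" else "+"
    parities = cycle((first_parity, other))
    fields = []
    # Pair each frame with its repeat count, cycling through the pattern.
    for frame, count in zip(frames, cycle(pattern)):
        for _ in range(count):
            fields.append(frame + next(parities))
    return fields


print(", ".join(pulldown_fields("ABCD", (2,), "-")))
# A-, A+, B-, B+, C-, C+, D-, D+
print(", ".join(pulldown_fields("ABCD", (3, 2), "+")))
# A+, A-, A+, B-, B+, C-, C+, C-, D+, D-
```

In the 3:2 output, the first and third fields (A+) and the sixth and eighth fields (C-) are exact repeats, whereas the 2:2 output contains no repeated field.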
Other cadences or pulldown modes exist for interlaced signals having a field rate more than twice the frame rate of the source, for example 2:2:2:4 pulldown, 2:3:3:2 pulldown, 3:2:3:2:2 pulldown, etc. Those other pulldown modes, as well as 3:2, are fairly easy to detect because certain fields are exactly repeated at predictable intervals. For instance, the 3:2 pulldown case as exemplified above is detected by correlating each field with the one appearing two field positions afterwards: a correlation peak every five fields (corresponding to the repetition of A+ and C− in the example) then reveals the 3:2 pulldown mode. Different time correlation patterns are indicative of different pulldown modes and can be used to detect the relevant cadence in order to apply the appropriate downstream processing.
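The repeated-field test described above can be sketched as follows. This is a minimal illustration, not a production detector: fields are assumed to be flat lists of pixel values of equal length, the L1 distance stands in for the correlation measure, and the function names and the `threshold` parameter are hypothetical.

```python
def l1_distance(field_a, field_b):
    """Sum of absolute pixel differences between two equally sized fields."""
    return sum(abs(x - y) for x, y in zip(field_a, field_b))


def repeated_field_positions(fields, threshold):
    """Indices i at which field i+2 (same parity) nearly repeats field i."""
    return [
        i
        for i in range(len(fields) - 2)
        if l1_distance(fields[i], fields[i + 2]) <= threshold
    ]


def looks_like_32_pulldown(fields, threshold):
    """True if near-repeats occur exactly once every five fields.

    Assumption: `threshold` is tuned so that noise on a repeated field
    stays below it while genuinely different fields exceed it.
    """
    hits = repeated_field_positions(fields, threshold)
    return len(hits) >= 2 and all(b - a == 5 for a, b in zip(hits, hits[1:]))
```

Requiring a spacing of exactly five fields keeps a static scene, where every field matches the one two positions later, from being mistaken for 3:2 pulldown. Other pulldown modes would need a different expected spacing pattern.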
However, this kind of time-correlation detection cannot be used to detect 2:2 pulldown, which is the most difficult film cadence to detect. When the cadence is 2:2, each field is sent only once. Hence the cadence detection technique must rely on some sort of regularity assumption in order to detect that successive fields correspond to the same temporal position.
Typically, a cadence detector handles the 2:2 case by comparing how a given field Fi relates to both Fi−1 and Fi+1 (i denoting an integer rank for the fields of the sequence). The metric used to compare fields can be a simple L1 or L2 distance. If a global bias of regularity is detected, e.g. if the metric between pairs of fields of ranks (2k, 2k+1) is much lower than the metric between pairs of fields of ranks (2k−1, 2k), or vice-versa, then the algorithm decides to switch to 2:2 mode and deinterlacing is replaced by reverse 2:2 pulldown. If there is no bias of regularity, the video mode is considered detected and deinterlacing takes place.
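The bias test just described might be sketched as below. The sketch assumes fields are flat lists of pixel values of equal length and uses the L1 distance as the metric. The function name `detect_22_phase` and the `ratio` parameter (how much lower one pairing must be to count as "much lower") are hypothetical choices, and opposite-parity fields are compared pixel by pixel without compensating for their one-line vertical offset.

```python
def l1_distance(field_a, field_b):
    """Sum of absolute pixel differences between two equally sized fields."""
    return sum(abs(x - y) for x, y in zip(field_a, field_b))


def detect_22_phase(fields, ratio=4.0):
    """Return 'even', 'odd', or None when no pairing bias is found.

    'even' means pairs of ranks (2k, 2k+1) are much closer than pairs
    (2k-1, 2k); 'odd' means the opposite. None corresponds to the
    video-mode decision. `ratio` is an illustrative value, not one
    taken from the text.
    """
    # Distance between each field and its successor.
    dists = [l1_distance(a, b) for a, b in zip(fields, fields[1:])]
    even = dists[0::2]  # pairs (0, 1), (2, 3), ...
    odd = dists[1::2]   # pairs (1, 2), (3, 4), ...
    if not even or not odd:
        return None
    mean_even = sum(even) / len(even)
    mean_odd = sum(odd) / len(odd)
    if mean_even * ratio < mean_odd:
        return "even"
    if mean_odd * ratio < mean_even:
        return "odd"
    return None
```

The returned phase indicates which adjacent fields would be woven together by reverse 2:2 pulldown. A static scene yields no bias (both means are near zero) and is therefore reported as None by this sketch.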
The key aspect of these cadence detectors is that they rely on spatial regularity assumptions on the input frames. On frames with high-frequency content, however, these assumptions do not hold, and the algorithms therefore fail to detect the 2:2 cadence correctly.
There is thus a need for an interlaced signal analysis method with improved performance, in particular capable of efficiently detecting a 2:2 cadence.