The present invention relates to a system for detecting and sequencing interlaced video signals derived from film having successive image frames.
Typical broadcast television uses frames of interlaced video, each frame consisting of two fields. The first field normally contains the odd numbered lines of the frame and the second field normally contains the even numbered lines of the frame. Typical display systems display the fields sequentially on a cathode-ray tube.
A special problem arises when the interlaced fields originating from movie film are converted to a progressive scan display suitable for high definition television or computer monitors. Movie film is a progressive scan material. Systems operating in accordance with NTSC (broadcast television) typically convert the film source to a 60 hertz interlaced format by using a 3:2 pull down technique, as shown in FIG. 1. Film provides sequential complete image frames, as shown by frame boundaries 12, 14, and 16. The first frame between boundaries 12 and 14 is scanned such that the even numbered lines go to an even field between boundaries 18 and 20, and the odd numbered lines go to an odd field between boundaries 20 and 22.
The even numbered lines scanned from the first frame also go to the even field for frame 2. The odd numbered lines scanned from the second film frame between boundaries 14 and 16 go to the odd field of frame 2. The even numbered lines for the even field of frame 3 come from the second film frame. As this example shows, the encoding technique is referred to as a 3:2 pull down because every other film frame contributes three (3) fields to the interlaced video and each intervening film frame contributes two (2) fields to the interlaced video.
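By way of illustration only, the field ordering described above can be sketched as follows. The function name `pulldown_32` and the representation of each field as a (film frame index, parity) pair are assumptions made for this sketch and do not appear in FIG. 1. The sketch also assumes that the sequence starts on an even field, as in the example above.

```python
def pulldown_32(num_film_frames):
    """Return the (film_frame_index, parity) of each video field produced
    by a 3:2 pull down of `num_film_frames` film frames.

    Assumption: the first video field is an even field, and field parity
    simply alternates from one field to the next.
    """
    fields = []
    parity = "even"
    for frame_index in range(num_film_frames):
        # Every other film frame contributes three fields; the
        # intervening film frames contribute two.
        field_count = 3 if frame_index % 2 == 0 else 2
        for _ in range(field_count):
            fields.append((frame_index, parity))
            parity = "odd" if parity == "even" else "even"
    return fields


if __name__ == "__main__":
    for field in pulldown_32(2):
        print(field)
```

Under these assumptions, two film frames yield five video fields, so 24 film frames yield 60 fields, consistent with the 60 hertz interlaced format.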
Being able to determine whether video material is derived from a 3:2 pull down can be advantageously used in subsequent video signal processing systems, such as high definition television receivers or digital video compression systems. For example, motion-compensated picture signal processing methods are potentially very suitable to provide an improved display quality of the picture signal, but artifacts caused by motion vector estimation errors as a result of duplicate identical fields are very disturbing. Consequently, there is a need to determine when motion vectors are reliable enough to allow for a motion-compensated picture signal processing mode. There is also a need to determine when the motion-compensated picture signal processing mode should be switched off in view of the unreliability of the motion vectors.
Faroudja, U.S. Pat. No. 4,876,596, discloses a system for converting a 3:2 interlaced video format to a progressive video format by using a code included in the interlaced video to detect which fields are from the three-segment frame and which fields are from the two-segment frame. The code is read by a processor prior to displaying the video to select the appropriate fields to display. Unfortunately, such an approach requires an explicit specification of the 3:2 pull down in the form of a code incorporated within the video, and many encoding systems may not provide such a code.
Krause, U.S. Pat. No. 4,881,125, discloses a system for providing a progressive-scan video display signal from an interlaced video signal derived from progressive film. Krause teaches combining the currently received video field and a delayed video field to provide a progressive-scan video frame signal at the video rate, in which alternate lines are derived respectively from odd and even video fields. The technique used by Krause to combine fields is generally referred to as "jamming," which is the putting together of two interlaced fields into a single frame. However, Krause suggests the use of a sync signal transmitted during the vertical blanking interval of a video signal to indicate the beginning of a 3:2 sequence. The sync signal approach requires an explicit specification of the 3:2 pull down which may not be provided by many encoding systems.
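By way of illustration only, the general idea of "jamming" two fields into a single frame can be sketched as follows. This is not the circuitry of Krause. The function name `jam_fields`, the representation of a field as a list of lines, and the zero-based line numbering (in which the even field holds lines 0, 2, 4, and so on) are assumptions made for this sketch.

```python
def jam_fields(even_field, odd_field):
    """Interleave two fields into one progressive frame.

    Assumption: lines are numbered from zero, so the even field supplies
    frame lines 0, 2, 4, ... and the odd field supplies lines 1, 3, 5, ...
    Each field is a list of lines; a line may be any object.
    """
    if len(even_field) != len(odd_field):
        raise ValueError("fields must contain the same number of lines")
    frame = []
    for even_line, odd_line in zip(even_field, odd_field):
        frame.append(even_line)
        frame.append(odd_line)
    return frame


if __name__ == "__main__":
    print(jam_fields(["line0", "line2"], ["line1", "line3"]))
```

Jamming gives a correct progressive frame only when both fields come from the same film frame, which is why the 3:2 sequence must be known.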
Neither Krause nor Faroudja suggests how to determine, without a control signal being provided, whether the interlaced video originated from a film source, such as 24 frames per second material. Moreover, neither Krause nor Faroudja suggests how to determine the sequence of the 3:2 frames without the control signal being provided.
Van der Meer et al., U.S. Pat. No. 4,933,759, disclose a motion detection system based on picture signal value comparisons between picture elements in consecutive interlaced television pictures (n-2, n-1), (n, n+1), (n+2, n+3). Motion or no motion is determined according to whether the comparison results exceed or do not exceed a threshold value. A picture element in a first field of a television picture (first field of the particular frame) is compared with a number of surrounding picture elements in a second subsequent field (second field of the particular frame). A corresponding picture element in the second subsequent field is likewise compared with a number of surrounding picture elements in the first field. Accordingly, Van der Meer et al. teach a system that compares combinations of picture elements within both the same field and between a set of six fields in order to match the fields with the lowest differences as being the repetitive field of the 3:2 pull down for synchronization. However, Van der Meer et al. fail to specifically address the issue of whether or not the interlaced video originated as film. Further, Van der Meer et al. employ a fixed threshold value for the comparisons, which may produce inaccurate results if the data contains significant amounts of noise. The noise may originate from many sources, for example, transmission noise and noise on the film recording medium.
Katznelson et al., U.S. Pat. No. 4,998,287, disclose a system for determining the synchronization of 3:2 interlaced video by using a total difference of an analog field comparison together with a threshold. The system of Katznelson et al. is also analog based, including buffers and delay lines, which is unsuitable for digital based systems. Katznelson et al. also fail to specifically address the issue of whether or not the interlaced video originated as film.
Casavant et al., U.S. Pat. No. 5,317,398, disclose a 3:2 pull down detector that includes circuitry for generating the differences between corresponding pixel values in two fields of a video signal that are separated by one field. Ignoring noise, any parts of the image which are identical in both fields result in a zero frame difference. These differences are applied to a coring circuit which excises differences having values less than a predetermined amplitude. That is, small frame differences are set to zero. The cored differences are accumulated by an accumulator which sums the magnitudes of the cored difference signal for each field. Accordingly, Casavant et al. produce a single value for each frame that indicates the degree of difference or motion between the current field and the field that occurred two fields prior. Accumulated values for respective frames are compared with a five point average to eliminate spikes between frames. The result is applied to a signal averager and to a correlation circuit. Average values from the averager are subtracted from correlation values from the correlation circuit. A film mode signal is indicated if the latter differences are greater than a predetermined value. As shown in FIG. 4 of Casavant et al., the cored difference being above a threshold value indicates that the source of the video was film based and the timing of the spikes indicates the synchronization of the frames. Unfortunately, the thresholding does not permit discrimination between film originally shot at 30 hertz and film originally shot at 24 hertz. Accordingly, additional processing is required to perform such a discrimination. Further, the system taught by Casavant et al. is sensitive to noise.
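By way of illustration only, the coring and accumulation steps described above can be sketched as follows. This covers only those two steps, not the five point average, averager, or correlation circuit, and it is not the circuitry of Casavant et al. The function name `cored_field_difference`, the parameter `core_threshold`, and the representation of a field as a list of rows of integer pixel values are assumptions made for this sketch.

```python
def cored_field_difference(current_field, field_two_back, core_threshold):
    """Sum the magnitudes of the cored pixel differences between the
    current field and the field that occurred two fields earlier.

    Assumptions: each field is a list of rows of integer pixel values,
    both fields have the same dimensions, and any difference whose
    magnitude is below `core_threshold` is set to zero (cored).
    """
    total = 0
    for current_row, earlier_row in zip(current_field, field_two_back):
        for current_pixel, earlier_pixel in zip(current_row, earlier_row):
            magnitude = abs(current_pixel - earlier_pixel)
            if magnitude >= core_threshold:
                total += magnitude
    return total


if __name__ == "__main__":
    current = [[10, 12], [50, 20]]
    earlier = [[10, 10], [20, 20]]
    # The difference of 2 is cored away; only the difference of 30 remains.
    print(cored_field_difference(current, earlier, core_threshold=5))
```

In this sketch a repeated field yields a total near zero only while noise stays below the coring amplitude, which illustrates the noise sensitivity noted above.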
De Haan et al., U.S. Pat. No. 5,365,280, disclose a method of controlling a picture signal processing mode using motion vectors to determine if the original source was film. Unfortunately, computing the motion vectors is computationally intensive and prohibitively expensive for low-cost real-time systems.
Gove et al., U.S. Pat. No. 5,398,071, disclose a film-to-video format detector for a digital television receiver. The detector receives pixel data from a current field and a second preceding field and then calculates a set of pixel difference values. The pixel difference values are added together to obtain a total field difference value. The total difference value is compared against a predetermined threshold value. These steps are repeated to obtain a series of total difference values, and the series is analyzed to determine whether it has a pattern corresponding to a film-to-video format. Unfortunately, the system taught by Gove et al. is sensitive to noise.
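By way of illustration only, one simple way to analyze a series of thresholded total difference values for a 3:2 pattern can be sketched as follows. This is a simplified assumption and not the detector of Gove et al. It relies on the observation that, in a 3:2 pull down, one out of every five comparisons between a field and the second preceding field involves a repeated field. The function name `find_pulldown_phase`, the minimum series length of ten, and the numeric values are hypothetical.

```python
def find_pulldown_phase(total_differences, threshold):
    """Return the position (0 to 4) within each group of five fields at
    which the total difference falls below `threshold`, or None if the
    series does not show exactly one low value in every five.

    Assumption: at least ten values (two full 3:2 cycles) are required
    before a pattern is accepted.
    """
    if len(total_differences) < 10:
        return None
    is_low = [value < threshold for value in total_differences]
    for phase in range(5):
        expected = [index % 5 == phase for index in range(len(is_low))]
        if is_low == expected:
            return phase
    return None


if __name__ == "__main__":
    clean = [900, 850, 3, 910, 870, 880, 920, 5, 890, 860]
    noisy = [900, 850, 3, 910, 870, 880, 920, 60, 890, 860]
    print(find_pulldown_phase(clean, threshold=50))
    print(find_pulldown_phase(noisy, threshold=50))
```

In the second series, noise raises a single repeated-field difference above the fixed threshold and the pattern is lost, which illustrates why detection against a predetermined threshold is sensitive to noise.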
What is desired, therefore, is a system for detecting and sequencing video signals derived from film having successive image frames that is insensitive to noise. Further, the system should not be computationally intensive.