The National Television System Committee (NTSC) was responsible for developing a set of standard protocols for television broadcast transmission and reception in the United States. An NTSC television or video signal is transmitted in a format called interlaced video. This format is generated by sampling only half of the lines of the image scene and then transmitting the sampled data, called a field, at a rate of approximately 60 Hertz. A field can therefore be either even or odd, referring to the even lines or the odd lines of the image scene, respectively. Two successive fields compose a frame, so NTSC video is transmitted at a rate of approximately 30 frames per second.
Motion picture film, however, is recorded at a rate of 24 frames per second. Motion picture film is often used as a source for 60 Hertz NTSC television. Therefore, a method has been developed for upsampling the motion picture film from 24 frames per second to 30 frames per second, as required by the video signal.
Referring to FIG. 1, a scheme for upsampling the 24 frame per second motion picture film to the 30 frame per second video sequence is illustrated generally by numeral 100. A first 102, second 104, third 106, and fourth 108 sequential frame of the film are represented, each having both odd 110 and even 112 lines. In order to convert the film frame rate to a video rate signal, each of the film frames is separated into odd and even fields. The first frame 102 is separated into two fields 102a and 102b. The first field 102a comprises the odd lines of the first frame 102, and the second field 102b comprises the even lines of the first frame 102. The second frame 104 is separated into three fields. The first field 104a comprises the odd lines of the second frame 104, the second field 104b comprises the even lines of the second frame 104, and the third field 104c also comprises the odd lines of the second frame 104. Therefore, the third field 104c of the second frame 104 contains redundant information.
Similarly, the third frame 106 is separated into a first field 106a comprising the even lines and a second field 106b comprising the odd lines. The fourth frame 108 is separated into three fields, wherein the first field 108a comprises the even lines of the fourth frame 108 and the second field 108b comprises the odd lines of the fourth frame 108. The third field 108c comprises the even lines of the fourth frame 108 and is, therefore, redundant.
The pattern described above is repeated for the remaining frames. Therefore, every 24 film frames yield a total of 60 fields as a result of the conversion, that is, 30 video frames, thus achieving the required video rate of 30 frames per second.
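The four-frame cadence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pulldown_3_2` and the representation of a field as a `(frame, parity)` pair are hypothetical and are not part of the described scheme.

```python
# Parity of the fields emitted for each frame position within a group of four
# film frames, following the pattern described for frames 102, 104, 106 and 108.
CADENCE = [
    ("odd", "even"),          # first frame: two fields
    ("odd", "even", "odd"),   # second frame: third field is redundant
    ("even", "odd"),          # third frame: two fields
    ("even", "odd", "even"),  # fourth frame: third field is redundant
]


def pulldown_3_2(frames):
    """Expand film frames into a list of (frame, parity) fields."""
    fields = []
    for index, frame in enumerate(frames):
        for parity in CADENCE[index % 4]:
            fields.append((frame, parity))
    return fields


one_second_of_film = list(range(24))      # 24 film frames, labelled 0..23
fields = pulldown_3_2(one_second_of_film)
print(len(fields))                        # 60 fields, i.e. 30 video frames
```

Note that the parity of the emitted fields alternates strictly between odd and even across the whole sequence, which is what an interlaced display expects.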
The insertion of the redundant data, however, can have an effect on the visual quality of the image being displayed to a viewer. Therefore, in order to improve the visual quality of the image, it is desirable to detect whether a 30 frame per second video signal is derived from a 24 frame per second motion picture film source. This situation is referred to as a video signal containing an embedded film source. Detection of the motion picture film source allows the redundant data to be removed, thereby retrieving the original 24 frame per second motion picture film. Subsequent operations such as scaling are performed on the original image once it is fully sampled. This often results in improved visual quality of the images presented to a viewer.
The upsampling algorithm described above is commonly referred to as a 3:2 conversion algorithm. An inverse 3:2 pull-down algorithm (herein referred to as the 3:2 algorithm) is the inverse of the conversion algorithm. The 3:2 algorithm is used for detecting and recovering the original 24 frame per second film sequence from the 30 frame per second video sequence, as described below.
It is common in the art to analyze the fields of the video signal as they arrive. By analyzing the relationships between adjacent fields, as well as alternating fields, it is possible to detect a pattern that will be present only if the source of the video sequence is motion picture film. For example, different fields from the same image scene will have very similar properties. Conversely, fields from different image scenes will have significantly different properties. Therefore, by comparing the features between the fields, it is possible to detect an embedded film source. Once the film source is detected, an algorithm combines the original film fields by meshing them and ignores the redundant fields. Thus, the original film image is retrieved and the quality of the image is improved.
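One simple way to picture the comparison of alternating fields is sketched below. It is a minimal illustration, not the detection method of any particular system: the function names, the sum-of-absolute-differences measure, the zero threshold and the synthetic test data are all assumptions. The sketch flags a field that duplicates the field two positions earlier (same parity) and checks whether such repeats recur every five fields, as they would after a 3:2 conversion.

```python
def field_difference(a, b):
    """Sum of absolute pixel differences between two equally sized fields."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def find_repeated_fields(fields, threshold=0):
    """Indices n where field n (nearly) duplicates field n - 2."""
    return [n for n in range(2, len(fields))
            if field_difference(fields[n], fields[n - 2]) <= threshold]


def has_3_2_cadence(fields, threshold=0):
    """Heuristic: at least two repeats, each exactly five fields apart."""
    repeats = find_repeated_fields(fields, threshold)
    return len(repeats) >= 2 and all(b - a == 5 for a, b in zip(repeats, repeats[1:]))


# --- Synthetic data, purely for illustration -------------------------------
def make_frame(k, height=4, width=4):
    """A tiny frame whose pixel values depend on the frame number k."""
    return [[k * 100 + row] * width for row in range(height)]


def split(frame, parity):
    """Odd field = lines 1, 3, ... (rows 0, 2, ...); even field = lines 2, 4, ..."""
    return frame[(0 if parity == "odd" else 1)::2]


CADENCE = [("odd", "even"), ("odd", "even", "odd"),
           ("even", "odd"), ("even", "odd", "even")]

# Eight film frames converted with the 3:2 pattern (20 fields).
film_fields = [split(make_frame(k), p) for k in range(8) for p in CADENCE[k % 4]]
# True video: every field is sampled from a different image scene.
video_fields = [split(make_frame(n), "odd" if n % 2 == 0 else "even") for n in range(20)]

print(find_repeated_fields(film_fields))  # [4, 9, 14, 19]
print(has_3_2_cadence(film_fields))       # True
print(has_3_2_cadence(video_fields))      # False
```

With real video, noise means a repeated field is never bit-identical, so the threshold would have to be chosen well above zero. A static scene would also produce small differences everywhere, which is one reason a practical detector needs more than this single test.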
A similar process is used for PAL/SECAM conversions. PAL/SECAM video sequences operate at a frequency of 50 Hz, or 25 frames per second. A 2:2 conversion algorithm, which is known in the art, is used for upsampling the film to PAL/SECAM video sequence rates. An inverse 2:2 pull-down algorithm (herein referred to as the 2:2 algorithm) is used for retrieving original film frames in a fashion similar to that described for the 3:2 algorithm. PAL Telecine A and PAL Telecine B are two standard PAL upsampling techniques.
PAL Telecine A does not insert repeated fields into the sequence during the transfer from film frame rate to video frame rate. Thus, 24 frames become 48 fields after the Telecine A process. Because this is two fields fewer than the 50 fields per second required by the video rate, the playback speed increases by approximately 4% (2 fields missing out of the required 50 fields). In order to transfer PAL Film to PAL Video without the 4% speedup, a process called Telecine B is used. Telecine B inserts a repeated field into the sequence every ½ second (i.e. every 25th field). Inclusion of a repeated field produces a sequence that plays back without speedup for a 25 frames per second video rate.
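The field counts for the two PAL techniques can be checked with a short sketch. The function names and the `(frame, parity)` representation are hypothetical. The choice of which field to repeat in `telecine_b` is also an assumption: the text above states only that every 25th field is a repeat, and the sketch repeats the first field of every 12th film frame so that field parity keeps alternating.

```python
def telecine_a(frames):
    """Two fields per film frame, no repeated fields."""
    return [(frame, parity) for frame in frames for parity in ("odd", "even")]


def telecine_b(frames):
    """Two fields per film frame, plus one repeated field after every 12th frame.

    Assumption: the repeated field is the first field of that 12th frame, and
    the following frames start on the opposite parity.
    """
    fields = []
    first = "odd"
    for count, frame in enumerate(frames, start=1):
        second = "even" if first == "odd" else "odd"
        fields.append((frame, first))
        fields.append((frame, second))
        if count % 12 == 0:
            fields.append((frame, first))  # the 25th field of each half second
            first = second                 # keep odd/even alternation intact
    return fields


one_second_of_film = list(range(24))
fields_a = telecine_a(one_second_of_film)
fields_b = telecine_b(one_second_of_film)

print(len(fields_a))             # 48 fields, two short of the required 50
print((50 - len(fields_a)) / 50) # 0.04, the 4% figure quoted above
print(len(fields_b))             # 50 fields, so no speedup at 25 frames per second
```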
However, the film detection algorithms described above are subject to problems. Static objects such as subtitles and other icons may be inserted at a video rate after the film has been converted to video. These objects typically cause the film detection algorithm to fail, so that the series of contiguous image scenes, that is, contiguous frames of film, cannot be properly recovered. The result of these problems is the display of original film images as though they were from a true video source. It is, therefore, an object of the present invention to obviate or mitigate the above-mentioned disadvantages and provide a system and method for improving the detection of film in a video sequence.