1. Field of the Invention
The present invention relates to a source detection method, and particularly to a method for 3:2 pull-down film source detection taking into account the difference between two fields and the interlaced frame information.
2. Description of the Related Art
In practice, motion picture sequences can be loosely classified into film (or movie) sources and video sources. The frame rate of a film source is 24 frames per second, while the frame rate of an NTSC video source is 30 frames (or 60 fields) per second. Therefore, to show a film on an NTSC TV system, the film frame rate has to be converted from 24 to 30 frames per second. This frame rate conversion is often called the 3:2 (or 2:3) pull-down process.
FIG. 1 shows one example of the 3:2 pull-down process applied to a film segment of four frames (A, B, C and D). In this case, each original film frame is separated into either three or two interlaced fields. That is, frame A and frame C are each separated into three fields by duplicating one of their fields, while frame B and frame D are each separated into only two fields. The frame rate is therefore converted from 24 frames per second for the original film sequence to 60 fields per second for the interlaced field sequence, and then to 30 frames per second for the interlaced frame sequence.
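The field expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pulldown_32` and `weave` are hypothetical, the first field of each frame is assumed to be the one that is duplicated, and the top/bottom ordering is assumed because FIG. 1 is not reproduced here.

```python
def pulldown_32(film_frames):
    """Expand film frames into interlaced fields: 3 fields, 2 fields, 3, 2, ...

    Each field is a (frame_label, parity) pair. Parity alternates strictly
    between top ('t') and bottom ('b'), as in an interlaced field sequence.
    """
    fields = []
    parity = 0  # 0 = top field, 1 = bottom field (assumed starting parity)
    for index, frame in enumerate(film_frames):
        count = 3 if index % 2 == 0 else 2  # frames A, C get 3 fields; B, D get 2
        for _ in range(count):
            fields.append((frame, "tb"[parity]))
            parity ^= 1
    return fields


def weave(fields):
    """Pair consecutive fields into interlaced frames."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


fields = pulldown_32("ABCD")
frames = weave(fields)
# Label each interlaced frame by the film frame(s) it was built from.
labels = ["".join(sorted({first[0], second[0]})) for first, second in frames]

print(len(fields), "fields ->", len(frames), "interlaced frames:", labels)
```

Under these assumptions, four film frames become ten fields and five interlaced frames, which is the 24 to 60 to 30 ratio stated above, and two of the five interlaced frames mix fields from different film frames.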
Although the 3:2 pull-down film sequence is suitable for playback on NTSC interlaced TVs, annoying comb artifacts appear in the interlaced frames merged from different film frames, such as frame ‘A+B’ and frame ‘B+C’ in FIG. 1, when the sequence is played on progressive TVs or computer monitors. To remove the comb artifacts, it is important to detect the 3:2 pull-down film sequence and to apply an inverse process, called the inverse telecine process, to recover the original film frames. It is therefore important to distinguish the 3:2 pull-down film sequence from the interlaced video sequence for use in different devices.
Further, the 3:2 pull-down film sequence has a unique signature due to the duplication of interlaced fields. The signature is illustrated in FIG. 2 and explained as follows. The fragment of the interlaced field sequence shown contains a 3:2 pull-down film source. If successive interlaced fields of the same type (i.e., top or bottom) are compared, the comparison result is ‘10000100001 . . . ’, where 1 represents a match and 0 represents no match. Hence, the 3:2 pull-down film sequence can be distinguished from the interlaced video sequence by alternately comparing the fields of the same type and seeking the signature ‘10000100001 . . . ’ in the sequence.
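As a rough illustration of where this signature comes from, the following Python sketch compares each field with the field two positions later, which is the next field of the same type. It is an idealized model, not the method of FIG. 2: fields are represented only by the label of their source film frame, any two different film frames are assumed to differ, and the function names are hypothetical.

```python
def pulldown_fields(film_frames):
    """Label each interlaced field with its source film frame (3, 2, 3, 2, ...)."""
    fields = []
    for index, frame in enumerate(film_frames):
        fields.extend([frame] * (3 if index % 2 == 0 else 2))
    return fields


def same_type_signature(fields):
    """Compare field n with field n + 2; emit '1' for a match, '0' otherwise.

    Because field parity alternates, fields n and n + 2 are always of the
    same type, so equal labels here stand for a duplicated field.
    """
    return "".join(
        "1" if fields[n] == fields[n + 2] else "0"
        for n in range(len(fields) - 2)
    )


signature = same_type_signature(pulldown_fields("ABCDEFGH"))
print(signature)
```

In this model a match occurs once every five fields, at the first of the three fields taken from frames A, C, E and G, which yields the repeating ‘10000’ pattern.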
FIG. 3 shows a conventional method for 3:2 pull-down film source detection, which is described below with reference to FIGS. 2 and 3. First, the field index n, the MatchCounter and the ModeCounter are set to 0 (S301). The MatchCounter records how many times the 3:2 pull-down signature ‘10000’ has been detected, and the ModeCounter is used as an indicator to signal whether the 3:2 pull-down signature is correct. These two counters are the key indicators for 3:2 pull-down film source detection and are explained in more detail below.
Second, in step S302, two fields of the same type are received and compared to determine whether they are identical due to duplication. The comparison is performed by calculating the field difference FieldDiff (S303), given by the sum of absolute differences as
$$\mathrm{FieldDiff} = \sum_{y=0}^{M-1} \sum_{x=0}^{N-1} \left| F(x, y, n) - F(x, y, n+2) \right|,$$
where M and N are the field height and width, respectively. If the FieldDiff is below a threshold Fi_th (yes in step S304), the two fields are recognized as a match, the MatchCounter is incremented by 1, and the ModeCounter is cleared to 0 to indicate the beginning of the signature ‘10000’ (S305). Otherwise, if the FieldDiff is larger than the threshold Fi_th (no in step S304), the two fields are recognized as not matched and the ModeCounter is incremented by 1 (S306).
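The field comparison of steps S303 and S304 can be sketched as follows. This is a minimal illustration, assuming each field is given as a list of M rows of N integer pixel values; the function names, the sample pixel values and the threshold value `FI_TH = 8` are hypothetical and chosen only for the example.

```python
def field_diff(field_a, field_b):
    """Sum of absolute differences between two fields of equal size.

    field_a stands for F(x, y, n) and field_b for F(x, y, n + 2).
    """
    if len(field_a) != len(field_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(field_a, field_b)
    ):
        raise ValueError("fields must have the same dimensions")
    return sum(
        abs(a - b)
        for row_a, row_b in zip(field_a, field_b)
        for a, b in zip(row_a, row_b)
    )


FI_TH = 8  # hypothetical threshold, not a value taken from the text


def is_match(diff, threshold=FI_TH):
    """S304: the two fields match when FieldDiff is below the threshold."""
    return diff < threshold


field_n = [[10, 20, 30], [40, 50, 60]]
field_dup = [[10, 22, 30], [39, 50, 60]]  # near duplicate, e.g. after compression
field_new = [[90, 20, 35], [40, 10, 60]]  # field from a different film frame

diff_dup = field_diff(field_n, field_dup)
diff_new = field_diff(field_n, field_new)
print(diff_dup, is_match(diff_dup))
print(diff_new, is_match(diff_new))
```

The near duplicate still produces a nonzero FieldDiff, so the outcome of the comparison depends entirely on how the threshold is chosen.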
Then, the mode of the source sequence is determined based on the values of the MatchCounter and ModeCounter as illustrated in FIG. 2. If the MatchCounter is larger than 1 and the ModeCounter is equal to 0 (yes in step S307), the flag FilmMode is set to 1 to indicate that a 3:2 pull-down film sequence has been detected (S308). Otherwise, the value of the ModeCounter is used to determine if the sequence follows the signature ‘10000’. If the ModeCounter is smaller than or equal to 4 (no in step S309), the flag FilmMode is not changed (S310). However, if the ModeCounter is larger than 4, indicating that the sequence no longer follows the 3:2 pull-down signature ‘10000’ (yes in step S309), the flag FilmMode is set to 0 to indicate that the sequence is not a 3:2 pull-down film sequence (S311).
To prevent overflow, the ModeCounter is set to max_count (S313) if the ModeCounter exceeds the predetermined value max_count (yes in step S312). This process repeats along the input sequence to dynamically monitor the 3:2 pull-down signature (S315 and return to S302) until the sequence is finished (yes in step S314).
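The counter logic of steps S301 to S315 can be summarized in a short Python sketch. It is written from the description above rather than from FIG. 3 itself, and it makes several assumptions that the text does not state: FilmMode starts at 0, a FieldDiff exactly equal to Fi_th counts as no match, and `max_count` defaults to 255. The function name and the sample FieldDiff values are hypothetical.

```python
def detect_film_mode(field_diffs, fi_th, max_count=255):
    """Return the FilmMode flag after each field comparison."""
    match_counter = 0  # S301: counters start at 0
    mode_counter = 0
    film_mode = 0      # assumed initial value
    history = []
    for diff in field_diffs:                  # S302/S303: next FieldDiff
        if diff < fi_th:                      # S304: fields match
            match_counter += 1                # S305
            mode_counter = 0
        else:
            mode_counter += 1                 # S306
        if match_counter > 1 and mode_counter == 0:   # S307
            film_mode = 1                     # S308: film sequence detected
        elif mode_counter > 4:                # S309
            film_mode = 0                     # S311: signature lost
        # otherwise FilmMode is left unchanged (S310)
        if mode_counter > max_count:          # S312
            mode_counter = max_count          # S313: prevent overflow
        history.append(film_mode)
    return history


def diffs_from_bits(bits, low=0, high=100):
    """Turn a match pattern such as '10000' into illustrative FieldDiff values."""
    return [low if bit == "1" else high for bit in bits]


# Three signature periods followed by a run in which no field repeats.
history = detect_film_mode(diffs_from_bits("1000010000100000"), fi_th=10)
print(history)
```

In this run FilmMode is set at the second match and is cleared only at the fifth consecutive non-match, which shows that the flag lags any break in the signature by several fields.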
However, the conventional method has two drawbacks. First, due to information loss caused by digital video compression and digital video processing, the difference between the duplicated fields may exceed the difference threshold Fi_th. Therefore, determining a match by employing only the FieldDiff and the threshold Fi_th is not accurate.
Second, the detection of bad editing points is not effective. For instance, in FIG. 4, the 3:2 pull-down sequence contains good and bad editing points. Since the good editing point follows the 3:2 pull-down order, the signature ‘10000’ is maintained. Conversely, when the bad editing point breaks the 3:2 pull-down order, the signature no longer follows ‘10000’. However, such a bad editing point will not be detected until the ModeCounter exceeds 4. Consequently, the output film frames between the bad editing point and the point of detection will be wrongly reconstructed by the inverse telecine process. That is, two frames will be reconstructed by merging fields ‘H’ and ‘I’, leading to significant reconstruction errors.