1. Field of the Invention
The present invention is related to processing image sequences, and in particular, to methods and systems for converting an image sequence intended to be displayed at a first frame rate to an image sequence intended to be displayed at a second frame rate.
2. Background
As is well known, motion picture film is typically exposed and viewed at 24 film frames per second (fps). By contrast, NTSC video, which applies to television, is typically recorded and played back at 29.97 video fps. The selection of 29.97 fps for video traces back to the frequency of electricity in the United States, which is nominally 60 Hertz (Hz) or cycles per second; the field rate was reduced slightly, to approximately 59.94 Hz, when color was added to the NTSC standard. Video typically includes two fields per frame, and therefore, there are typically 59.94 fields per second.
For television, the NTSC color video standard specifies that 525 lines of information are scanned at a rate of 29.97 fps. Each field therefore scans 262.5 horizontal lines. However, typically only approximately 480 lines per frame, or 240 lines per field, are active or illuminated and contain actual picture information. The two fields of a video frame are often referred to as being “interlaced.” The lines of information from the two fields of a respective frame interlace, i.e., alternate, to produce the frame. Thus, one field can contain the odd lines of a frame and the other field can contain the even lines of a frame. The two fields are also respectively referred to as “odd” and “even” fields. In addition, the NTSC video standard is not always used. Many users use proprietary standards that are similar to the NTSC video standard. For example, where a frame is encoded by only one field, the resulting video sequence can include frames with 240 lines of resolution at 60 frames per second or 240 lines of resolution at 30 frames per second.
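The relationship between a frame and its two interlaced fields can be illustrated with a minimal Python sketch. The function names `split_fields` and `weave` are hypothetical, and the 1-based line numbering (line 1 belongs to the odd field) is an assumption made only for illustration; lines are represented by their line numbers rather than by pixel data.

```python
def split_fields(frame_lines):
    """Split a frame into its odd and even fields.

    Assumes 1-based line numbering, so list index 0 is line 1 (odd).
    """
    odd_field = frame_lines[0::2]
    even_field = frame_lines[1::2]
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interlace two fields line by line to rebuild the full frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    return frame


# 480 active lines per frame, as described above.
active_frame = list(range(1, 481))
odd_field, even_field = split_fields(active_frame)
rebuilt = weave(odd_field, even_field)

# Scan lines per field for a 525-line frame split across two fields.
lines_per_field = 525 / 2
```

Each field here carries 240 active lines, and weaving the two fields back together reproduces the original 480-line frame.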
It is a common practice in the movie and television industry to convert from the film format to the NTSC video format so that filmed works can be broadcast and displayed on a television set. Clips of filmed work are also often transferred to a video format, such as the NTSC video format, because video formats are convenient to store and view. Such a conversion is known as a “telecine” process, which typically converts 24 fps film to 30 fps video (in addition to the resizing or letterboxing to accommodate the difference in screen aspect ratio).
To convert 24 fps of film to 30 fps of NTSC video, duplicate or repeated fields are inserted to “pad” the 24 fps to 30 fps. The first film frame is converted into 2 video fields (1 even field and 1 odd field). The second film frame is converted into 3 video fields (2 even fields and 1 odd field), with two of the video fields being the same. The third film frame is converted into 2 video fields, the fourth film frame is converted into 3 video fields, again with two of the video fields being the same, and so on. Thus, the video field to film frame pattern is “2, 3, 2, 3,” where an extra video field is inserted for every other film frame. As a result, 4 frames of film convert to 10 video fields, that is, 5 corresponding frames of video. This is referred to as a “three-two (3:2) pull down.” To return the 30 fps of video to the original 24 fps of film, a reverse process, termed inverse telecine, is performed, where 5 frames of video convert to 4 corresponding frames of film. Prior methods rely extensively on manual intervention to perform the inverse telecine process.
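The 2, 3, 2, 3 field pattern can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, film frames are represented by labels such as "A" and "B", and the inverse step relies on those labels to recognize repeated fields, which real video does not carry.

```python
from itertools import cycle


def pulldown_fields(film_frames, cadence=(2, 3)):
    """Expand film frames into a flat list of (film_frame, parity) fields.

    Parity alternates even, odd, even, ... across the whole field stream.
    """
    fields = []
    for frame, count in zip(film_frames, cycle(cadence)):
        for _ in range(count):
            parity = "even" if len(fields) % 2 == 0 else "odd"
            fields.append((frame, parity))
    return fields


def pair_into_video_frames(fields):
    """Group consecutive fields two at a time into video frames."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


def inverse_pulldown(fields):
    """Recover film frames by collapsing runs of fields from one source frame.

    Assumes adjacent film frames carry distinct labels.
    """
    film = []
    for frame, _parity in fields:
        if not film or film[-1] != frame:
            film.append(frame)
    return film


film = ["A", "B", "C", "D"]
fields = pulldown_fields(film)
video_frames = pair_into_video_frames(fields)
recovered = inverse_pulldown(fields)
```

With these four film frames, the field stream is A, A, B, B, B, C, C, D, D, D, which pairs into five video frames. Two of those video frames (B/C and C/D) mix fields from different film frames.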
One significant difficulty encountered in performing inverse telecine is handling edits, slow motion, special effects sequences, or other special cases in which the 2, 3, 2, 3 pattern is interrupted. For example, because of an edit or abort during final assembly, the 2, 3, 2, 3 pattern may be interrupted in the middle and restarted as follows: 2, 3, 2 [edit] 2, 3, 2, 3. To correctly convert this pattern back to the original film pattern, a user conventionally locates the pattern break and resynchronizes the sequence by manually deleting one or more fields. This is a time-consuming and expensive process, and in particular, it makes it difficult to perform the inverse telecine process accurately on a large number of video clips in a short period of time.
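The interrupted pattern in the example above can be made concrete with a minimal Python sketch. It assumes the number of fields contributed by each film frame is already known, which is exactly what is hard to obtain in practice; the function name `find_cadence_breaks` is hypothetical, and the sketch does not represent any particular prior or claimed method.

```python
def find_cadence_breaks(run_lengths):
    """Return indices where a run fails to alternate 2, 3 with its predecessor."""
    breaks = []
    for i in range(1, len(run_lengths)):
        previous, current = run_lengths[i - 1], run_lengths[i]
        # A valid neighboring pair is one run of 2 fields and one run of 3.
        if {previous, current} != {2, 3}:
            breaks.append(i)
    return breaks


# The example sequence 2, 3, 2 [edit] 2, 3, 2, 3.
runs = [2, 3, 2, 2, 3, 2, 3]
breaks = find_cadence_breaks(runs)
print(breaks)  # [3]
```

The single reported index marks the first run after the edit, where a run of 2 fields follows another run of 2 fields instead of a run of 3.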
Because of the difficulties encountered in performing the inverse telecine process, the video format is often retained when displaying a clip on a computer. However, the video format can be wasteful because the duplicate frames needlessly occupy bandwidth. Further, the display of duplicate frames causes motion in the clip to transition in a jerky or erratic manner. In addition, where video fields are interlaced, the interlacing of fields based on film frames from different times can produce artifacts, which are visible on a progressively scanned monitor, such as a computer video monitor.