Completed video and film programs are generally composed of segments from several sources. The programs are typically assembled by an editor who views the sources of material that are available and chooses the segments that will make up the final program. However, the program is not usually edited at the same place or time as the physical production of the final video tape or film; instead, the final production occurs at a facility equipped to perform the high-quality editing that is required for the final product. Therefore, the original editor of the program must generate a set of editing instructions to be used in the construction of the final program, a process that is commonly automated using computer technology.
A set of editing instructions for video or film programs is often produced in a format called an edit decision list (EDL). A conventional EDL consists of a sequence of editing instructions, each of which is a computer instruction for a computerized edit controller that assembles a final program from source material. An editing instruction represents an event description, where each event is a transition to a new program segment. There are a number of available EDL formats, but each conveys similar information. The event description contains such information as the source of the new program segment, the time codes describing both the portion of the source that will be recorded and its destination in the final program, and the type of edited transition that is to be used from the previous segment. Using the information represented by the editing instructions, the final program can be automatically constructed from the several sources of program material.
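The categories of information carried by an event description can be sketched in code. The field names and layout below are hypothetical (real EDL formats such as CMX 3600 arrange this information differently), but each event records the same essentials: the source, the source and record timecode ranges, and the transition type.

```python
from dataclasses import dataclass

@dataclass
class EditEvent:
    event_number: int   # position of this transition in the program
    source_reel: str    # which source tape or file supplies the new segment
    transition: str     # e.g. "C" (cut), "D" (dissolve), "W" (wipe)
    source_in: str      # timecode where the segment starts in the source
    source_out: str     # timecode where the segment ends in the source
    record_in: str      # destination timecode in the final program
    record_out: str     # destination timecode where the segment ends

def parse_timecode(tc: str, fps: int) -> int:
    """Convert an HH:MM:SS:FF timecode string into an absolute frame count."""
    hh, mm, ss, ff = (int(p) for p in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

event = EditEvent(1, "REEL01", "C",
                  "01:00:10:00", "01:00:14:12",
                  "00:00:00:00", "00:00:04:12")

# Segment duration in frames, assuming 30 fps NTSC timing:
duration = (parse_timecode(event.source_out, 30)
            - parse_timecode(event.source_in, 30))  # 132 frames
```

Note that the frame count in a timecode is only meaningful relative to a frame rate, which is precisely the complication the mixed-format EDLs discussed below must address.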
Because an editor may choose an output media format different from the input format, the EDL must take these format changes into account to precisely identify the transition to a new segment. For example, film is shot at 24 fps (frames per second) progressive; NTSC video is recorded at 30 fps and includes two interlaced fields that represent a single frame. It is to be understood that the use of the term “NTSC video” throughout this description means a video signal, either analog or digital, that adheres to NTSC timing. The PAL format for video, which is extensively used in Europe, is recorded at 25 fps interlaced. Similarly, it is also to be understood that the use of the term “PAL video” throughout this description means a video signal, either analog or digital, that adheres to PAL timing. If film is to be transferred to an NTSC video format, more frames must be added to satisfy the higher frame rate of NTSC video. The addition of these frames is performed in a well-known manner described as “pulldown”. Each frame from the film is converted to either two or three fields of video data; consequently, 24 frames of film are converted to 30 frames of NTSC video having 60 fields in the aggregate, or to 25 frames of PAL video having 50 fields. While pulldown resolves the difference in frame rate between film and video, the process complicates the identification of the edit points within a work.
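The NTSC arithmetic above can be verified with a short sketch. Assuming the common 2:3 cadence in which film frames alternately contribute two and three fields, 24 film frames yield exactly 60 fields, i.e. 30 interlaced NTSC frames:

```python
def pulldown_fields(film_frames: int) -> int:
    """Number of NTSC fields produced by 2:3 pulldown of `film_frames` film
    frames, assuming even-indexed frames contribute 2 fields and odd-indexed
    frames contribute 3 (one common cadence)."""
    return 2 * film_frames + film_frames // 2

fields = pulldown_fields(24)  # 60 fields from one second of film
frames = fields // 2          # 30 NTSC frames (two interlaced fields each)
```

Because a single film frame may span two or three video fields, a film edit point no longer maps one-to-one onto a video frame boundary, which is the complication noted above.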
Recently, editing of film at 24 fps progressive has become the approach of choice for professional film editors. The film is digitized, digitally edited on a Digital Non-Linear Editing (DNLE) system in a 24-frame progressive format, and converted to video only after the editor requests video as the desired output format.
In conventional systems, the output videotape format is locked to the standard of the source material. In other words, the frames specified in the EDL are in the same format as the source material: all the frames adhere to NTSC or PAL frame timing. Conventional systems do not permit the mixing of video formats in the same EDL. This limitation clearly restricts the source material available for inclusion in the work, and at best requires that the format of the source be converted to the output format.
It would thus be desirable for the final version of a video production, after editing, to permit a combination of scenes recorded in different formats. The completed video work is thus a combination of different sources transferred to video. This combination includes, but is not limited to, NTSC video, PAL video, and film converted to NTSC or PAL video by the telecine process.
Often the completed work is digitally encoded using a well-known standard such as MPEG-2 to compress the data for efficient storage or transmission of the work. Because the quality of the compression is dependent upon the original source of the video (i.e., whether the frames were converted from a film source by a telecine or recorded in native video), it is advantageous to identify the original source of each frame and field within the video. Knowing this information, the compression algorithm can more efficiently encode a video frame that was originally recorded on film and converted to video, by eliminating any redundant fields. Therefore it would be advantageous to provide a mechanism to identify the original source material associated with the output video for better compression.
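The savings from eliminating redundant fields can be illustrated with a minimal sketch, assuming the 2:3 cadence described earlier: every group of four film frames produces ten video fields, of which only eight (four frames times two fields) are unique, so the remaining two are repeats an encoder can discard.

```python
def unique_fields(total_fields: int) -> int:
    """Unique fields within NTSC video produced by 2:3 pulldown.
    Each group of 10 fields carries 4 film frames (8 unique fields)
    plus 2 repeated fields. Sketch handles whole pulldown groups only."""
    assert total_fields % 10 == 0, "sketch handles whole 10-field groups only"
    return (total_fields // 10) * 8

# One second of NTSC video (60 fields) derived from film:
saved = 60 - unique_fields(60)  # 12 of the 60 fields are redundant
```

A cadence-aware encoder that knows which fields originated from film can thus encode roughly 20% fewer fields for telecined material, which is why identifying the original source of each frame and field matters.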
The combination of video segments derived from both native video and film also presents other instances in which it would be beneficial to know the pulldown sequence of the video and also the editing points for the video. For example, color correction may be required to provide consistency of color across the different video clips. Therefore, it would be advantageous to know the scene changes in the film and the corresponding video fields in order to identify the fields to correct.
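Locating the video fields that correspond to a film scene change can be sketched as follows. Assuming the same 2:3 cadence as above (even-indexed film frames span two fields, odd-indexed span three), the first field carrying a given film frame, and the set of fields a color correction would need to touch, can be computed directly; the function names are hypothetical.

```python
def field_offset(film_frame: int) -> int:
    """Index of the first video field carrying `film_frame` under a
    2:3 cadence: fields for frames 0..n-1 total 2n + n//2."""
    return 2 * film_frame + film_frame // 2

def fields_for_frame(film_frame: int) -> range:
    """All video field indices derived from `film_frame`."""
    start = field_offset(film_frame)
    count = 2 if film_frame % 2 == 0 else 3
    return range(start, start + count)

# A scene change at film frame 12 begins at video field 30 and
# occupies fields 30 and 31:
scene_fields = list(fields_for_frame(12))
```

With this mapping, a color corrector given a scene change expressed in film frames can identify exactly which video fields to adjust, including the repeated fields introduced by pulldown.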