This invention is related to a method and apparatus for defining special effects in a non-linear video editing system. More particularly, special effects can be defined through a graphical user interface on a field-by-field basis, that is, within a video frame.
A non-linear video editing system, such as the Media Composer® video editing system available from Avid Technology, Inc. of Tewksbury, Massachusetts, stores video and audio data as files on storage media such as a hard disk drive. A non-linear editing system permits an editor to define a video program, called a composition, as a series of segments of these files called clips. Each clip is labeled, and information related to it, e.g., length of clip, number of frames, time, etc., is also accessible. Since each clip is stored separately, it can be retrieved directly without having to pass through other clips. Clips can be retrieved and played back in any order. As a result, the clips can be sequenced in any order and this order can be changed any number of times as controlled by a user interface on the Media Composer® system. When a final composition is defined, the entire composition can be saved to a file for further processing or transmission. Digital Filmmaking, The Changing Art and Craft of Making Motion Pictures by Thomas Ohanian and Michael Phillips, ©1996, includes additional descriptions of these editing systems and is hereby incorporated by reference.
When special effects are to be introduced within the sequence of clips, an editor will modify an image by either removing a portion or specifying where an effect is to be placed. The image can be modified in a number of known ways such as pixel-by-pixel intraframe manipulation or other “painting” programs. Often, painting or pixel manipulation is insufficient to provide the results that are needed. The shortcomings of these known programs are due to the manipulation of images on a frame-by-frame basis when it is necessary to pay attention to the next level of detail, i.e., the field data.
In a video format, images are typically captured in a series of video fields. These video fields are made up of hundreds of horizontal scan lines that are essentially “slices” of the image in the video field. Each scan line is made up of a plurality of pixels. The raw video data that forms the pixels is referred to as YUV data. Each pixel has varying YUV values that can be converted into varying red, green and blue (RGB) level values to determine the color of the pixel. In order to conserve bandwidth in the playback of the video images, consecutive fields are interlaced to make one composite video frame from two consecutive video fields. Interlacing is done by vertically alternating horizontal scan lines from each consecutive field to form one video frame. In the NTSC video format, video images are captured at 60 fields per second. Interlacing two consecutive fields results in video that is transmitted at 30 frames per second. There are other video formats that have different scan rates, such as PAL, which has a scan rate of 50 fields per second or 25 frames per second.
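By way of illustration only, the conversion of one pixel from YUV values to RGB level values can be sketched as follows. The text above does not state which conversion is used; the coefficients below are an assumption (the widely used full-range ITU-R BT.601 form with 8-bit samples and a chroma offset of 128), and the function name `yuv_to_rgb` is hypothetical.

```python
def yuv_to_rgb(y, u, v):
    """Convert one 8-bit YUV pixel to an (R, G, B) tuple.

    Assumption: full-range BT.601 coefficients, chroma centered at 128.
    Other video standards use different coefficients and ranges.
    """
    def clamp(value):
        # Round to the nearest integer and limit to the 8-bit range.
        return max(0, min(255, int(round(value))))

    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp(r), clamp(g), clamp(b)
```

With neutral chroma (U = V = 128), the three color levels equal the luma value, so a pixel with no color information maps to a shade of gray.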
Video field interlacing is schematically shown in FIGS. 1A and 1B. FIG. 1A shows two consecutive video fields A1, A2. Each of the video fields A1 and A2 consists of hundreds of horizontal scan lines that make up an image. In FIG. 1A, the scan lines that make up field A1 are labeled AL1, and the scan lines that make up field A2 are labeled AL2. FIG. 1B schematically shows how fields A1 and A2 are interlaced to form video frame A12. As shown in the figure, video frame A12 comprises the scan lines AL1 and AL2 in an alternating fashion from the top of the frame to the bottom of the frame. This interlacing of video fields A1 and A2 results in the video transmission rate of approximately 30 frames per second.
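The interlacing of fields A1 and A2 into frame A12 can be sketched as follows. This is a minimal illustration, assuming each field is represented as a list of scan lines and that field A1 supplies the top-most line of the frame; the function names `interlace` and `split_fields` are hypothetical.

```python
def interlace(field_a1, field_a2):
    """Weave two fields into one frame by alternating their scan lines.

    Assumption: field_a1 supplies the top-most line of the frame.
    """
    if len(field_a1) != len(field_a2):
        raise ValueError("fields must have the same number of scan lines")
    frame = []
    for line_a1, line_a2 in zip(field_a1, field_a2):
        frame.append(line_a1)  # line from the first field
        frame.append(line_a2)  # line from the second field
    return frame


def split_fields(frame):
    """Recover the two fields from an interlaced frame."""
    return frame[0::2], frame[1::2]
```

For example, `interlace(["AL1-0", "AL1-1"], ["AL2-0", "AL2-1"])` yields the four lines in the order AL1-0, AL2-0, AL1-1, AL2-1, and `split_fields` applied to that frame returns the two original fields.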
Conventionally, once the video fields are interlaced, editing is performed either by displaying both fields, or by displaying just one field and doubling the scan lines to fill the frame.
Displaying both fields presents a problem because there is a slight timing offset between each field. Therefore, when the fields are interlaced to form the video frame, the image may be somewhat choppy or blurred due to the difference in time between the images in each field. During the editing process, objects that are moving in an image of the video field cannot be accurately outlined, because moving objects will be displayed in two separate locations in the frame, one location for each video field.
Doubling the scan lines of a field is done by a process called “scan doubling.” In the process of scan doubling, each scan line in the video field is doubled, in order to fill the entire frame. The doubled field is then edited. However, since there is a difference in time between each video field, scan doubling tends to display data in the video frame that may be false or misleading, since it is compensating for the time offset between each video field. Scan doubling causes half of the spatial information of the frame to be lost because it is contained in the video field that is not shown in the scan doubled frame. Scan doubling is particularly problematic when the fields contain still or slow moving objects, since the information that is lost is still accurate information. This loss of data tends to make editing of the fields difficult, inaccurate and time consuming. Additionally, there is still some “jitter” when moving from field to field since the “top-most” line of the frame is only found in one field.
It is also possible that there is “motion” within a frame, that is, an image's motion is not consistent from the first field in the frame to the second field. Special effects editors need access to the second field in order to accurately place the effect or to accurately modify the image. Typically, non-linear video editing systems do not display the second field of a frame; only the first field is displayed, with scan doubling. Since only the first field is being viewed, any differences between the first and second fields will not be accounted for and the effect will not be accurate.
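Conventional scan doubling, as described above, can be sketched as follows. This is a minimal illustration, assuming a field is represented as a list of scan lines; the function name `scan_double` is hypothetical.

```python
def scan_double(field):
    """Fill a full frame from one field by showing each scan line twice."""
    frame = []
    for line in field:
        frame.extend([line, line])  # the same line occupies two frame rows
    return frame
```

The resulting frame has the full height, but every row comes from the one displayed field, so the scan lines of the other field do not appear in it at all.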
Thus, any special effects that are necessary must be implemented on a separate effects system, the modified clip created and then input to the non-linear editing system. If changes are necessary, the editor must return to the separate effects system, modify the clip and return it to the non-linear editing system. This is a time-consuming and, therefore, inefficient process that hampers the creativity of the editor.
Additionally, in a non-linear editing system, as described above, clips are sequenced together to create a final composition. In some instances, the source for a clip might be film which has been converted from its scan rate of 24 frames per second to that of NTSC video, 30 frames per second. It is possible, therefore, that an inter-frame scene change could occur. In other words, the first field of the frame may be unrelated to the second field. If this were to happen in a non-linear editing system that only displayed the first field of a frame, then this “discontinuity” would be unseen and could possibly affect any post-processing done to the video image.
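How a film-to-video conversion can place unrelated images in the two fields of one frame can be sketched as follows. The text above does not say which conversion method is used; the sketch assumes the common 2:3 pulldown cadence, in which successive film frames are held for two fields and three fields alternately, so that four film frames become ten fields, i.e., five video frames. The function name `pulldown_2_3` is hypothetical.

```python
def pulldown_2_3(film_frames):
    """Map film frames to video frames, each a (first_field, second_field) pair.

    Assumption: 2:3 pulldown, i.e. film frames are held alternately
    for two fields and three fields.
    """
    fields = []
    for index, film_frame in enumerate(film_frames):
        repeat = 2 if index % 2 == 0 else 3
        fields.extend([film_frame] * repeat)
    # Pair consecutive fields into video frames; a trailing unpaired
    # field, if any, is dropped.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


video_frames = pulldown_2_3(["A", "B", "C", "D"])
# Video frames whose two fields come from different film frames.
mixed_frames = [pair for pair in video_frames if pair[0] != pair[1]]
```

Under this assumption, two of every five video frames combine fields from two different film frames. If a scene change falls between those film frames, the second field of the video frame is unrelated to the first, and a system that displays only the first field does not reveal it.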
Accordingly, there is a need for a non-linear video editing system that provides editing and viewing capabilities on a field by field basis. This would allow for special effects editing on a field, thus providing more detailed results. Additionally, a non-linear video editing system is needed that has the capability to edit either field of a frame.
A non-linear video editing system, with the capacity to display and modify video frames on a field-by-field basis, has the capability to implement special effects on a field-by-field basis, thus increasing the accuracy of the effect. A user may incrementally step through footage, so that every field can be seen. A time code display indicates which field the user is viewing or manipulating. A keyframe can be attached to a field and interpolation can be performed between fields marked with a keyframe.
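Field-level addressing and interpolation between keyframed fields can be sketched as follows. This is an illustration under stated assumptions: fields are numbered consecutively from zero, two fields per frame; an effect parameter is a single number; and the interpolation is linear, which the text above does not specify. The names `field_address` and `interpolate` are hypothetical.

```python
from bisect import bisect_right


def field_address(field_index):
    """Return (frame number, field number 1 or 2) for a running field index."""
    frame, field = divmod(field_index, 2)
    return frame, field + 1


def interpolate(keyframes, field_index):
    """Linearly interpolate a parameter at a field between keyframed fields.

    keyframes maps a field index to the parameter value set at that field.
    Outside the keyframed range, the nearest keyframe value is held.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    indices = sorted(keyframes)
    if field_index <= indices[0]:
        return keyframes[indices[0]]
    if field_index >= indices[-1]:
        return keyframes[indices[-1]]
    upper = bisect_right(indices, field_index)
    left, right = indices[upper - 1], indices[upper]
    t = (field_index - left) / (right - left)
    return keyframes[left] + t * (keyframes[right] - keyframes[left])
```

For example, with keyframes on field 0 (value 0.0) and field 4 (value 8.0), `interpolate` returns 2.0 at field 1, which `field_address` identifies as the second field of frame 0.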
A graphical user interface may be provided through which a user can select a specific clip among many clips, select or mark a particular frame within the clip, select or mark either field within a frame, display either field, edit either field and insert a special effect in either field. A keyframe scale bar is provided to allow the user to “zoom into” a position in time. The user can then make subtle adjustments to keyframe selection and positioning.
A rotoscoping mode allows a user to work on shape selection where shapes cannot be interpolated easily. When the mode is turned on, the shapes are visible and are active only for the keyframes on which they are created. This allows a user to work quickly on sequences where there is little or no coherency between fields.
In one embodiment of the present invention a non-linear video editing system for processing a video image includes means for displaying the video image; means for selecting a frame within the video image; means for displaying one of a first field and a second field of the selected frame; means for indicating which field is being displayed; means for marking at least one of the first field and the second field; and means for modifying at least one of the first field and the second field.
In another embodiment, the modifying means comprise means for deleting a portion of the field. Additionally, the system includes a graphical user interface to receive information from and present information to a user relative to at least one of the first field and the second field.
In yet another embodiment of the present invention, a computer-implemented method of adjusting for a positional displacement between a displayed first field of a first frame and a displayed second field of the first frame includes storing frame data of the first frame in a frame buffer, the frame data including scan lines of the first and second fields; accessing the frame data; determining a top-most scan line and its associated field from the accessed frame data; and adjusting the display of the field not associated with the top-most scan line.
In a further embodiment, the display adjusting step includes inserting and displaying a blank line at a beginning of the field not associated with the top-most line; scan doubling all lines other than a last line in the field not associated with the top-most line; and displaying the last line of the field not associated with the top-most line only once.
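The display adjustment described in the two preceding embodiments can be sketched as follows. This is a minimal illustration, not a definitive implementation. It assumes the frame buffer is a list of scan lines in display order, so that the field owning row 0 is the field associated with the top-most scan line; the names `split_frame`, `display_top_field`, `display_other_field` and the `blank` placeholder are hypothetical.

```python
def split_frame(frame_lines):
    """Return (top_field, other_field) from frame data in display order.

    Assumption: the field that owns row 0 holds the top-most scan line.
    """
    return frame_lines[0::2], frame_lines[1::2]


def display_top_field(field):
    """Ordinary scan doubling for the field holding the top-most line."""
    rows = []
    for line in field:
        rows.extend([line, line])
    return rows


def display_other_field(field, blank=None):
    """Adjusted display for the field not associated with the top-most line."""
    if not field:
        return []
    rows = [blank]               # blank line inserted at the beginning
    for line in field[:-1]:
        rows.extend([line, line])  # scan double all lines but the last
    rows.append(field[-1])       # last line is displayed only once
    return rows
```

For a field of n lines, the adjusted display is 1 + 2(n - 1) + 1 = 2n rows, the same height as the scan doubled top field. The inserted blank line shifts the other field down by one row, so that each of its lines first appears on the row it occupies in the interlaced frame.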