Editing systems are used in video tape productions to combine selected video scenes into a desired sequence. A video editor (hereafter "editor") communicates with and synchronizes one or more video tape recorders ("VTRs") and peripheral devices to allow editing accurate within a single video field or frame. A user communicates with the editor using a keyboard, and the editor communicates with the user via a monitor that displays information.
In film editing, the person editing can position the film segments to be spliced together in close proximity to decide which frames of the segments should be joined to make a smooth transition. This editing method has hitherto not been possible with video tape. When editing video tape, the operator must repeatedly move the video tape forward and backward in the tape machine, called "jogging", to find the precise edit points while observing the video images from the tape on a monitor. The required control of the video tape machine is difficult to achieve, and the editing is time-consuming and requires subjective judgment on the part of the operator, which judgment is gained only after much editing experience.
"Off-line" editing systems are relatively unsophisticated, and are most suitable for reviewing source tapes, and creating relatively straightforward editing effects such as "cuts" and "dissolves". Off-line editors generate an intermediate work tape whose frames are marked according to an accompanying edit decision list ("EDL") that documents what future video changes are desired. By contrast, "on-line" editing systems are sophisticated, and are used to make post-production changes, including those based upon the work tape and EDL from an off-line editor. On-line editing systems must provide video editor interface to a wide variety of interface accessories, and the cost charged for the use of such a facility (or "suite") often far exceeds what is charged for using an off-line system. The output from an on-line editing system is a final video master tape and an EDL documenting, at a minimum, the most recent generation of changes made to the master tape.
Originally editors interfaced only with multiple VTRs, and later with switchers as well. A switcher is a peripheral device having multiple input and output signal ports and one or more command ports. Video signals at the various input ports are fed to various output ports depending upon the commands presented to the command ports by the editor.
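The routing behavior of a switcher can be illustrated with a minimal sketch. The class name `Switcher`, its methods, and the port names are hypothetical and chosen only for illustration; no real switcher protocol is implied.

```python
class Switcher:
    """Hypothetical model of a switcher: commands select which input feeds each output."""

    def __init__(self, inputs, outputs):
        self.inputs = list(inputs)
        # Each output port starts with no input routed to it.
        self.routing = {name: None for name in outputs}

    def command(self, output, source):
        """Model a command arriving at the command port: route `source` to `output`."""
        if output not in self.routing:
            raise KeyError(f"unknown output port: {output}")
        if source not in self.inputs:
            raise ValueError(f"unknown input port: {source}")
        self.routing[output] = source

    def route(self, signals):
        """Return the signal present at each output for the given input signals."""
        return {out: signals.get(src) for out, src in self.routing.items()}


switcher = Switcher(inputs=["VTR_A", "VTR_B"], outputs=["PGM"])
switcher.command("PGM", "VTR_A")
print(switcher.route({"VTR_A": "scene 1", "VTR_B": "scene 2"}))
```

In this sketch the editor's only involvement is the call to `command`; the video signals themselves never pass through the editor.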
A "cut" is the simplest editing task and is accomplished with an editor and two VTRs: VTR A holds video scenes to be cut into the video tape on VTR B. The editor starts each VTR in the playback mode and, at precisely the correct frame, commands VTR B to enter the record mode, thereby recording the desired material from VTR A. It is not a trivial task for the editor to control and synchronize all VTRs and peripheral devices to within a single frame during an edit, since one second of video contains 30 individual frames or 60 fields.
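The frame and field arithmetic behind single-frame accuracy can be sketched as follows. The `HH:MM:SS:FF` timecode layout and simple non-drop-frame counting at exactly 30 frames per second are assumptions made for illustration; the function names are hypothetical.

```python
FRAMES_PER_SECOND = 30  # from the text: one second of video contains 30 frames
FIELDS_PER_FRAME = 2    # 60 fields per second divided by 30 frames per second


def timecode_to_frames(timecode: str) -> int:
    """Convert an assumed HH:MM:SS:FF timecode into an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not 0 <= frames < FRAMES_PER_SECOND:
        raise ValueError("frame number out of range")
    return ((hours * 60 + minutes) * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_timecode(total_frames: int) -> str:
    """Convert an absolute frame count back into HH:MM:SS:FF."""
    seconds, frames = divmod(total_frames, FRAMES_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def frames_to_fields(total_frames: int) -> int:
    """Each frame consists of two fields."""
    return total_frames * FIELDS_PER_FRAME


edit_point = timecode_to_frames("00:01:00:15")
print(edit_point, frames_to_fields(edit_point), frames_to_timecode(edit_point))
```

Under these assumptions, being late by a single frame corresponds to a timing error of one thirtieth of a second, which is the tolerance the editor must hold across every device in the edit.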
In a more complicated "dissolve" transition, the editor must precisely control three VTRs and a production switcher (a device capable of gradually mixing two video sources). VTRs A and B contain video scenes to be dissolved one to the other. The video outputs of VTRs A and B are connected to inputs on the production switcher, with the switcher output being connected to the record input of VTR C. The editor synchronizes all three VTRs and, at precisely the proper frame, activates the switcher, allowing VTR C to record the desired visual effect. Troublesome in perfecting the dissolve effect was the fact that the command port of the production switcher did not "look like" a VTR to the editor.
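The gradual mixing performed by the production switcher during a dissolve can be sketched as a per-frame mix ratio. A linear ramp is assumed here purely for illustration; the text does not state how a real switcher shapes the transition, and the function names are hypothetical.

```python
def dissolve_gain(frame: int, start_frame: int, duration_frames: int) -> float:
    """Fraction of source B in the output at `frame`, assuming a linear ramp."""
    if duration_frames <= 0:
        raise ValueError("duration must be at least one frame")
    progress = (frame - start_frame) / duration_frames
    # Clamp so the output is all A before the dissolve and all B after it.
    return min(1.0, max(0.0, progress))


def mix(level_a: float, level_b: float, gain: float) -> float:
    """Blend one sample value from source A with one from source B."""
    return (1.0 - gain) * level_a + gain * level_b


# A 30-frame dissolve that the editor triggers at frame 10.
for frame in (0, 10, 25, 40):
    print(frame, dissolve_gain(frame, start_frame=10, duration_frames=30))
```

The sketch also shows why frame-accurate triggering matters: if the switcher is activated late, every gain value in the transition is shifted by the same number of frames on the recording VTR.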
As newer devices such as special effects boxes appeared, editors were forced to adopt still another interface approach, namely a general purpose interface ("GPI"). Rather than transmit a command through a serial communications port, a GPI trigger pulse was transmitted from the editor to command a given function within a newer device. In essence, the GPI pulse performed a relay closure function for the remote device. For example, a special effects box might have three GPI input ports: a pulse (or relay closure) at the first port would "start" whatever the effect was, a pulse (or relay closure) provided to the second port would "stop" the effect, while a pulse (or relay closure) provided to the third port would "reverse" the effect.
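The three-port special effects box in the example above can be modeled with a short sketch. The class name, the port numbering, and the state variables are hypothetical; the point is only that a GPI pulse carries no data beyond the fact that a particular port was triggered.

```python
class EffectsBox:
    """Hypothetical effects box whose functions are selected by GPI port number."""

    # Assumed port assignment matching the example in the text.
    PORT_FUNCTIONS = {1: "start", 2: "stop", 3: "reverse"}

    def __init__(self):
        self.running = False
        self.direction = 1  # +1 is forward, -1 is reversed

    def pulse(self, port: int) -> str:
        """Model a trigger pulse (relay closure) arriving at a GPI input port."""
        function = self.PORT_FUNCTIONS[port]
        if function == "start":
            self.running = True
        elif function == "stop":
            self.running = False
        else:
            self.direction = -self.direction
        return function


box = EffectsBox()
for port in (1, 3, 2):
    print(port, box.pulse(port), box.running, box.direction)
```

Because the pulse itself says nothing about which effect is loaded in the box, the meaning of each port lives entirely in the remote device.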
Thus on-line editors grew more sophisticated as switchers evolved, and as more complicated transition-enabling accessory devices emerged. Soon editors were required to interface and control devices which allowed video wipes, flips, tumbles, the keying of selected portions of one image onto another image, and the generation of characters on screen. The desired effects were programmed on the various special effects devices and operated under control from the editor in response to GPI pulses. The presence of these new accessory devices allowing more complex visual effects required greater performance from the editor. At the same time, there was a recurring problem of how to make the new devices "look like" a VTR to the editor for interface purposes. In essence, the design and performance of editors have historically been constrained by the de facto requirement that any interfaced peripheral device "look like" a VTR to the editor. While existing on-line editors can simultaneously control up to approximately 16 devices through serial ports, only one of the devices may be a video switcher.
The manufacturer of a VTR, switcher or other peripheral device provides a protocol instruction manual telling the user what editor control signals command what device functions. However the protocol for one manufacturer's VTR or device might be totally unlike the protocol for the same function on another manufacturer's VTR or device. Further, published protocol commands usually do not make full use of the peripheral device's capabilities, and frequently the VTR or device hardware might be updated by the manufacturer, thus making the published protocol partially obsolete.
The video industry has attempted to ameliorate the interface problem by adopting a few common protocols as standards, often with one peripheral device "emulating" the protocol requirements of another device. This emulation process was somewhat analogous to what has occurred with computer printer manufacturers, where the published escape and command codes for new printers often cause the printer to emulate industry standardized printers. However just as new printers often offered more flexibility and features than the older printers they emulated, new video peripheral devices frequently were capable of greater and more flexible performance than what was defined by their interface protocol. For example, the published protocol for a new digital disk recorder, a random access device, would list commands essentially emulating the performance of a VTR, a linear access device. As a result of this emasculated protocol, users were deprived of the many new features unique to a random access device. While a thorough understanding of the inner workings of the VTR or peripheral device would allow a user greater flexibility in obtaining maximum performance, the fact is that most video editor users are artistically rather than technically inclined.
A user could of course write software to better interface to a new device, thus allowing an editor to make maximum use of the new device's capabilities. However creating customized interface software is extremely time consuming and requires considerable expertise. For example, a customized software interface for a VTR (an established device whose capabilities are well understood) could take upwards of three man months to write, assuming that the VTR hardware and protocol manual were first fully understood. Even if the expense of a custom interface were undertaken, using a VTR from a different manufacturer would require rewriting the software. Thus, in practice, when a new peripheral device came to market, its manufacturer typically chose to adopt a standardized emulation rather than to bear the burden of writing a customized interface. As a result, the full capability of many new peripheral devices goes unrealized because the devices cannot fully and adequately communicate with the editor without a customized interface.
This lack of a universal approach for interfacing has continued to plague the industry. The problem is further compounded because users like to achieve video effects using tried and true techniques and combinations of equipment. However if a certain piece of equipment is temporarily unavailable to a user (the equipment may have broken, for example), the user may be unaware that all is not necessarily lost. The desired effect may still be achieved, perhaps by using what equipment is available and making multiple tape passes. Existing on-line editing systems are simply incapable of being told by the user what the desired effect is, and then making editing decisions for the user, based upon a knowledge of what equipment is at hand and a knowledge of the internal workings of that equipment.
As noted, both on-line and off-line editing systems generate an edit decision list or EDL. In existing on-line systems, the EDL is a complex collection of timecode numbers and cryptic designations for keying and dissolving operations. The timecode numbers give the precise time and frame number where events occur on the finished tape, as well as "in" and "out" times at which a given video source was put onto the finished tape. The operation designations simply state that at a given time frame the editor issued a given command, "RECORD" for example; what visual effect resulted from the command generally cannot be ascertained.
At best a conventional EDL is a one dimensional historical time record of the most recently issued commands that resulted in changes made to the finished video tape. Although the output of an on-line editing system is video, it is surprising but true that existing EDLs contain no video image information. As a result, it is difficult for a user to examine an EDL and be able to predict what the visual image on screen will be at any given frame or time. In fact, where various video sources were superimposed or "layered" upon one another at different times, present EDLs make it almost impossible to predict the final image.
Also detrimental is the fact that the net effect of information contained in any portion of an EDL depends upon earlier occurring events. After a user completes intermediate edits and settles upon a finalized edit, prior art editor systems generate a "clean" EDL that removes all timecode overlaps and gaps, and produces a seamless EDL with a continuous timecode. As a result, information pertaining to the intermediate effects, including information pertaining to overlapped edit portions, is irrevocably lost in existing EDLs.
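The information loss caused by "cleaning" can be illustrated with a minimal sketch. The `Event` record, its field names, and the rule that a later edit overwrites an earlier one are assumptions for illustration, not the format of any actual EDL; the sketch resolves overlaps only and does not close gaps.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """Hypothetical EDL event: a source occupying record frames [start, end)."""
    start: int
    end: int
    source: str


def clean_edl(events: List[Event]) -> List[Event]:
    """Flatten events, applied in order, into non-overlapping segments."""
    timeline: List[Event] = []
    for new in events:
        trimmed: List[Event] = []
        for seg in timeline:
            # Keep whatever part of the older segment lies before the new event.
            if seg.start < new.start:
                trimmed.append(Event(seg.start, min(seg.end, new.start), seg.source))
            # Keep whatever part of the older segment lies after the new event.
            if seg.end > new.end:
                trimmed.append(Event(max(seg.start, new.end), seg.end, seg.source))
        trimmed.append(new)
        timeline = trimmed
    return sorted(timeline, key=lambda event: event.start)


history = [Event(0, 100, "A"), Event(50, 150, "B"), Event(60, 80, "C")]
for segment in clean_edl(history):
    print(segment)
```

In the cleaned result, the fact that source A once occupied frames 50 through 99 no longer appears anywhere, so the earlier state of the tape cannot be reconstructed from the cleaned list alone.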
The above limitations in existing systems prevent a user from going back and substantially re-editing the final tape to recover scenes intermediate to the final tape. For example, because conventional EDLs are flat, and only support a single video layer, they cannot adequately document the history of a layered image, and cannot "un-layer" or "re-layer" images, to create a different effect.
Also limiting is the fact that prior art editors are capable of storing only a few thousand lines or so of editing decisions. The EDL is further constrained because detailed information from the editor as to what various peripheral devices were doing at a given point is essentially non-existent. As noted, commonly the only information the editor conveys to the EDL is that a trigger pulse was sent at a certain time to a GPI to command an accessory device. Exactly what function the trigger pulse commanded is neither readily discernible nor easily reconstructed. Thus, lost and gone forever is an historical record of all the intermediate changes made with the on-line editor in arriving at the video images now on the tape. These hardware and software limitations within the editor prevent a user from readily going back and unlayering video, or deleting effects and recovering images formed intermediate to the final image.
Existing editing systems are also deficient in at least two other aspects: they do not allow for the simultaneous control of two or more edits or edit layers, and they do not allow multiple users on remote editing consoles to simultaneously edit on a single editing machine. While the video monitor used with existing systems can display a single set of parameters advising of the status of the editor, applicants are aware of but one existing system capable of a windowed display showing the current edit and a zoom image of a portion of that edit. However at present no system provides the capability to display two windows simultaneously, each displaying a separate edit or, if desired, one window being under control of a first on-line editor while the second window is under control of a second on-line editor.
Finally, editing systems require a user to keep track of numerous video tape sources, typically over certain frame ranges. Existing digital counters and light emitting diode (LED) bar graphs provide information only as to the tape's direction and speed. No information relating to the absolute position of a segment of video within the full tape is provided. Present editing systems do not provide a simple mechanical device capable of offering the accuracy of digital measurement and the ease of use of an analog device, while presenting tape source information in both a relative and an absolute fashion.
In summary, known on-line editors lack a true generic approach to the twin problems of readily achieving an interface with VTRs or peripheral devices, while simultaneously obtaining maximum performance and flexibility from the interfaced equipment. Further, known on-line editors lack the ability to control more than one video switcher, or simultaneously control through serial ports more than about 16 peripheral devices.
Further, existing on-line editors are unable to store all intermediate images and complete accessory device interface information. Such editors are unable to generate an EDL of unlimited length that is capable of providing a full and detailed historical record of all events resulting in the finished tape. As a result, known editors do not allow a user to instantly generate the final image corresponding to any point in the EDL, or to even predict what the image at any given time will be. Finally, existing editors lack the ability to control multiple simultaneous edits, the ability to permit multiple users to remotely make simultaneous edits on a single editor, and also lack motorized slide-type controls to provide absolute and relative information in a format that can be readily understood.