The invention relates to multimedia editing systems and more particularly to computer based systems for composing and editing video and audio sequences comprising a plurality of components.
Computer based systems for composing and editing combined audio and video sequences are known. Recent systems accept input signals representing components, which are often called clips, and which are combined into an audio and video output sequence. The input signals are usually of several different types including audio, still images, video images, text and computer generated images. Computer generated images include still images and video sequences.
Video images can be captured from a video camera or may themselves be composed by a system such as the system of the present invention. Still images may also be captured by a camera or may be digitised by a scanner from a photograph or print. All such signals are usually stored on an optical or magnetic medium, referred to as the source medium, in analogue or digital form. Analogue signals will be digitised before they are used as inputs to a composition and editing system. The input types described here are sometimes referred to as multimedia content and the possible input types are not restricted to those described above.
The methods of digitising a still image which is either a real scene or a reproduction of a real scene, by raster scanning, are well known, particularly in relation to digital photography and printing. The methods of reproducing moving scenes by first electronically capturing a series of essentially still images, known as frames, are also well known, particularly in relation to television.
The computer used in this system may be a special purpose design or be of the type commonly known as a desk top machine such as a PC. It will include a central processing unit, memory for storage, a display system and user input controls. The types of user input controls which may be used include both common computer input devices, for example, a keyboard, mouse, tracker ball or joystick and editing devices specific to the video editing industry such as a jog shuttle wheel or audio mixing desk.
An operator who is working at the computer to assemble a presentation from portions of the available input clips requires simultaneous information about the input clips as well as information about the presentation defined so far in his work. In addition to this information the operator requires access to tools which define how the clips are to be included in the presentation. It is usual to display the various information and tools to the operator using an operator interface based on a windowing system. Such windowing systems are well known in personal computing systems.
Information about the individual input clips may be displayed in a window and may include a small image, sometimes called a thumbnail, which represents the first frame of a video sequence. Additional information regarding the type of content and duration is required. A means within the display is also provided for viewing or listening to the content of an individual clip and then selecting a portion to be used in the final presentation. Normally the portion initially selected is only approximately defined and a portion longer than needed for the final presentation will be selected. This process is called an assemble edit or rough cut.
The individual clips which have been rough cut may be modified and combined with other clips in order to achieve different visual and audio effects. The simplest of these is known as a cut and results in the frames of the final presentation consisting of the frames from a first clip immediately followed by the frames of a second clip. More complex transitions are common in which the last frames of the first clip are overlapped with the first frames of the second clip such that the first sequence gradually changes to the second sequence over a period of time. The effects in such transitions are usually selected from a library of possible transition types. In more complex presentations there may be two or more video and audio sequences combined for a period of time to achieve an artistic effect.
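The difference between a cut and an overlapping transition can be illustrated with a minimal sketch. It assumes, purely for illustration, that each frame is reduced to a single brightness value and that the overlapping transition is a linear dissolve; the function names `cut` and `dissolve` are hypothetical and are not taken from any particular editing system.

```python
from typing import List, Sequence


def cut(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """A cut: the frames of the first clip immediately followed by the second."""
    return list(first) + list(second)


def dissolve(first: Sequence[float], second: Sequence[float],
             overlap: int) -> List[float]:
    """Overlap the last `overlap` frames of `first` with the first `overlap`
    frames of `second`, mixing them with linearly changing weights."""
    if not 0 < overlap <= min(len(first), len(second)):
        raise ValueError("overlap must be between 1 and the shorter clip length")
    head = list(first[:-overlap])      # frames of the first clip shown alone
    tail = list(second[overlap:])      # frames of the second clip shown alone
    mixed = []
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)          # share of the second clip
        a = first[len(first) - overlap + i]
        b = second[i]
        mixed.append((1 - weight) * a + weight * b)
    return head + mixed + tail
```

With a cut the presentation length is the sum of the two clip lengths, whereas with an overlapping transition it is shorter by the number of overlapped frames.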
Other special effects for modifying or combining clips or adding titles or adding still images or adding animated images, etc. are well known.
It is common practice to provide, on a display, a type of window known as a time line to represent the sequence of the various clips and transitions which combine to form the final presentation. The time line displays used in current systems typically consist of a series of parallel rectangular elements, known as time bars, shown on the computer display. Horizontal distance across the time bars is usually used to represent time, with the time bars stacked one above another. Time is normally represented linearly with distance.
During the editing process, the operator may define a number of tracks which may be video, audio or other media types. Each track will be defined on a time bar of the time line. These tracks are a well known concept in both video and audio editing and are used to control the composition of one or more source clips which may be used simultaneously to form part of a composed sequence. They are necessary when two or more source elements are present at one point in time in the output sequence, for example when audio and video are concurrent or during a special transition effect from one source clip to another.
Rough cut clips and transitions are added to the time line as the presentation is built up in order to control the final presentation and provide the operator with a diagrammatic representation of the content of the presentation. Each rough cut clip is adjusted by the operator to have start and ending points which are precisely defined with respect to the source media containing the clip and also precisely defined with respect to the final presentation. This process of precisely defining the clip is known as trimming. Several methods of trimming to achieve a desired programme effect are known.
Time can be measured in units of hours, minutes, seconds and frames. The addition of frame units is provided to allow the operator to achieve precise control of which frames are used in the presentation and to control their time relationship with the frames of other clips, transitions and special effects. Whereas hours, minutes and seconds are common and precisely defined time units, frame units vary depending on the particular standard used in the target presentation. Typically a frame unit may be one thirtieth of a second or one twenty-fifth of a second. Time measured in hours, minutes, seconds and frames from the start of the source medium containing a clip, or a composed programme, is known as a time code.
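The relationship between a frame count and a time code can be sketched as follows. This is a minimal illustration which assumes an integer number of frames per second (for example 25 or 30, as mentioned above); standards with non-integer frame rates are not handled. The function names and the colon-separated text format are assumptions made for the example.

```python
def frames_to_timecode(frame_count: int, fps: int = 25) -> str:
    """Convert a frame count from the start of the medium to HH:MM:SS:FF."""
    if frame_count < 0 or fps <= 0:
        raise ValueError("frame_count must be >= 0 and fps must be > 0")
    seconds, frames = divmod(frame_count, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def timecode_to_frames(timecode: str, fps: int = 25) -> int:
    """Convert an HH:MM:SS:FF time code back to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames
```

The same frame count yields a different time code under each standard, which is why the frame unit must be known before a time code can be interpreted.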
As mentioned earlier, there is usually a means for displaying the source clips. It is also necessary to provide a means for viewing the presentation as it is built up whilst the operator works. These two requirements are often satisfied by providing two video display areas which are side by side in a single window. It is necessary to provide user controls to play the selected video clip or final composed presentation. Such controls may be provided to start or stop the video. Additionally controls are usually provided, in the form of sliders adjacent to the video display areas, which may be used to move forwards or backwards through video at a rate determined by the operator.
A similar facility is also provided for moving through the composed programme in the presentation display window by sliding a pointer on the time line. The presentation display slider and the viewing window slider can be synchronised to each other. Additionally a display of the slider position, and hence the displayed frame position, may be given by a time code display. When using the normal video or audio play controls, the video frames will be played at a rate appropriate for the standard chosen for the presentation or clip. Moving the time slider to play backwards and forwards through a sequence is known as scrubbing.
It will be appreciated that a computer display is limited in size and resolution and consequently the amount of information which can be displayed along a time line is limited. A typical display may have a width of 300 mm which consists of 1024 pixels. During some editing operations an operator may wish to work on the time line with a display of the overall presentation, which may last more than one hour. In another operation the operator may wish to work on a sequence of individual frames which form a selected transition or special effect. It has not previously been possible to provide a single time line which allows frame by frame viewing of the sequence at the same time as viewing the whole presentation. Current systems require that the scale of the time line is changed to suit the operation currently being performed.
The smallest scale must allow individual frames to be identified and the largest scale should provide a view, for example, of up to three hours across the whole display screen. In order to provide the operator with this range of time line views it may be necessary to provide up to 15 different scales for the time line ranging from a few frames per time line division up to several minutes per division.
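The range of scales involved can be seen from a short calculation. The sketch below assumes a linear time line spanning the full 1024 pixel display width mentioned above and a frame unit of one twenty-fifth of a second; the function names are illustrative only.

```python
def frames_per_pixel(visible_seconds: float, fps: float = 25.0,
                     width_pixels: int = 1024) -> float:
    """Number of frames represented by one pixel on a linear time line."""
    if visible_seconds <= 0 or fps <= 0 or width_pixels <= 0:
        raise ValueError("all arguments must be positive")
    return visible_seconds * fps / width_pixels


def visible_seconds_for(frames_per_px: float, fps: float = 25.0,
                        width_pixels: int = 1024) -> float:
    """Duration visible across the display at a given frames-per-pixel scale."""
    if frames_per_px <= 0 or fps <= 0 or width_pixels <= 0:
        raise ValueError("all arguments must be positive")
    return frames_per_px * width_pixels / fps


if __name__ == "__main__":
    for label, seconds in (("three hours", 3 * 3600), ("one hour", 3600)):
        print(f"{label}: {frames_per_pixel(seconds):.1f} frames per pixel")
    print(f"one frame per pixel shows {visible_seconds_for(1.0):.2f} s")
```

Under these assumptions a three hour view compresses more than 260 frames into each pixel, whereas a view with one frame per pixel shows only about 41 seconds of the presentation, so no single linear scale serves both purposes.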
It is difficult for the operator to choose which scale to use without using a trial and error process. When a scale is first selected the portion of the presentation of interest will often not be in view and it is necessary to scroll along the window in order to find the required portion. Having found the required portion it may then become apparent that the scale selected is not optimum for the desired operation. In this case yet another time scale selection must be made and the process repeated.
This trial and error process is a disadvantage of displaying and selecting time line information by means of discrete changes of scale.
It will be appreciated that the time line window contains a number of different types of information about a section of the composed programme. The amount of detail which can be shown for each clip or transition depends upon the timescale chosen and the size of display available. The operator will continually be changing the timescale and position of the time line during editing and composing the programme. This results in repeated step changes in the scale of information provided. These characteristics of the time line type of operator interface cause operator confusion. The operator is presented with too much information in windows whose composition is continuously changing as a result of scale changes or scrubbing operations.
It is an object of the present invention to provide a time line system with improved usability, which does not require repeated time consuming changes of scale of the time line and which provides improved clarity of information presented to an operator.
According to the present invention, there is provided a system for editing together media segments using data processing means, in which there is provided a visual representation of a time line along which are arranged a plurality of elements representative of the media segments. There is a primary region in which the displayed linear extent per unit time is a maximum, and secondary regions to either side of the primary region in which the displayed linear extent per unit time is less than in the primary region. Means are provided whereby an element in a secondary region may be moved into the primary region.
In a preferred system there is a gradual decrease in the linear extent per unit time, i.e. temporal resolution on the display, from the primary region outwards. In one preferred arrangement the visual representation is provided with a perspective effect, seeming to curve away from a viewer. The height of the visual representation may decrease from the primary region outwards as a result of the perspective effect.
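One way to obtain a gradual decrease in linear extent per unit time is to map time to display position through a smooth saturating function. The sketch below is an assumption made for illustration: it uses a hyperbolic tangent centred on a focus time, and the names `time_to_x` and `extent_per_second` and the parameters `half_width_px` and `tau` are hypothetical. The description above does not prescribe any particular mapping.

```python
import math


def time_to_x(t: float, focus_time: float,
              half_width_px: float = 512.0, tau: float = 10.0) -> float:
    """Map a time t (seconds) to a horizontal offset in pixels from the
    centre of the display. The primary region lies around focus_time."""
    return half_width_px * math.tanh((t - focus_time) / tau)


def extent_per_second(t: float, focus_time: float,
                      half_width_px: float = 512.0, tau: float = 10.0) -> float:
    """Displayed pixels per second at time t (the derivative of time_to_x).
    It is greatest at focus_time and falls away smoothly on either side."""
    u = math.tanh((t - focus_time) / tau)
    return (half_width_px / tau) * (1.0 - u * u)
```

With this assumed mapping the whole presentation always fits within the display width, since the offset never exceeds `half_width_px` in either direction, while the region around the focus time is shown at the greatest temporal resolution.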
In the operation of such a system, a user can select an element which is outside the primary region and move it towards the primary region. As this happens, the element will lengthen, thus increasing the temporal resolution as it is moved towards an area where editing will take place. Preferably the arrangement is such that an element can be scrolled relatively smoothly along the time line, lengthening and shortening as it moves towards or away from the primary region.
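The lengthening of an element as it is scrolled towards the primary region can be sketched in the same spirit. The example assumes, for illustration only, a hyperbolic tangent mapping from time to display position and models scrolling as a change of the time shown at the centre of the display; `displayed_length` and its parameters are hypothetical names.

```python
import math


def displayed_length(start: float, end: float, focus_time: float,
                     half_width_px: float = 512.0, tau: float = 10.0) -> float:
    """Displayed length in pixels of an element spanning start..end seconds
    when focus_time is shown at the centre of the primary region."""
    def x(t: float) -> float:
        # Assumed smooth mapping from time to horizontal display offset.
        return half_width_px * math.tanh((t - focus_time) / tau)
    return x(end) - x(start)


if __name__ == "__main__":
    # A five second element starting 100 s into the presentation is brought
    # towards the centre by scrolling the focus time from 60 s to 102.5 s.
    for focus in (60.0, 80.0, 100.0, 102.5):
        length = displayed_length(100.0, 105.0, focus)
        print(f"focus {focus:6.1f} s -> element is {length:6.1f} pixels long")
```

Because the assumed mapping is continuous, the element changes length smoothly as the focus time changes, rather than in the step changes produced by switching between fixed scales.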
Thus, it is possible to have access to a number of elements on the time line and, at the same time, to carry out operations on selected elements which are displayed at a suitably high temporal resolution. Where there is a gradual change in temporal resolution, which may be continuous or stepwise, it may not be necessary to move an element into the primary region of maximum resolution for editing to take place, since there may be an adjacent region of suitably high temporal resolution.
In one preferred arrangement there is provided a supplementary display of information relating to a media segment which is represented by an element, and this may be activated when the element is selected. Selection of the element for the purpose of the supplementary display can preferably take place at any point along the time line and may be effected merely by placing a pointer over the element.
Thus a sequence on a time line can be displayed in a representation which both allows the whole sequence to be represented and which also allows fine control of editing without the need to switch time scaling of the time line during editing and composition operations. The preferred supplementary display, in conjunction with the time line, provides an improved operator interface.
Other objects and advantages of the present invention will become apparent from the following description of preferred embodiments taken in conjunction with the accompanying drawings.