Several systems are presently available for capture, editing and playback of motion video and associated audio. A particular category of such systems includes digital nonlinear video editors. Such systems store motion video data as digital data, representing a sequence of digital still images, in computer data files on a random access computer readable medium. A still image may represent a single frame, i.e., two fields, or a single field of motion video data. Such systems generally allow any particular image in the sequence of still images to be randomly accessed for editing and for playback. Digital nonlinear video editors have several benefits over previous video tape-based systems which provide only linear access to video information.
Since digital data representing motion video may consume large amounts of computer memory, particularly for full motion broadcast quality video (e.g., sixty fields per second for NTSC and fifty fields per second for PAL), the digital data typically is compressed to reduce storage requirements. There are several kinds of compression for motion video information. One kind of compression is called "intraframe" compression, which involves compressing the data representing each still image independently of other still images. Commonly used intraframe compression techniques employ a transformation to the frequency domain from the spatial domain, for example, by using discrete cosine transforms. The resulting values typically are quantized and encoded. Commonly used motion video compression schemes using intraframe compression include "motion-JPEG" and "I-frame only" MPEG. While intraframe compression reduces redundancy of data within a particular image, it does not reduce the significant redundancy of data between adjacent images in a motion video sequence. For intraframe compressed image sequences, however, each image in the sequence can be accessed individually and decompressed without reference to the other images. Accordingly, intraframe compression allows purely nonlinear access to any image in the sequence.
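The transform-and-quantize step described above can be illustrated with a minimal sketch. It assumes an 8x8 block, an orthonormal DCT-II, and a single uniform quantization step; the function names and the step size are illustrative assumptions, not taken from any particular codec, and real schemes such as motion-JPEG use per-coefficient quantization tables followed by entropy coding.

```python
import math

N = 8  # assumed block size (illustrative)

def dct_1d(v):
    """Orthonormal 1-D DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def quantize(coeffs, step):
    """Uniform quantization (hypothetical single step size)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

# A flat gray block has no spatial detail, so only the DC term survives.
flat = [[128] * N for _ in range(N)]
q = quantize(dct_2d(flat), step=16)
```

For the flat block, every quantized coefficient except the top-left (DC) value is zero, which is the within-image redundancy that intraframe compression removes.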
More compression can be obtained for motion video sequences by using what is commonly called "interframe" compression. Interframe compression involves predicting one image using another. This kind of compression often is used in combination with intraframe compression. For example, a first image may be compressed using intraframe compression, and typically is called a key frame. The subsequent images may be compressed by generating predictive information that, when combined with other image data, results in the desired image. Intraframe compressed images may occur every so often throughout the sequence. Several standards use interframe compression techniques, such as MPEG-1 (ISO/IEC 11172-1 through 5), MPEG-2 (ISO/IEC 13818-1 through 9) and H.261, an International Telecommunication Union (ITU) standard. MPEG-2, for example, compresses some images using intraframe compression (called I-frames or key frames), and other images using interframe compression techniques, for example, by computing predictive errors between images. The predictive errors may be computed for forward prediction (called P-frames) or bidirectional prediction (called B-frames). MPEG-2 is designed to provide broadcast quality full motion video.
For interframe compressed image sequences, the interframe compressed images in the sequence can be accessed and decompressed only with reference to other images in the sequence. Accordingly, interframe compression does not allow purely nonlinear access to every image in the sequence, because an image may depend on either previous or following images in the sequence. Generally speaking, only the intraframe images in the sequence may be accessed nonlinearly. However, in some compression formats, such as MPEG-2, some state information needed for decoding or displaying an intraframe compressed image, such as a quantization table, also may occur elsewhere in the compressed bitstream, eliminating the ability to access even intraframe compressed images nonlinearly.
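The dependency described above can be sketched as follows. This is a simplified model, not the behavior of any specific decoder: it assumes frame types are listed in display order, that each P-frame depends on the nearest preceding I- or P-frame, and that each B-frame depends on the nearest reference frame on either side. The function name `frames_needed` is hypothetical, and the out-of-band state information mentioned above (such as quantization tables) is ignored.

```python
def frames_needed(types, target):
    """Return the display-order indices that must be decoded to show `target`.

    `types` is a string such as "IBBPBBP" (simplified, assumed model).
    """
    def ref_chain(idx):
        # Walk back from a reference frame (I or P) to its key frame.
        chain = []
        i = idx
        while True:
            if types[i] in "IP":
                chain.append(i)
                if types[i] == "I":
                    return chain
            i -= 1
            if i < 0:
                raise ValueError("no key frame precedes index %d" % idx)

    if types[target] in "IP":
        return sorted(ref_chain(target))
    # B-frame: needs the nearest reference frame on each side.
    prev_ref = next(i for i in range(target - 1, -1, -1) if types[i] in "IP")
    next_ref = next(i for i in range(target + 1, len(types)) if types[i] in "IP")
    needed = set(ref_chain(prev_ref)) | set(ref_chain(next_ref)) | {target}
    return sorted(needed)

gop = "IBBPBBPBBI"
only_key = frames_needed(gop, 0)   # an I-frame stands alone
p_frame = frames_needed(gop, 6)    # a P-frame needs its whole reference chain
b_frame = frames_needed(gop, 4)    # a B-frame needs references on both sides
```

In this model, only index 0 and index 9 can be shown by decoding a single image; every other image drags in earlier (and, for B-frames, later) images.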
One approach to handling the playback of serially dependent segments in an arbitrary sequence is described in U.S. Pat. No. 4,729,044 (Keisel). In this system, the dependency between images in a segment is due to the linear nature of the storage media, i.e., video tape. Several tapes containing the same material are used. For any given segment to be played back, an algorithm is used to select one of the tapes from which the material should be accessed. At the same time, a tape for a subsequent segment is identified and cued to the start of the next segment. As a result, several identical sources are processed in parallel in order to produce the final program.
In nonlinear systems, the need for multiple copies of video sources to produce arbitrary sequences of segments has been avoided by the random-access nature of the media. Arbitrary sequences of segments from multiple data files are provided by pipelining and buffering nonlinear accesses to the motion video data. That is, while some data is being decompressed and played back, other data is being retrieved from a data file, such as shown in U.S. Pat. No. 5,045,940 (Peters et al.).
In such systems, video segments still may need to be processed in parallel in order to produce certain special effects, such as dissolves and fades between two segments. One system that performs such effects is described in PCT Publication No. WO 94/24815 (Kurtze et al.). In this system, two video streams are blended by a function αA + (1 - α)B, wherein A and B are corresponding pixels in corresponding images of the two video streams. A common use of this system is to play segment A, and to cause a transition to segment B over several images. The data required for segment B is loaded into a buffer and decompressed while A is being played back so that decoded pixels for segment B are available at the time the transition is to occur. Similar systems also are shown in U.S. Pat. Nos. 5,495,291 (Adams) and 5,559,562 (Ferster). When using interframe compression, if a second segment starts with an interframe image, the processing of the second segment may have to begin earlier during processing of a previous first segment to allow the desired image of the second segment to be available. Ideally, the second segment should be processed from a previous intraframe compressed image. However, these preceding images are not used in the output.
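The blend function described above can be sketched per pixel as follows. This is an illustration only: images are modeled as flat lists of 8-bit sample values, and the linear ramp of the blend weight from 1 to 0 across the transition is an assumption, since the text does not specify how the weight varies. The names `blend` and `dissolve` are hypothetical.

```python
def blend(a, b, alpha):
    """Blend corresponding pixels: alpha*A + (1 - alpha)*B."""
    return [round(alpha * pa + (1 - alpha) * pb) for pa, pb in zip(a, b)]

def dissolve(a_frames, b_frames):
    """Transition from segment A to segment B over the given images.

    Assumes a linear ramp: the first image is all A, the last is all B.
    """
    n = len(a_frames)
    out = []
    for i, (fa, fb) in enumerate(zip(a_frames, b_frames)):
        alpha = 1.0 - i / (n - 1) if n > 1 else 0.0
        out.append(blend(fa, fb, alpha))
    return out

# Three-image transition between two tiny two-pixel "images".
a_frames = [[200, 100]] * 3
b_frames = [[0, 50]] * 3
result = dissolve(a_frames, b_frames)
```

Both inputs to `blend` must already be decoded pixels, which is why the data for segment B has to be buffered and decompressed before the transition begins.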
A problem arises when a third segment of interframe and intraframe compressed video is to be played. In particular, the second segment must be long enough to allow the first image of the third segment to be completely processed from a previous intraframe compressed image. If only two channels of decoders are available, this processing for the third segment would be performed using the same decoder used to process the first segment, after the first segment is processed. In some cases, the first decoder also may output several images after the last desired image is output. The minimum size of any second segment is referred to as the cut density. While the cut density in principle can be reduced to a single field by using only intraframe compression, interframe compression provides better compression. Accordingly, it is desirable to minimize the cut density using interframe compression.
Another problem in designing a system that is compatible with some standards, such as MPEG-2, is that there are many options that may or may not be present in a coded bitstream. For example, an MPEG-2 formatted bitstream may include only I-frames, or I and P frames, or I, B and P frames. The order in which these frames are displayed also may be different from the order in which they are stored. Each compressed image also may result in the output of anywhere from zero to six fields. State information needed to decode any particular image, including an I-frame, may also occur at any point in the bitstream. As a result, the ability to randomly access a particular field in an arbitrary MPEG-2 compliant bitstream may be determined by the actual format of the bitstream.
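The difference between stored order and display order can be illustrated with a minimal sketch. It assumes the conventional arrangement in which each B-frame is stored after the two reference frames it depends on, so a decoder holds back each reference frame until the next reference frame arrives. The frame labels (a type letter followed by a display index) and the function name `coded_to_display` are hypothetical, and the sketch ignores field counts and other bitstream options mentioned above.

```python
def coded_to_display(coded):
    """Reorder frames from stored (coded) order into display order.

    B-frames are output as soon as they are decoded; each I- or P-frame
    is held until the next reference frame arrives (assumed model).
    """
    held = None  # most recent reference frame not yet displayed
    out = []
    for frame in coded:
        if frame[0] == "B":
            out.append(frame)
        else:
            if held is not None:
                out.append(held)
            held = frame
    if held is not None:
        out.append(held)  # flush the final reference frame
    return out

coded = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
display = coded_to_display(coded)
```

Because the reordering depends on which frame types are present, a stream containing only I-frames or only I and P frames passes through this sketch unchanged.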
Accordingly, a general aim of the present invention is to provide a system which allows nonlinear editing of interframe and intraframe compressed motion video with a minimum cut density. Another general aim in one embodiment of the invention is to allow mixed editing of interframe and intraframe compressed data streams with different compression formats.