Several systems are presently available for capture, editing and playback of motion video and associated audio. A particular category of such systems includes digital nonlinear video editors. Such systems store motion video data as digital data, representing a sequence of digital still images, in computer data files on a random access computer readable medium. A still image may represent a single image, a single frame, i.e., two fields, or a single field of motion video data. An image may be progressive or interlaced. Such systems generally allow any particular image in the sequence of still images to be randomly accessed for editing and for playback. Digital nonlinear video editing systems have several benefits over previous video tape-based systems which provide only linear access to video information.
Because digital data representing motion video may consume large amounts of computer memory, particularly for full motion broadcast quality video (e.g., sixty fields per second for NTSC and fifty fields per second for PAL), the digital data typically is compressed to reduce storage requirements. There are several kinds of compression for motion video information. One kind of compression is called "intraframe" compression, which involves compressing the data representing each still image independently of other still images. Commonly-used intraframe compression techniques employ a transformation from the spatial domain to the frequency domain, for example, by using discrete cosine transforms. The resulting values typically are quantized and encoded. Commonly-used motion video compression schemes using intraframe compression include "motion-JPEG" and "I-frame only" MPEG. Although intraframe compression reduces redundancy of data within a particular image, it does not reduce the significant redundancy of data between adjacent images in a motion video sequence. For intraframe compressed image sequences, however, each image in the sequence can be accessed individually and decompressed without reference to the other images. Intraframe compression therefore allows purely nonlinear access to any image in the sequence.
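The intraframe scheme described above can be sketched as follows. This is a minimal illustration, not any particular codec: the naive DCT, the uniform quantization step, and the sample block values are all invented for the example. It shows why transform-plus-quantization compresses a smooth image block well: most quantized coefficients become zero and are cheap to encode.

```python
import numpy as np

def dct2(block):
    """Naive 2-D DCT-II of an NxN block (O(N^4); for illustration only --
    real codecs use fast transforms)."""
    N = block.shape[0]
    result = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x, y]
                          * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                          * np.cos((2 * y + 1) * v * np.pi / (2 * N)))
            cu = np.sqrt(1 / N) if u == 0 else np.sqrt(2 / N)
            cv = np.sqrt(1 / N) if v == 0 else np.sqrt(2 / N)
            result[u, v] = cu * cv * s
    return result

def quantize(coeffs, step=16):
    """Uniform quantization: small high-frequency coefficients round to zero."""
    return np.round(coeffs / step).astype(int)

# A smooth 8x8 block (a gentle brightness ramp, typical of natural images):
block = np.fromfunction(lambda x, y: 128 + 4 * x + 2 * y, (8, 8))
q = quantize(dct2(block))
# After quantization the block is mostly zeros, ready for entropy coding.
```

Only a handful of low-frequency coefficients survive quantization here; the run of zeros is what the subsequent encoding stage exploits.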
Motion video sequences can be compressed more by using what is commonly called "interframe" compression, which reduces information redundancy between images. Interframe compression, for example, may involve storing information for predicting one image using another. This kind of compression often is used in combination with intraframe compression. For example, a first image may be compressed using intraframe compression, and typically is called a key frame. The subsequent images may be compressed by generating predictive information that, when combined with other image data, results in the desired image. Intraframe compressed images may occur periodically throughout a sequence of compressed images. Several standards use interframe compression techniques, such as MPEG-1 (ISO/IEC 11172-1 through 5), MPEG-2 (ISO/IEC 13818-1 through 9) and H.261, an International Telecommunications Union (ITU) standard. MPEG-2, for example, compresses some images using intraframe compression (called I-frames or key frames), and other images using interframe compression techniques, for example by computing predictive errors between images. The predictive errors may be computed for forward prediction (called P-frames) or bidirectional prediction (called B-frames). Other examples of interframe compression formats include Compact Video (also known as Cinepak), CD-I and DVI.
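The prediction idea can be sketched with a toy pair of frames. The frame contents are invented, and the prediction shown is the simplest possible (copy the previous frame and store the difference); real codecs such as MPEG-2 add motion compensation before taking the residual. The point is that the residual has far fewer nonzero values than the frame itself, so it compresses much better.

```python
import numpy as np

# Hypothetical consecutive frames: the second is the first with a small
# object shifted one pixel, as in typical motion video.
key_frame = np.zeros((8, 8), dtype=np.int16)
key_frame[2:5, 2:5] = 200                    # a bright 3x3 object
next_frame = np.roll(key_frame, 1, axis=1)   # object moves one pixel right

# Simplest interframe prediction: predict next_frame as key_frame and
# store only the residual.
residual = next_frame - key_frame

# Decoding reverses the prediction: key_frame + residual == next_frame,
# so only the sparse residual needs to be stored for the second frame.
```

Because most of the scene did not change, most residual values are zero; the decoder reconstructs the second frame exactly by adding the residual back to the key frame.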
For interframe compressed image sequences, the interframe compressed images in the sequence can be accessed and decompressed only with reference to other images in the sequence. Accordingly, interframe compression does not allow purely nonlinear access to every image in the sequence, because an image may depend on either previous or following images in the sequence. Generally speaking, only the intraframe images in the sequence may be accessed nonlinearly. However, in some compression formats, such as MPEG-2, some state information used for decoding or displaying an intraframe compressed image, such as a quantization table, also may occur elsewhere in the compressed bitstream, eliminating the ability to access even intraframe compressed images nonlinearly.
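The consequence for nonlinear access can be sketched as follows, using an invented frame-type sequence: to reach an arbitrary image, decoding must begin at the nearest preceding intraframe image.

```python
# Hypothetical frame-type sequence in display order; only 'I' (intraframe)
# images are independently decodable.
frame_types = list("IBBPBBPBBIBBPBBP")

def nearest_entry_point(target):
    """Return the index of the last intraframe image at or before target --
    the point at which decoding must begin for nonlinear access."""
    for i in range(target, -1, -1):
        if frame_types[i] == "I":
            return i
    raise ValueError("no intraframe image precedes the target")
```

For example, reaching display position 5 requires starting at position 0, while position 11 can start at the I-frame at position 9. With state information scattered through the bitstream (as in some MPEG-2 streams), even this entry point may not suffice, which is the problem addressed below.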
Another problem arises with the use of some standards, such as MPEG-2, in which there are many options that may or may not be present in a coded bitstream. For example, an MPEG-2 formatted bitstream may include only I-frames, or I and P frames, or I, B and P frames. The order in which these frames are displayed also may be different from the order in which they are stored. Each compressed image also may be used to produce anywhere from zero to six fields. State information used to decode any particular image, including an I-frame, may also occur at any point in the bitstream. As a result, the ability to randomly access a particular field in an arbitrary MPEG-2 compliant bitstream may be determined by the actual format of the bitstream.
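The stored-versus-display ordering can be sketched with an invented sequence. In MPEG-style streams, B-frames are stored after both of the reference frames they depend on, so a decoder (or an indexing tool) must reorder:

```python
# Hypothetical stored (bitstream) order, each frame tagged with its
# display position:
stored = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4), ("B", 5)]

# Recover display order by sorting on the display position:
display = [ftype for ftype, _ in sorted(stored, key=lambda p: p[1])]
```

The display order here is I B B P B B P even though the bitstream stores I P B B P B B, so a byte offset in the stream cannot be inferred from a display position alone.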
Random access to arbitrary images, whether frames or fields, of a video segment compressed using both interframe and intraframe techniques may be enhanced by including state information, for decoding and display, at appropriate points in the compressed bitstream to enable random access to each intraframe compressed image. The state information may be inserted during compression or by processing the bitstream of compressed data.
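The bitstream-processing approach can be sketched as follows. This is a minimal illustration with an invented record format: the bitstream is modeled as a list of tagged records, and any state record (such as a quantization table) seen earlier is re-emitted immediately before each intraframe image that is not already preceded by it, making every intraframe image a self-contained entry point.

```python
# Hypothetical toy bitstream: "state" records (e.g., a quantization table)
# and "frame" records, in stored order.
bitstream = [("state", "qtable-A"), ("frame", "I"), ("frame", "P"),
             ("frame", "I"), ("state", "qtable-B"), ("frame", "I")]

def insert_state(records):
    """Re-emit the current decoding state before each intraframe image
    so that each I-frame can be decoded without scanning backward."""
    out, current_state = [], None
    for kind, value in records:
        if kind == "state":
            current_state = value
        elif value == "I" and current_state is not None:
            # Insert the state only if it does not already precede this frame.
            if not (out and out[-1] == ("state", current_state)):
                out.append(("state", current_state))
        out.append((kind, value))
    return out

repaired = insert_state(bitstream)
```

In this example the second I-frame, which originally relied on a quantization table carried from earlier in the stream, now has that table duplicated directly before it; the first and third I-frames already had their state adjacent and are left alone.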
An image index also may be generated that maps each temporal image in a decompressed output image sequence to an offset in the compressed bitstream of the data used to decode the image. The index may be created during compression or by processing the bitstream of compressed data.
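Building such an index by processing the compressed data can be sketched as follows, with an invented record format: each compressed frame is scanned in stored order, its byte offset accumulated, and an entry recorded under its temporal (display) position.

```python
# Hypothetical compressed stream, in stored order:
# (frame_type, display_position, compressed_size_in_bytes)
stored_frames = [("I", 0, 4096), ("P", 3, 2048), ("B", 1, 512), ("B", 2, 512)]

# Map each temporal image in the decompressed output sequence to the byte
# offset in the compressed bitstream of the data used to decode it.
index = {}
offset = 0
for ftype, display_pos, size in stored_frames:
    index[display_pos] = {"type": ftype, "offset": offset}
    offset += size
```

The index is keyed by display position, so a temporal query need not know the bitstream's storage order: display position 1 maps to offset 6144 here even though that B-frame was stored third.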
To access one or more samples starting at a specified point in time in a decompressed output image sequence, the index is accessed using the specified point in time to identify another sample in the decompressed output image sequence whose data is used to produce the specified sample. The identity of that other sample is then used to access the index again, to identify the location in the compressed data of the data used to produce the specified sample.
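The two-step lookup can be sketched as follows. The index layout, the offsets, and the "needs" field (recording which sample's data must be decoded first, itself for an I-frame) are all invented names for illustration; the steps are those described above: resolve the temporal position to the sample whose data is needed, then resolve that sample to a location in the compressed data.

```python
# Hypothetical index keyed by temporal (display) position:
index = {
    0: {"offset": 0,     "needs": 0},  # I-frame: decodable on its own
    1: {"offset": 6144,  "needs": 0},  # B-frame: decoding begins at image 0
    2: {"offset": 6656,  "needs": 0},
    3: {"offset": 4096,  "needs": 0},  # P-frame: also needs image 0
    4: {"offset": 7168,  "needs": 4},  # next I-frame
    5: {"offset": 11264, "needs": 4},
}

def seek(display_pos):
    """Map a temporal position to the compressed-data offset at which
    decoding must begin to produce that image."""
    entry = index[display_pos]["needs"]   # step 1: identify the other sample
    return index[entry]["offset"]         # step 2: locate its compressed data
```

A request for position 5 thus resolves through the I-frame at position 4 to offset 7168, while any of positions 0 through 3 resolve to offset 0.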