The multimedia research community has traditionally focused its efforts on the compression, transport, storage and display of multimedia data. These technologies are fundamentally important for applications such as video conferencing and video-on-demand. The results of these efforts have made their way into many commercial products. For example, JPEG and MPEG, described below, are ubiquitous standards for image and audio/video compression.
There are, however, open problems in content-based retrieval and understanding, video production, and transcoding for heterogeneity and bandwidth adaptation. The lack of a high-performance library, or "toolkit", that can be used to build processing-intensive multimedia applications is hindering development in these areas. In particular, in the area of video-editing, large volumes of data need to be stored, accessed and manipulated in an efficient manner. Also, special hardware, such as MPEG accelerators, is needed for video processing applications. Solutions to the problems of storing video data include client-server applications and editing over the World Wide Web (Web). Web-based video-editing is particularly desirable because it allows access to data stored in many different repositories, and special hardware may be distributed. With Web-based video-editing, any computer with Internet access may be used to do video-editing because no special storage capability or processing capability is needed at the local level. The existing multimedia toolkits, however, do not have sufficiently high performance to make Web-based applications practical.
The data standards GIF, JPEG and MPEG dominate image and video data in the current state of the art. GIF (Graphics Interchange Format) is a bit-mapped graphics file format used commonly on the Web. JPEG (Joint Photographic Experts Group) is the internationally accepted standard for image data. JPEG is designed for compressing full color or gray-scale still images. For video data, including audio data, the international standard is MPEG (Moving Picture Experts Group). MPEG is actually a general reference to an evolving series of standards. For the sake of simplicity, the various MPEG versions will be referred to as the "MPEG standard" or simply "MPEG". The MPEG standard achieves a high rate of data compression by storing only the changes from one frame to another instead of an entire image.
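The idea of storing only the changes from one frame to the next can be illustrated with a minimal sketch in Python. This is a deliberate simplification and not the MPEG algorithm itself: frames are modeled as flat lists of pixel values, and the function names encode_delta and apply_delta are hypothetical names chosen for illustration.

```python
def encode_delta(prev_frame, curr_frame):
    """Return (index, new_value) pairs for the pixels that changed."""
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same number of pixels")
    return [
        (i, curr)
        for i, (prev, curr) in enumerate(zip(prev_frame, curr_frame))
        if prev != curr
    ]


def apply_delta(prev_frame, delta):
    """Rebuild the current frame from the previous frame and a delta."""
    frame = list(prev_frame)
    for index, value in delta:
        frame[index] = value
    return frame


# Two tiny 8-pixel "frames" that differ in a single pixel.
frame0 = [10, 10, 10, 10, 20, 20, 20, 20]
frame1 = [10, 10, 10, 10, 20, 25, 20, 20]

delta = encode_delta(frame0, frame1)   # only one entry needs to be stored
restored = apply_delta(frame0, delta)  # identical to frame1
```

Here a single changed pixel is stored in place of all eight, which is the source of the saving when successive frames are similar.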
The MPEG standard has four types of image coding for processing: the I-frame, the P-frame, the B-frame and the D-frame (the last of which comes from an early version of MPEG and is absent in later standards).
The I-frame (Intra-coded image) is self-contained, i.e. coded without any reference to other images. The I-frame is treated as a still image, and MPEG uses the JPEG standard to encode it. Compression in MPEG is often executed in real time and the compression rate of I-frames is the lowest within the MPEG standard. I-frames are used as points for random access in MPEG streams.
The P-frame (Predictive-coded frame) requires information from the previous I-frame in an MPEG stream, and/or all of the previous P-frames, for encoding and decoding. Coding of P-frames is based on the principle that areas of the image shift rather than change in successive images.
The B-frame (Bi-directionally predictive-coded frame) requires information from both the previous and the following I-frame and/or P-frame in the MPEG stream for encoding and decoding. B-frames have the highest compression ratio within the MPEG standard.
The D-frame (DC-coded frame) is intra-frame encoded. The D-frame is absent in more recent versions of the MPEG standard; however, applications are still required to deal with D-frames when working with the older MPEG versions. D-frames consist only of the lowest frequencies of an image. D-frames are used for display in fast-forward and fast-rewind modes. These modes could also be accomplished using a suitable order of I-frames.
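The dependencies among the frame types described above can be sketched as a short Python example. It is an illustration under stated assumptions, not part of any MPEG library: a stream is modeled as a string of frame-type letters in display order, only the nearest reference frames are listed, D-frames are treated as having no references because they are intra-frame encoded, and the names references and random_access_points are hypothetical.

```python
def references(frame_types):
    """Map each frame index to the indices of the frames it references.

    frame_types is a string such as "IBBPBBP" in display order.
    """
    # I-frames and P-frames serve as references for other frames.
    anchors = [i for i, t in enumerate(frame_types) if t in "IP"]
    refs = {}
    for i, t in enumerate(frame_types):
        earlier = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if t in "ID":
            refs[i] = []                        # self-contained
        elif t == "P":
            refs[i] = earlier[-1:]              # previous I- or P-frame
        elif t == "B":
            refs[i] = earlier[-1:] + later[:1]  # previous and following
        else:
            raise ValueError(f"unknown frame type: {t!r}")
    return refs


def random_access_points(frame_types):
    """I-frames are the points at which decoding can start."""
    return [i for i, t in enumerate(frame_types) if t == "I"]


stream = "IBBPBBP"
deps = references(stream)
entry_points = random_access_points(stream)
```

For the stream "IBBPBBP", the B-frame at index 1 depends on frames 0 and 3, so frame 3 must be decoded before frame 1 can be displayed, while only frame 0 is usable as a random access point.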
Video information encoding is accomplished in the MPEG standard using the DCT (discrete cosine transform). This technique represents waveform data as a weighted sum of cosines. The DCT is also used for data compression in the JPEG standard.
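The representation of data as a weighted sum of cosines can be made concrete with a minimal Python sketch of a one-dimensional DCT and its inverse. JPEG and MPEG apply a two-dimensional DCT to 8x8 blocks of samples; the one-dimensional, orthonormally scaled form below is a simplification chosen for brevity, and the function names dct and idct are illustrative only.

```python
import math


def _scale(k, n):
    """Orthonormal scaling factor for coefficient k of an n-point DCT."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(samples):
    """Forward DCT (type II): samples -> cosine weights."""
    n = len(samples)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
              for i, x in enumerate(samples))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse DCT: rebuild the samples as a weighted sum of cosines."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


# A constant block of samples: all of its energy lands in the first
# (lowest-frequency) coefficient, and the rest are numerically zero.
flat = [8.0] * 8
flat_coeffs = dct(flat)

# A varying signal survives the round trip up to floating-point error.
signal = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
rebuilt = idct(dct(signal))
```

Smooth image regions concentrate their energy in a few low-frequency coefficients in this way, which is what makes the transform useful for compression. It also relates to the D-frame described above, which keeps only the lowest frequencies.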
Currently, there are several options for making up for the lack of a high-performance multimedia toolkit, none of them adequate. First, code could be developed from scratch as needed to solve a particular problem, but this is difficult given the complexity of multimedia standards such as JPEG and MPEG. Second, existing code could be modified, but this results in systems that are complex, unmanageable, and generally difficult to maintain, debug, and reuse. Third, existing standard libraries, such as ooMPEG for the MPEG standard or the Independent JPEG Group (IJG) library for the JPEG standard, could be used, but the details of the functions in these libraries are hidden, and only limited optimizations can be performed.
It remains desirable to have a high-performance toolkit for multimedia processing.
It is an object of the present invention to provide a method and apparatus to enable client-server video-editing.
It is another object of the present invention to provide a method and apparatus to enable Web-based video-editing.