This invention relates to hardware designs coupled with software-based algorithms for capture, compression, decompression, and playback of digital image sequences, particularly in an editing environment.
Video and audio source material editing systems employing digital techniques have been introduced over the last several years. One example is the Avid/1 Media Composer from Avid Technology, Inc., of Burlington, Mass. This media composer receives, digitizes, stores and edits video and audio source material. After the source material is digitized and stored, a computer such as an Apple Macintosh-based computer manipulates the stored digital material, and a pair of CRT monitors is used for displaying manipulated material and control information to allow editing to be performed. Later versions of the media composer included compression techniques to permit the display of full motion video from the digitized source material. Compression was achieved using a JPEG chip from C-Cube of Milpitas, Calif. That data compression is described more fully below. Although previous media composers could achieve full motion video from digitized sources, the compression degraded image quality below desirable levels. Further, the media composer lacked features which enhance the editing process.
The idea of taking motion video, digitizing it, compressing the digital datastream, and storing it on some kind of media for later playback is not new. RCA's Sarnoff labs began working on this in the early days of the video disk, seeking to create a digital rather than an analog approach. This technology has since become known as Digital Video Interactive (DVI).
Another group, led by Philips in Europe, has also worked on a digital motion video approach for a product they call CDI (Compact Disk Interactive). Both DVI and CDI seek to store motion video and sound on CD-ROM disks for playback in low cost players. In the case of DVI, the compression is done in batch mode, and takes a long time, but the playback hardware is low cost. CDI is less specific about the compression approach, and mainly provides a format for the data to be stored on the disk.
A few years ago, a standards-making body known as CCITT, based in France, working in conjunction with ISO, the International Standards Organization, created a working group to focus on image compression. This group, called the Joint Photographic Experts Group (JPEG), met for many years to determine the most effective way to compress digital images. They evaluated a wide range of compression schemes, including vector quantization (the technique used by DVI) and DCT (Discrete Cosine Transform). After exhaustive qualitative tests and careful study, the JPEG group picked the DCT approach, and also defined in detail the various ways this approach could be used for image compression. The group published a proposed ISO standard that is generally referred to as the JPEG standard. This standard is now in its final form, and is awaiting ratification by ISO, which is expected.
The JPEG standard has wide implications for image capture and storage, image transmission, and image playback. A color photograph can be compressed by 10 to 1 with virtually no visible loss of quality. Compression of 30 to 1 can be achieved with loss that is so minimal that most people cannot see the difference. Compression factors of 100 to 1 and more can be achieved while maintaining image quality acceptable for a wide range of purposes.
The creation of the JPEG standard has spurred a variety of important hardware developments. The DCT algorithm used by the JPEG standard is extremely complex. It requires converting an image from the spatial domain to the frequency domain, quantizing the various frequency components, and then Huffman coding the resulting components. The conversion from spatial to frequency domain, the quantization, and the Huffman coding are all computationally intensive. Hardware vendors have responded by building specialized integrated circuits to implement the JPEG algorithm.
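The first two stages named above can be illustrated with a minimal sketch. It is written for readability rather than speed, it omits the Huffman coding stage entirely, and it is not taken from any particular chip or product. The function names and the uniform quantization table of 16 are assumptions made for illustration only; practical JPEG tables use different values for different frequencies.

```python
import math

N = 8  # JPEG operates on 8x8 blocks of samples


def dct_2d(block):
    """Direct (slow) 2-D DCT-II of an NxN block, using the JPEG normalisation."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2 / N) * c(u) * c(v) * total
    return out


def quantize_block(coeffs, table):
    """Divide each frequency coefficient by its table entry and round."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


# Hypothetical inputs: a perfectly flat block and a uniform table.
flat_block = [[100] * N for _ in range(N)]
uniform_table = [[16] * N for _ in range(N)]

levels = quantize_block(dct_2d(flat_block), uniform_table)
print(levels[0])
```

For a flat block, all of the energy lands in the single lowest-frequency coefficient and every other quantized value is zero, which is why images with little detail compress so well. The four nested loops also suggest why the transform is costly in software and why dedicated hardware is attractive.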
One vendor, C-Cube of San Jose, Calif., has created a JPEG chip (the CL550B) that not only implements the JPEG standard in hardware, but can process an image with a resolution of, for example, 720×488 pixels (CCIR 601 video standard) in just 1/30th of a second. This means that the JPEG algorithm can be applied to a digitized video sequence, and the resulting compressed data can be stored for later playback. The same chip can be used to compress or decompress images or image sequences. The availability of this JPEG chip has spurred computer vendors and system integrators to design new products that incorporate the JPEG chip for motion video. However, the implementation of the chip in a hardware and software environment capable of processing images with a resolution of 640×480 pixels or greater at a rate of 30 frames per second in an editing environment introduces multiple problems.
For high quality images, a data size of 15-40 Kbytes per frame is needed for images at 720×488 resolution. This means that 30 frames per second video will have a data rate of 450 to 1200 Kbytes per second. For data coming from a disk storage device, this is a high data rate, requiring careful attention to ensure a working system.
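The data rate figures above follow from a single multiplication, shown here as a short sketch. The function name is an assumption introduced for illustration.

```python
def data_rate_kbytes_per_sec(frame_kbytes, frames_per_sec=30):
    """Sustained rate needed to deliver compressed frames in real time."""
    return frame_kbytes * frames_per_sec


low = data_rate_kbytes_per_sec(15)   # smallest high-quality frame size
high = data_rate_kbytes_per_sec(40)  # largest high-quality frame size
print(low, high)  # prints: 450 1200
```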
The most common approach in prior systems for sending data from a disk to a compression processor is to copy the data from disk into the memory of the host computer, and then to send the data to the compression processor. In this method, the computer memory acts as a buffer against the different data rates of the compression processor and the disk. This scheme has two drawbacks. First, the data is moved twice, once from the disk to the host memory, and another time from the host memory to the compression processor. For a data rate of 1200 Kbytes per second, this can seriously tax the host computer, allowing it to do little else but the data copying. Furthermore, the Macintosh computer, for example, cannot read data from the disk and copy data to the compression processor at the same time. The present invention provides a compressed data buffer specifically designed so that data can be sent directly from the disk to the
With the JPEG algorithm, as with many compression algorithms, the amount of data that results from compressing an image depends on the image itself. An image of a lone seagull against a blue sky will take much less data than a cityscape of brick buildings with lots of detail. Therefore, it becomes difficult to know where a frame starts within a data file that contains a sequence of frames, such as a digitized and compressed sequence of video. This creates particular problems in the playback from many files based on edit decisions. With fixed size compression approaches, one can simply index directly into the file by multiplying the frame number by the frame size, which results in the offset needed to start reading the desired frame. When the frame size varies, this simple multiplication approach no longer works. One needs to have an index that stores the offset for each frame. Creating this index can be time consuming. The present invention provides an efficient indexing method.
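The difference between the two lookup schemes can be sketched as follows. This illustrates only the general idea of a per-frame offset index and is not the indexing method of the invention. The function names and frame sizes are hypothetical.

```python
from itertools import accumulate


def fixed_size_offset(frame_number, frame_size):
    """Fixed-size frames: the offset is a single multiplication."""
    return frame_number * frame_size


def build_offset_index(frame_sizes):
    """Variable-size frames: running sum of sizes, one entry per frame.

    index[i] is the byte offset of frame i; the final entry is the
    total length of the data, so index[i + 1] - index[i] is the size
    of frame i.
    """
    return [0, *accumulate(frame_sizes)]


def frame_span(index, frame_number):
    """Return (offset, length) of a frame using the prebuilt index."""
    start = index[frame_number]
    return start, index[frame_number + 1] - start


# Hypothetical compressed frame sizes in bytes.
sizes = [18000, 31000, 22000, 40000]
index = build_offset_index(sizes)

print(fixed_size_offset(2, 20000))  # prints: 40000
print(frame_span(index, 2))         # prints: (49000, 22000)
```

Building such an index requires knowing the size of every frame, which is why creating it after the fact for a long sequence can be time consuming.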
It is often desirable to vary the quality of an image during compression in order to optimize the degree of data compression. For example, during some portions of a sequence, detail may not be important, and quality can be sacrificed by compressing the data to a greater degree. Other portions may require greater quality, and hence this greater degree of compression may be unsuitable. In prior implementations of the JPEG algorithm, quality is adjusted by scaling the elements of a quantization table (discussed in detail hereinbelow). If these elements are scaled during compression, they must be correspondingly re-scaled during decompression in order to obtain a suitable image. This re-scaling is cumbersome to implement and can cause delays during playback. According to one aspect, the present invention is a method that allows for quality changes during compression to enable optimum data compression for all portions of a sequence, while allowing playback with a single quantization table.
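The re-scaling requirement of the prior approach can be seen in a small sketch. It shows only the prior behavior described above, not the method of the invention. The four-entry table, the coefficient values, the scale factor, and all function names are hypothetical.

```python
def scale_table(table, factor):
    """Scale every quantization step; larger steps mean coarser quality."""
    return [max(1, round(q * factor)) for q in table]


def quantize(coeffs, table):
    """Compression side: divide each coefficient by its step and round."""
    return [round(c / q) for c, q in zip(coeffs, table)]


def dequantize(levels, table):
    """Decompression side: multiply each level by its step."""
    return [lv * q for lv, q in zip(levels, table)]


base_table = [16, 11, 10, 16]        # hypothetical quantization steps
coeffs = [800.0, -132.0, 60.0, 32.0]  # hypothetical frequency coefficients

scaled = scale_table(base_table, 2.0)        # lower quality for this portion
levels = quantize(coeffs, scaled)

matched = dequantize(levels, scaled)         # decoder re-scales its table
mismatched = dequantize(levels, base_table)  # decoder keeps the base table

print(matched)     # prints: [800, -132, 60, 32]
print(mismatched)  # prints: [400, -66, 30, 16]
```

When the decoder does not apply the same scaling, every reconstructed coefficient is off by the scale factor, so in the prior approach a playback system must track and apply the table changes as the sequence plays.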