At the most primitive level, digital video is three-dimensional (3D) data consisting of the "flow" of two-dimensional images (i.e., "frames") over time (FIG. 11). Thirty frames per second (fps) is the standard rate considered to define fairly good quality video. Eighteen fps is acceptable in certain situations. High definition video demands rates on the order of 60 fps.
The challenge involved in compressing video data is daunting. Consider a video frame of 320×240 pixels. For 24-bit color, each frame has 24×320×240=1,843,200 bits of information. Assuming 30 fps, each second of the video contains 1,843,200×30=55,296,000 bits (or 6,912,000 bytes) of data. If this video were to be transmitted over a Plain Old Telephone Service (POTS) line through a 56 kilobits per second (kbps) modem, then the compression required to receive the video in real time is 55,296,000/(56×1,024)=964:1. Alternatively, storing one hour of this video uncompressed requires 6,912,000×60×60 bytes, or approximately 23 Gbytes. A digital video stream with 640×480 frames requires four times the compression ratio or storage outlined above. Therefore, the need for fast and effective compression is apparent.
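The arithmetic above can be checked with a short script. This is only an illustrative sketch: the function name `raw_video_stats` and its parameter names are hypothetical and are not part of the invention. It assumes the same conventions as the text, namely 1 kilobit = 1,024 bits for the modem rate and 1 Gbyte = 1,024³ bytes for storage.

```python
def raw_video_stats(width, height, bits_per_pixel=24, fps=30, modem_kbps=56):
    """Return raw (uncompressed) data figures for a video stream.

    Hypothetical helper for illustration only. Assumes 1 kilobit = 1,024 bits
    and 1 Gbyte = 1,024**3 bytes, matching the figures quoted in the text.
    """
    bits_per_frame = width * height * bits_per_pixel
    bits_per_second = bits_per_frame * fps
    bytes_per_second = bits_per_second // 8
    # Ratio needed to fit the raw stream into the modem's bandwidth.
    compression_ratio = bits_per_second / (modem_kbps * 1024)
    # One hour of raw video, expressed in Gbytes.
    gbytes_per_hour = bytes_per_second * 60 * 60 / 1024**3
    return {
        "bits_per_frame": bits_per_frame,
        "bits_per_second": bits_per_second,
        "bytes_per_second": bytes_per_second,
        "compression_ratio": compression_ratio,
        "gbytes_per_hour": gbytes_per_hour,
    }


small = raw_video_stats(320, 240)
large = raw_video_stats(640, 480)
print(small)
print(large)
```

Because the pixel count of a 640×480 frame is four times that of a 320×240 frame, every figure returned for `large` is four times the corresponding figure for `small`.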
The best approach for dealing with the bandwidth limitation, and for reducing the huge storage requirement, is to compress the data. Since video is a conglomeration of individual picture frames, typical video compression methods are largely defined by: 1) the way the individual reference frames are encoded, and 2) the technique for relating or predicting intermediate frames given the information about the reference frames. Some of the most popular techniques for compressing image data combine transform approaches (e.g., the Discrete Cosine Transform, DCT) with psycho-visual techniques. The current industry standard for still images is the so-called JPEG (Joint Photographic Experts Group) format, which is based on the DCT.
Some recent inventions (e.g., U.S. Pat. No. 5,881,176 to Keith et al.) teach the use of the wavelet transform as the tool for image compression. The bit allocation schemes of the wavelet-based compression methods are generally based on the so-called embedded zero-tree concept taught by Shapiro (U.S. Pat. Nos. 5,321,776 and 5,412,741). Other image compression schemes that utilize wavelets as transform basis functions are described by Ferriere (U.S. Pat. No. 5,880,856), Smart et al. (U.S. Pat. No. 5,845,243), and Dobson et al. (U.S. Pat. No. 5,819,215).
Prior patents that specifically address video compression include those of Greene (U.S. Pat. No. 5,838,377), which uses the wavelet approach, and Agarwal (U.S. Pat. No. 5,729,691), which teaches the use of a conglomeration of transforms (including the DCT, Slant and Haar transforms) for video compression.
In order to achieve better compression and decompression of digital image data, the present invention makes use of a transform method that uses a dynamical system, such as cellular automata transforms (CAT). The evolving fields of cellular automata are used to generate "building blocks" for image data. The rules governing the evolution of the dynamical system can be adjusted to produce "building blocks" that satisfy the requirements of a low-bit-rate image compression process.
The concept of the cellular automata transform (CAT) is taught by Lafe in U.S. Pat. No. 5,677,956 as an apparatus for encrypting and decrypting data. The present invention uses more complex dynamical systems that produce efficient "building blocks" for encoding video data. A special bit allocation scheme, which also facilitates streaming of the compressed data, is provided as an efficient means for encoding the quantized transform coefficients obtained after the cellular automata transform process.