The present invention generally relates to the field of video data compression, and more particularly to a method and apparatus for video data compression which uses dynamical systems, such as cellular automata (CA).
At the most primitive level, digital video is three-dimensional (3D) data consisting of the “flow” of two-dimensional images (i.e., “frames”) over time (FIG. 11). Thirty frames per second (fps) is the standard rate considered to define a fairly good quality video. Eighteen fps will be acceptable for certain situations. High definition video demands rates on the order of 60 fps.
The challenge involved in compressing video data is daunting. Consider a video frame of 320×240 pixels. For 24-bit color, each frame has 24×320×240=1,843,200 bits of information. Assuming 30 fps, each second of the video contains 1,843,200×30=55,296,000 bits (or 6,912,000 bytes) of data. If this video were to be transmitted over a Plain Old Telephone System (POTS) line through a 56 kilobits per second (kbps) modem, then the compression required to receive the video in real time is 55,296,000/(56×1,024)=964:1. Alternatively, storing one hour of this video uncompressed requires 6,912,000×60×60 bytes=23 Gbytes of storage space. A digital video stream with 640×480 frames will require four times the compression or storage outlined above. Therefore, the need for fast and effective compression is apparent.
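The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and dictionary keys are hypothetical, and it assumes 24 bits per pixel and a kilobit of 1,024 bits, as the division by 56×1,024 implies.

```python
def raw_video_requirements(width, height, bits_per_pixel, fps, link_bits_per_second):
    """Raw (uncompressed) data rates for a video stream (hypothetical helper)."""
    bits_per_frame = bits_per_pixel * width * height
    bits_per_second = bits_per_frame * fps
    bytes_per_second = bits_per_second // 8
    return {
        "bits_per_frame": bits_per_frame,
        "bits_per_second": bits_per_second,
        "bytes_per_second": bytes_per_second,
        # Compression needed to push the stream through the link in real time.
        "compression_ratio": bits_per_second / link_bits_per_second,
        # Storage needed for one hour of uncompressed video.
        "bytes_per_hour": bytes_per_second * 60 * 60,
    }


# 320x240 frames, 24-bit color, 30 fps, 56 kbps modem (1 kilobit taken as 1,024 bits).
small = raw_video_requirements(320, 240, 24, 30, 56 * 1024)
# Same stream with 640x480 frames.
large = raw_video_requirements(640, 480, 24, 30, 56 * 1024)
```

With these inputs, `small["bits_per_second"]` is 55,296,000 and `large["bits_per_second"]` is exactly four times that figure, which matches the statement about 640×480 frames.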
The best approach for dealing with the bandwidth limitation, and for reducing the huge storage requirement, is to compress the data. Since video is a conglomeration of individual picture frames, typical video compression methods are largely defined by: 1) the way the individual reference frames are encoded, and 2) the technique for relating/predicting intermediate frames given the information about the reference frames. Some of the most popular techniques for compressing image data combine transform approaches (e.g., the Discrete Cosine Transform, DCT) with psycho-visual techniques. The current industry standard is the so-called JPEG (Joint Photographic Experts Group) format, which is based on DCT.
Some recent inventions (e.g., U.S. Pat. No. 5,881,176 to Keith et al.) teach the use of the wavelet transform as the tool for image compression. The bit allocation schemes in the wavelet-based compression methods are generally based on the so-called embedded zero-tree concept taught by Shapiro (U.S. Pat. Nos. 5,321,776 and 5,412,741). Other image compression schemes that utilize wavelets as transform basis functions are described by Ferriere (U.S. Pat. No. 5,880,856), Smart et al. (U.S. Pat. No. 5,845,243), and Dobson et al. (U.S. Pat. No. 5,819,215).
Prior patents that specifically address video compression include those of Greene (U.S. Pat. No. 5,838,377), which uses the wavelet approach; and Agarwal (U.S. Pat. No. 5,729,691), who taught the use of a conglomeration of transforms (including DCT, Slant and Haar transforms) for video compression.
In order to achieve better compression/decompression of digital image data, the present invention makes use of a transform method that uses a dynamical system, such as cellular automata transforms (CAT). The evolving fields of cellular automata are used to generate “building blocks” for image data. The rules governing the evolution of the dynamical system can be adjusted to produce “building blocks” that satisfy the requirements of a low-bit-rate image compression process.
The concept of cellular automata transform (CAT) is taught by Lafe in U.S. Pat. No. 5,677,956, as an apparatus for encrypting and decrypting data. The present invention uses more complex dynamical systems that produce efficient “building blocks” for encoding video data. A special bit allocation scheme that also facilitates compressed data streaming is provided as an efficient means for encoding the quantized transform coefficients obtained after the cellular automata transform process.
According to the present invention there is provided a method of compressing digital video data comprising the steps of: (a) determining a multi-state dynamical rule set and an associated transform basis function; (b) receiving M frames of input video data; (c) performing a forward transform using the transform basis function to obtain transform coefficients suitable for reconstructing the M frames of input video data; and (d) encoding the transform coefficients.
According to another aspect of the present invention, there is provided an apparatus for compressing video data comprising: means for determining a multi-state dynamical rule set and an associated transform basis function; means for receiving M frames of input video data; means for performing a forward transform using the transform basis function to obtain transform coefficients suitable for reconstructing the M frames of input video data; and means for encoding the transform coefficients.
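As a rough illustration of the forward-transform step, the sketch below applies a fixed orthonormal basis along the time axis of M frames and then inverts it. It is not the transform of the invention: the cellular-automata-derived basis is not specified in this passage, so a 2-point Haar basis is used purely as a stand-in, and all names (`BASIS`, `forward`, `inverse`, `apply_along_time`) are hypothetical.

```python
import math

# Stand-in orthonormal basis A[i][k] for M = 2 frames (2-point Haar).
# Assumption: the CA-derived basis functions would take the place of this table.
S = 1 / math.sqrt(2)
BASIS = [[S, S],
         [S, -S]]


def forward(samples, basis):
    """Forward transform: c_k = sum_i f_i * A[i][k]."""
    n = len(samples)
    return [sum(samples[i] * basis[i][k] for i in range(n)) for k in range(n)]


def inverse(coeffs, basis):
    """Inverse transform, valid for an orthonormal basis: f_i = sum_k c_k * A[i][k]."""
    n = len(coeffs)
    return [sum(coeffs[k] * basis[i][k] for k in range(n)) for i in range(n)]


def apply_along_time(frames, basis, transform):
    """Apply `transform` to each pixel position across the M frames."""
    m, pixels = len(frames), len(frames[0])
    columns = [transform([frames[t][p] for t in range(m)], basis)
               for p in range(pixels)]
    return [[columns[p][k] for p in range(pixels)] for k in range(m)]


# Two tiny "frames" of three pixels each (flattened).
frames = [[10, 20, 30],
          [12, 20, 26]]
coeffs = apply_along_time(frames, BASIS, forward)
restored = apply_along_time(coeffs, BASIS, inverse)
```

In this toy case the second row of `coeffs` holds the scaled frame-to-frame differences, so a pixel that does not change between frames produces a zero coefficient there, which is the kind of redundancy an encoder can exploit.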
According to yet another aspect of the present invention, there is provided a method of embedded band-based threshold coding, comprising the steps of: (1) arranging transform coefficients associated with each of M frames of video data into sub-bands; (2) grouping the transform coefficients in all M frames belonging to the same sub-band; (3) establishing a target error equal to Emax; (4) determining a magnitude of the transform coefficient with the largest value in all M frames throughout all the sub-bands (Tmax); (5) establishing a Threshold=2^n greater than Tmax, where n is an integer; and (6) performing steps (a), (b) and (c) while Threshold is greater than Emax: (a) marching from the coarsest sub-band to the finest in all M frames, and determining the maximum transform coefficient (Tb) in each sub-band for all M frames; (b) if Tb is less than Threshold, encoding YES and moving on to the next sub-band, otherwise encoding NO and proceeding to check each transform coefficient in the sub-bands for all M frames, wherein (i) if the transform coefficient value is less than Threshold, encoding YES, otherwise encoding POSV if the transform coefficient is positive or NEGV if it is not, and (ii) decreasing the magnitude of the transform coefficient by Threshold; and (c) setting Threshold to Threshold/2.
According to yet another aspect of the present invention, there is provided an apparatus for embedded band-based threshold coding comprising: (1) means for arranging transform coefficients associated with each of M frames of video data into sub-bands; (2) means for grouping the transform coefficients in all M frames belonging to the same sub-band; (3) means for establishing a target error equal to Emax; (4) means for determining a magnitude of the transform coefficient with the largest value in all M frames throughout all the sub-bands (Tmax); (5) means for establishing a Threshold=2^n greater than Tmax, where n is an integer; and (6) means for performing steps (a), (b) and (c) while Threshold is greater than Emax:
(a) marching from the coarsest sub-band to the finest in all M frames, and determining the maximum transform coefficient (Tb) in each sub-band for all M frames;
(b) if Tb is less than Threshold, encoding YES and moving on to the next sub-band, otherwise encoding NO and proceeding to check each transform coefficient in the sub-bands for all M frames, wherein
(i) if the transform coefficient value is less than Threshold, encoding YES, otherwise encoding POSV if the transform coefficient is positive or NEGV if it is not, and
(ii) decreasing the magnitude of the transform coefficient by Threshold; and
(c) setting Threshold to Threshold/2.
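The threshold-coding steps recited above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: coefficients are assumed to be already grouped into non-empty sub-bands ordered coarsest first (each list holding that sub-band's coefficients from all M frames), Threshold is taken as the smallest power of two exceeding Tmax, Tb is taken as the largest remaining magnitude in the sub-band, and symbols are kept as plain strings rather than entropy coded. The decoder and all function names are hypothetical additions used only to check the encoder.

```python
YES, NO, POSV, NEGV = "YES", "NO", "POSV", "NEGV"


def initial_threshold(bands):
    """Smallest power of two strictly greater than Tmax (assumed choice of n)."""
    t_max = max(abs(c) for band in bands for c in band)
    threshold = 1
    while threshold <= t_max:
        threshold *= 2
    return threshold


def encode(bands, e_max):
    """Emit YES/NO/POSV/NEGV symbols while Threshold > Emax."""
    residual = [[abs(c) for c in band] for band in bands]  # remaining magnitudes
    threshold = initial_threshold(bands)
    symbols = []
    while threshold > e_max:
        for band, mags in zip(bands, residual):            # coarsest to finest
            if max(mags) < threshold:                      # Tb < Threshold
                symbols.append(YES)
                continue
            symbols.append(NO)
            for i, c in enumerate(band):
                if mags[i] < threshold:
                    symbols.append(YES)
                else:
                    symbols.append(POSV if c > 0 else NEGV)
                    mags[i] -= threshold                   # decrease magnitude
        threshold /= 2
    return symbols


def decode(symbols, band_sizes, threshold, e_max):
    """Hypothetical decoder: replays the passes and accumulates signed thresholds."""
    recon = [[0] * n for n in band_sizes]
    stream = iter(symbols)
    while threshold > e_max:
        for b, n in enumerate(band_sizes):
            if next(stream) == YES:
                continue
            for i in range(n):
                s = next(stream)
                if s == POSV:
                    recon[b][i] += threshold
                elif s == NEGV:
                    recon[b][i] -= threshold
        threshold /= 2
    return recon


bands = [[7, -3], [1, 0, -2, 0]]          # two sub-bands, coarsest first
sizes = [len(b) for b in bands]
t0 = initial_threshold(bands)

exact = decode(encode(bands, 0.5), sizes, t0, 0.5)   # integer data, Emax below 1
coarse = decode(encode(bands, 2), sizes, t0, 2)      # stops after the Threshold=4 pass
```

Because each pass leaves every remaining magnitude below the current Threshold, the reconstruction error is bounded by the last Threshold processed; for integer coefficients and Emax below 1 the round trip in this sketch is exact.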
According to still another aspect of the present invention, there is provided a method of embedded band-based threshold coding, comprising the steps of: (1) arranging transform coefficients associated with each of M frames of video data into sub-bands; (2) grouping the transform coefficients in all M frames belonging to the same sub-band; (3) determining a TargetSize equal to a target compression ratio CR; (4) determining a magnitude of the transform coefficient with the largest value in all M frames throughout all the sub-bands (Tmax); (5) establishing a Threshold=2^n greater than Tmax, where n is an integer; (6) setting OutputSize equal to zero; and (7) performing steps (a), (b) and (c) while OutputSize is less than TargetSize: (a) marching from the coarsest sub-band to the finest in all M frames, and determining the maximum transform coefficient (Tb) in each sub-band for all M frames; (b) if Tb is less than Threshold, encoding YES and moving on to the next sub-band, otherwise encoding NO and proceeding to check each transform coefficient in the sub-bands for all M frames, wherein (i) if the transform coefficient value is less than Threshold, encoding YES, otherwise encoding POSV if the transform coefficient is positive or NEGV if it is not, and (ii) decreasing the magnitude of the transform coefficient by Threshold; and (c) setting Threshold to Threshold/2.
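The size-driven variant can be sketched in the same way. The sketch assumes the loop is meant to run until the output reaches TargetSize, measures OutputSize as a count of emitted symbols (a real coder would more likely count bits after entropy coding), and takes TargetSize as a given number rather than deriving it from CR. The function name `encode_to_size` is hypothetical.

```python
YES, NO, POSV, NEGV = "YES", "NO", "POSV", "NEGV"


def encode_to_size(bands, target_size):
    """Run threshold passes until the symbol count reaches target_size.

    Assumptions: bands are non-empty lists ordered coarsest first, and the
    size check happens once per pass, so the output may overshoot the
    target by part of the final pass.
    """
    residual = [[abs(c) for c in band] for band in bands]
    t_max = max(abs(c) for band in bands for c in band)
    threshold = 1
    while threshold <= t_max:              # smallest power of two above Tmax
        threshold *= 2
    symbols = []                           # OutputSize starts at zero
    while len(symbols) < target_size:
        for band, mags in zip(bands, residual):
            if max(mags) < threshold:
                symbols.append(YES)
                continue
            symbols.append(NO)
            for i, c in enumerate(band):
                if mags[i] < threshold:
                    symbols.append(YES)
                else:
                    symbols.append(POSV if c > 0 else NEGV)
                    mags[i] -= threshold
        threshold /= 2
    return symbols


bands = [[7, -3], [1, 0, -2, 0]]
short = encode_to_size(bands, 5)           # stops after the Threshold=4 pass
longer = encode_to_size(bands, 10)         # runs one more pass
```

In this sketch `short` is a prefix of `longer`, which illustrates the embedded property: a stream produced for a larger budget can be truncated at a pass boundary to obtain the stream for a smaller one.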
An advantage of the present invention is the provision of a method and apparatus for digital video compression which provides improvements in the efficiency of digital media storage.
Another advantage of the present invention is the provision of a method and apparatus for digital video compression which provides faster data transmission through communication channels.
Still other advantages of the invention will become apparent to those skilled in the art upon a reading and understanding of the following detailed description, accompanying drawings and appended claims.