The present invention generally relates to the field of audio compression, and more particularly to a method and apparatus for audio compression which operates on dynamical systems, such as cellular automata (CA).
The need frequently arises to transmit digital audio data across communications networks (e.g., the Internet; the Plain Old Telephone System, POTS; Local Area Networks, LAN; Wide Area Networks, WAN; Satellite Communications Systems). Many applications also require digital audio data to be stored on electronic devices such as magnetic media, optical disks and flash memories. The volume of data required to encode raw audio data is large. Consider stereo audio data sampled at 44,100 samples per second, with 16 bits used to encode each sample per channel. A one-hour recording of raw digital stereo music at that fidelity occupies about 606 Megabytes of storage space. Transmitting such an audio file over a 56 kilobits per second communications channel (e.g., the rate supported by most POTS lines through modems) takes over 24.6 hours.
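The storage and transmission figures above can be checked with a short calculation. The following is a minimal sketch, not part of the disclosed method; the function names are hypothetical, and it assumes binary units (1 Megabyte = 2^20 bytes, 1 kilobit = 1024 bits), which is the interpretation that reproduces the stated 606 Megabytes and 24.6 hours. With decimal units the transfer time comes out slightly longer.

```python
# Raw (uncompressed) stereo audio parameters taken from the text.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BYTES_PER_SAMPLE = 2      # 16 bits per sample
CHANNELS = 2              # stereo
DURATION_S = 3600         # one hour


def raw_audio_bytes(sample_rate_hz, bytes_per_sample, channels, duration_s):
    """Size in bytes of uncompressed PCM audio."""
    return sample_rate_hz * bytes_per_sample * channels * duration_s


def transfer_hours(num_bytes, bits_per_second):
    """Hours needed to send num_bytes over a channel of the given bit rate."""
    return num_bytes * 8 / bits_per_second / 3600


size_bytes = raw_audio_bytes(SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS, DURATION_S)
size_mib = size_bytes / 2**20                            # assumption: 1 MB = 2**20 bytes
hours_binary = transfer_hours(size_bytes, 56 * 1024)     # assumption: 1 kbit = 1024 bits
hours_decimal = transfer_hours(size_bytes, 56_000)       # decimal kilobits, for comparison

print(f"size: {size_bytes} bytes ({size_mib:.1f} MiB)")
print(f"transfer time at 56 kbit/s: {hours_binary:.2f} h (binary), {hours_decimal:.2f} h (decimal)")
```

Under either unit convention the conclusion is the same: sending one hour of raw stereo audio over a modem-class link takes roughly a full day.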
The best approach for dealing with the bandwidth limitation, and for reducing the large storage requirement, is to compress the audio data. The most popular technique for compressing audio data combines transform approaches (e.g., the Discrete Cosine Transform, DCT) with psycho-acoustic techniques. The current industry standard is the so-called MP3 format (or MPEG audio, developed by the International Organization for Standardization and the International Electrotechnical Commission, ISO/IEC), which uses the aforementioned approach. Various enhancements to the standard have been proposed. For example, Bolton and Fiocca, in U.S. Pat. No. 5,761,636, taught a method for improving the audio compression system by a bit allocation scheme that favors certain frequency subbands. Davis, in U.S. Pat. No. 5,699,484, taught a split-band perceptual coding system that makes use of predictive coding in frequency bands.
Other audio compression inventions that are based on variations of the traditional DCT transform and/or some bit allocation schemes (utilizing perceptual models) include those taught by Mitsuno et al. (U.S. Pat. No. 5,590,108), Shimoyoshi et al. (U.S. Pat. No. 5,548,574), Johnston (U.S. Pat. No. 5,481,614), Fielder and Davidson (U.S. Pat. No. 5,109,417), Dobson et al. (U.S. Pat. No. 5,819,215), Davidson et al. (U.S. Pat. No. 5,632,003), Anderson et al. (U.S. Pat. No. 5,388,181), Sudharsanan et al. (U.S. Pat. No. 5,764,698) and Herre (U.S. Pat. No. 5,781,888).
Some recent inventions (e.g., Dobson et al. in U.S. Pat. No. 5,819,215) teach the use of the wavelet transform as the tool for audio compression. The bit allocation schemes in the wavelet-based compression methods are generally based on the so-called embedded zero-tree concept taught by Shapiro (U.S. Pat. Nos. 5,321,776 and 5,412,741). Other audio compression schemes that utilize wavelets as basis functions are described in the paper by Painter and Spanias (1999); they include the work by Tewfik et al. (1993a,b,c); Black and Zeytinoglu (1995); Kudumakis and Sandler (1995a,b); and Boland and Deriche (1995, 1996).
In order to achieve better compression of digital audio data, the present invention makes use of a transform method that uses dynamical systems. In accordance with a preferred embodiment, the evolving fields of cellular automata are used to generate building blocks for audio data. The rules governing the evolution of the dynamical system can be adjusted to produce building blocks that satisfy the requirements of a low-bit-rate audio compression process.
The concept of cellular automata transform (CAT) is taught in U.S. Pat. No. 5,677,956 by Lafe, as an apparatus for encrypting and decrypting data. The present invention teaches the use of more complex dynamical systems that produce efficient building blocks for encoding audio data. The present invention also teaches a psycho-acoustic method developed specially for the sub-band encoding process arising from the cellular automata transform. A special bit allocation scheme that also facilitates audio streaming is taught as an efficient means for encoding the quantized transform coefficients obtained after the cellular automata transform process.
According to the present invention there is provided a method of compressing audio data comprising: determining a multi-state dynamical rule set and an associated transform basis function, receiving input audio data, and performing a forward transform using the transform basis function to obtain transform coefficients suitable for reconstructing the input audio data.
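The three steps of the method (determining a rule set and basis, receiving audio data, performing a forward transform) can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration and is not taken from this disclosure: the linear modular update rule, the weights, the seed row, the block length of 8, and all function names are hypothetical. In particular, Gram-Schmidt orthonormalization is swapped in here as a simple way to turn the evolved field into an invertible basis; it is not presented as the way the invention derives its basis functions.

```python
import math


def evolve_automaton(n_cells, n_steps, n_states, weights, seed_row):
    """Evolve a 1-D multi-state cellular automaton with wrap-around boundaries.

    Hypothetical rule: next = (w0*left + w1*centre + w2*right + w3) mod n_states.
    Returns the field as a list of rows, one row per time step.
    """
    w0, w1, w2, w3 = weights
    field = [list(seed_row)]
    for _ in range(n_steps - 1):
        prev = field[-1]
        field.append([
            (w0 * prev[i - 1] + w1 * prev[i] + w2 * prev[(i + 1) % n_cells] + w3) % n_states
            for i in range(n_cells)
        ])
    return field


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(field, tol=1e-6):
    """Gram-Schmidt over the field rows (an assumed construction).

    Rows that are linearly dependent on earlier ones are skipped; unit vectors
    are appended as fallback candidates so a full basis is always produced.
    """
    n = len(field[0])
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    basis = []
    for candidate in list(field) + identity:
        v = [float(x) for x in candidate]
        for _ in range(2):  # second pass limits round-off error
            for b in basis:
                proj = dot(v, b)
                v = [x - proj * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > tol:
            basis.append([x / norm for x in v])
        if len(basis) == n:
            break
    return basis


def forward_transform(samples, basis):
    """Transform coefficients: projection of the block onto each basis vector."""
    return [dot(samples, b) for b in basis]


def inverse_transform(coeffs, basis):
    """Reconstruct the block as a weighted sum of basis vectors."""
    n = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(n)]


# Step 1: determine a rule set and its associated basis (all values hypothetical).
N = 8
field = evolve_automaton(N, N, n_states=4, weights=(1, 2, 3, 1),
                         seed_row=[1, 0, 3, 2, 0, 1, 2, 3])
basis = orthonormal_basis(field)

# Step 2: receive one block of input audio samples (made-up values).
block = [0.0, 0.38, 0.71, 0.92, 1.0, 0.92, 0.71, 0.38]

# Step 3: forward transform, then check the coefficients reconstruct the input.
coeffs = forward_transform(block, basis)
restored = inverse_transform(coeffs, basis)
max_error = max(abs(a - b) for a, b in zip(block, restored))
print("max reconstruction error:", max_error)
```

Because the basis in this sketch is orthonormal, the inverse transform is just the transpose of the forward transform, so the coefficients reconstruct the block up to floating-point round-off. Changing the weights or the number of states changes the basis, which loosely mirrors the idea that the rule set can be adjusted to suit the data.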
An advantage of the present invention is the provision of a method and apparatus for audio compression which provides improvements in the efficiency of digital media storage.
Another advantage of the present invention is the provision of a method and apparatus for audio compression which provides faster data transmission through communication channels.
Still another advantage of the present invention is the provision of a method and apparatus for audio compression which utilizes psycho-acoustics.
Yet another advantage of the present invention is the provision of a method and apparatus for audio compression which facilitates audio streaming.
Still other advantages of the invention will become apparent to those skilled in the art upon a reading and understanding of the following detailed description, accompanying drawings and appended claims.