The invention generally relates to data compression, and more specifically relates to a form of entropy coding.
In a typical audio coding environment, data is represented as a long sequence of symbols which is input to an encoder. The input data is encoded by an encoder, transmitted over a communication channel (or simply stored), and decoded by a decoder. During encoding, the input is pre-processed, sampled, converted, compressed or otherwise manipulated into a form for transmission or storage. After transmission or storage, the decoder attempts to reconstruct the original input.
Audio coding techniques can be generally categorized into two classes, namely time-domain techniques and frequency-domain techniques. Time-domain techniques, e.g., ADPCM and LPC, operate directly in the time domain, while frequency-domain techniques transform the audio signals into the frequency domain, where the actual compression happens. The frequency-domain codecs can be further separated into subband and transform coders, although the distinction between the two is not always clear. Processing an audio signal in the frequency domain is motivated by both classical signal processing theories and human perception models (e.g., psychoacoustics). The inner ear, specifically the basilar membrane, behaves like a spectral analyzer and transforms the audio signal into spectral data before further neural processing proceeds.
The frequency-domain audio codecs often take advantage of the many kinds of auditory masking that occur in the human hearing system to modify the original signal and eliminate a great many details and redundancies. Since the human ear is not capable of perceiving these modifications, efficient compression is achieved. Masking is usually conducted in conjunction with quantization so that quantization noise can be conveniently “masked.” In modern audio coding techniques, the quantized spectral data are usually further compressed by applying entropy coding, e.g., Huffman coding.
Compression is required because a fundamental limitation of the communication model is that transmission channels usually have limited capacity or bandwidth. Consequently, it is frequently necessary to reduce the information content of input data in order to allow it to be reliably transmitted, if at all, over the communication channel. Over time, tremendous effort has been invested in developing lossless and lossy compression techniques for reducing the size of data to transmit or store. One popular lossless technique is Huffman encoding, which is a particular form of entropy encoding.
Entropy coding assigns code words to different input sequences, and stores all input sequences in a code book. The complexity of entropy encoding depends on the number m of possible values an input sequence X may take. For small m, there are few possible input combinations, and therefore the code book for the messages can be very small (e.g., only a few bits are needed to unambiguously represent all possible input sequences). For digital applications, the code alphabet is most likely a series of binary digits {0, 1}, and code word lengths are measured in bits.
If it is known that the input is composed of symbols having equal probability of occurring, an optimal encoding is to use equal length code words. However, it is not typical for every message in an input stream to be equally probable. In practice, certain messages are more likely than others, and entropy encoders take advantage of this to minimize the average length of code words among expected inputs. Traditionally, however, fixed length input sequences are assigned variable length codes (or conversely, variable length sequences are assigned fixed length codes).
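The effect of unequal probabilities on average code word length can be illustrated with a minimal Python sketch of Huffman code construction. The function names and the two example distributions are assumptions chosen for illustration and do not come from the described embodiments.

```python
import heapq
from itertools import count

def huffman_code_lengths(probs):
    """Return {symbol: code word length in bits} for a Huffman code."""
    if len(probs) == 1:
        # A single symbol still needs one bit to be written out.
        return {next(iter(probs)): 1}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(p, next(tiebreak), {sym: 0}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees; every symbol inside
        # them moves one level deeper, i.e. gains one bit.
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

def average_length(probs, lengths):
    """Expected number of bits per symbol."""
    return sum(probs[s] * lengths[s] for s in probs)

# Illustrative distributions (assumed values).
uniform = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
skewed = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

print(average_length(uniform, huffman_code_lengths(uniform)))  # 2.0
print(average_length(skewed, huffman_code_lengths(skewed)))    # 1.75
```

With equally probable symbols the code words all have the same length, whereas the skewed distribution yields a shorter average because the most probable symbol receives the shortest code word.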
The invention concerns using a variable-to-variable entropy encoder to code an arbitrary input stream. A variable-to-variable entropy encoder codes variable length input sequences with variable length codes. To limit code book size, entropy-type codes may be assigned to only probable inputs, and alternate codes used to identify less probable sequences.
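As a rough illustration of this idea, the following Python sketch maps variable length groups of symbols to variable length codes and falls back to an escape code for input not in the code book. The code book contents, the escape code, and the 8-bit raw symbol width are all hypothetical values chosen so that the codes form a prefix-free set. The sketch is not the encoder of the illustrated embodiments.

```python
# Hypothetical code book: probable symbol groups -> entropy-type codes.
CODE_BOOK = {"aaa": "0", "ab": "10", "b": "110"}
ESCAPE = "111"   # assumed alternate code announcing a raw symbol
RAW_BITS = 8     # assumed fixed width of a raw symbol (code point < 256)

def encode(text):
    """Greedily code the longest group found in the code book."""
    out, pos = [], 0
    max_len = max(len(group) for group in CODE_BOOK)
    while pos < len(text):
        for n in range(min(max_len, len(text) - pos), 0, -1):
            group = text[pos:pos + n]
            if group in CODE_BOOK:
                out.append(CODE_BOOK[group])
                pos += n
                break
        else:
            # Less probable input: escape code followed by the raw symbol.
            out.append(ESCAPE + format(ord(text[pos]), f"0{RAW_BITS}b"))
            pos += 1
    return "".join(out)

def decode(bits):
    """Invert encode(); relies on the codes being prefix-free."""
    inverse = {code: group for group, code in CODE_BOOK.items()}
    out, pos, current = [], 0, ""
    while pos < len(bits):
        current += bits[pos]
        pos += 1
        if current == ESCAPE:
            out.append(chr(int(bits[pos:pos + RAW_BITS], 2)))
            pos += RAW_BITS
            current = ""
        elif current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

For example, `encode("aaaabb")` produces `"010110"`, coding three groups of different lengths with three codes of different lengths, while a lone `"a"` is not in this code book and is emitted as the escape code followed by its 8-bit value.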
To optimize searching the code book, it may be organized into sections that are searched separately. For example, one arrangement is to group all stored input sequences in the book according to the first symbol of the input sequence. A hash encoding function, collection of pointers, or other method may be used to immediately jump to a given section of the code book. Each section may further be sorted according to the probability associated with each entry. For example, each section may be sorted with the most probable inputs located first in the section, thus increasing the likelihood that a match will be found quickly.
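One possible arrangement of such a sectioned code book is sketched below in Python, using a dictionary keyed by first symbol as the means of jumping to a section. The entries, probabilities, and function names are assumed for illustration only.

```python
from collections import defaultdict

# Hypothetical entries: (symbol group, probability, code).
ENTRIES = [
    ("ab", 0.30, "10"),
    ("aaa", 0.40, "0"),
    ("b", 0.20, "110"),
    ("ba", 0.10, "111"),
]

def build_sections(entries):
    """Group entries by first symbol, most probable entry first."""
    sections = defaultdict(list)
    for group, prob, code in entries:
        sections[group[0]].append((group, prob, code))
    for section in sections.values():
        section.sort(key=lambda entry: entry[1], reverse=True)
    return dict(sections)

def find_match(sections, text, pos):
    """Scan only the section for text[pos]; return the first entry that
    matches at pos as (group, code), or None if nothing matches."""
    for group, _prob, code in sections.get(text[pos], []):
        if text.startswith(group, pos):
            return group, code
    return None

sections = build_sections(ENTRIES)
print(find_match(sections, "abab", 0))  # ('ab', '10')
```

Because each section is scanned in probability order, this sketch returns the first matching entry, which is not necessarily the longest one. An implementation that prefers the longest grouping would need a different ordering or a second pass.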
Matching code book entries depends on the internal representation of the book. For example, in a tree structure, nodes may represent each character of the input such that reaching a leaf signifies the end and identification of a particular grouping of input symbols. In a table structure, a pattern matching algorithm can be applied to each table entry within the appropriate section. Depending on the implementation of the table and matching algorithms, searching may be facilitated by recognizing that only as many input symbols as the longest grouping in the code book section need to be considered. After finding a code book match, the corresponding entropy-type code can be output and the search repeated with the next symbol following the matched input.
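The tree-structured search and the repeat-after-match loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the code book contents are hypothetical, the tree is built from nested dictionaries, and a grouping is allowed to end at an interior node (marked with a sentinel) so that one grouping may be a prefix of another.

```python
END = object()  # sentinel key marking "a grouping ends at this node"

def build_trie(code_book):
    """Build a nested-dict tree with one level per input symbol."""
    root = {}
    for group, code in code_book.items():
        node = root
        for symbol in group:
            node = node.setdefault(symbol, {})
        node[END] = code
    return root

def longest_match(trie, symbols, pos):
    """Walk the tree from symbols[pos]; return (length, code) of the
    longest grouping found, or None. The walk stops as soon as no
    child matches, so it never looks past the longest stored grouping."""
    node, best, i = trie, None, pos
    while i < len(symbols) and symbols[i] in node:
        node = node[symbols[i]]
        i += 1
        if END in node:
            best = (i - pos, node[END])
    return best

def encode(trie, symbols):
    """Output one code per match, resuming after the matched input."""
    codes, pos = [], 0
    while pos < len(symbols):
        match = longest_match(trie, symbols, pos)
        if match is None:
            raise ValueError(f"no code book entry at position {pos}")
        length, code = match
        codes.append(code)
        pos += length
    return codes

# Hypothetical code book with a prefix-free set of codes.
trie = build_trie({"aaa": "0", "ab": "10", "b": "110", "a": "111"})
print(encode(trie, "aaaabba"))  # ['0', '10', '110', '111']
```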
Although the illustrated embodiments focus on encoding audio data, the input stream is expected to be any data stream, such as numbers, characters, or binary data which encodes audio, video or other types of data. For simplicity, the input stream is referenced herein as a series of symbols, where each “symbol” refers to the appropriate measurement unit for the particular input. The input stream may originate from local storage, or from intranets, the Internet, or streaming data (e.g., Microsoft's “NETSHOW”™ client/server streaming architecture).