The present invention relates generally to the field of lossless (i.e., entropy) coding for storage or transmission of, for example, source coded speech, audio, or video signals, and more particularly to a method and apparatus for performing entropy coding based on variable-size symbol vectors for achieving higher compression ratios.
Source coding is an essential step in modern digital communication networks and systems. Specifically, it is used to convert sources like speech, audio, video and many other analog waveforms (signals) into a digital representation (i.e., a sequence of bits), and may further compress this representation into a shorter bit stream. This digital representation (i.e., the bit stream) may then be used either for purposes of efficient storage for subsequent decoding and use, or for purposes of efficient transmission for decoding at the other end of a communications channel. The source encoder which creates and/or compresses the digital representation of the signal, and the decoder which ultimately synthesizes a reconstruction of the original signal from the (possibly compressed) digital representation, are jointly designed to meet certain application-dependent performance criteria. Most notably, the decoded source should advantageously be of a satisfactory quality (e.g., to the ear or the eye), while the information rate (i.e., the number of bits used in the representation of a given portion of the original signal) is at or below the capacity of the storage or transmission medium. Other important criteria may include those related to, for example, robustness, delay, complexity, price, etc.
The encoding process is often carried out in two steps. The first step comprises "lossy" transformation of the analog data into discrete symbols defined over a finite alphabet. By "lossy" it is meant that there is information content which is contained in the original signal but not in the digital representation (i.e., the sequence of discrete symbols) produced. The second step comprises a "lossless" compression of the discrete symbol data, which amounts to describing exactly the same data (i.e., with no loss of information content), but with fewer symbols (i.e., fewer bits). This second step is commonly referred to as entropy coding ("EC") because it attempts to reduce the information content to that inherent in the symbol source, as measured by the source entropy. Often the distinction between these two steps is vague or impossible to draw. (See, e.g., "Entropy-constrained vector quantization" by P. A. Chou et al., IEEE Trans. Acoust., Sp. and Sig. Proc. 37(1), pp. 31-42, January 1989.) Sometimes, one step is entirely missing, as in certain standardized speech coders, such as, for example, in International Telecommunication Union (ITU) standards G.728 and G.729, where no entropy coding is used, or in the case of conventional file compression techniques, such as, for example, the Ziv-Lempel technique and its derivatives, where no lossy coding is used. (ITU standards G.728 and G.729, as well as the Ziv-Lempel file compression technique and its derivatives, are each fully familiar to those of ordinary skill in the art.) In the simplest entropy coding applications, the source symbols are processed individually (referred to as "per-letter EC"). This may, for example, be accomplished using techniques such as Huffman coding or arithmetic coding, each of which is fully familiar to those of ordinary skill in the art.
More complex EC coders parse the source output sequence into fixed or variable-size strings or vectors. These new vector-symbols are then losslessly coded. This approach is referred to as vector entropy coding ("VEC") as opposed to the simpler per-letter EC. The most common use of VEC is in the Ziv-Lempel family of file compression coders, although its use in other coders has also been proposed. (See, e.g., "Generalized Tunstall Codes for Sources With Memory" by S. A. Savari et al., IEEE Trans. IT, Vol. 43 No. 2, pp. 658-667, March 1997.) The advantage of VEC is in its use of inter-symbol dependencies to achieve high compression ratios, which results from the fact that the entropy of the combined symbols, measured per elementary symbol, is never greater than that of the elementary symbols and is most of the time significantly lower. The longer the vector-symbols (i.e., the higher the number of elementary symbols included in a given vector), the higher the coding efficiency that can be achieved, compared to that of per-letter EC. On the other hand, the coding complexity of VEC often grows exponentially with the vector size and may quickly become unmanageable. Moreover, VEC coders usually require a considerable "look-ahead" of long future data strings before coding can be performed on a given vector. In communications applications, this may translate into a long coding delay and large data buffering requirements, which impair communication efficiency. Therefore, VEC has been primarily employed for off-line applications like file compression (e.g., for purposes of storage).
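The entropy advantage of combining symbols can be illustrated with a small numerical sketch. The following Python fragment is only an illustration under stated assumptions: the function name entropy_per_symbol and the toy sequence are hypothetical and do not appear in the cited references, and the entropy is estimated empirically from block frequencies rather than from a true source model.

```python
from collections import Counter
from math import log2


def entropy_per_symbol(sequence, block):
    """Empirical entropy, in bits per elementary symbol, when the sequence
    is parsed into non-overlapping blocks of `block` symbols each."""
    blocks = [
        tuple(sequence[i:i + block])
        for i in range(0, len(sequence) - block + 1, block)
    ]
    counts = Counter(blocks)
    total = len(blocks)
    # Entropy of the block (vector-symbol) distribution, in bits per block.
    h_block = -sum((c / total) * log2(c / total) for c in counts.values())
    # Normalize so that different block sizes can be compared fairly.
    return h_block / block


# Toy source with strong inter-symbol dependency (an assumption for the demo).
toy = "aabb" * 100
per_letter = entropy_per_symbol(toy, 1)   # each letter coded on its own
per_pair = entropy_per_symbol(toy, 2)     # letters coded as pairs
print(per_letter, per_pair)
```

For this toy sequence the per-letter estimate is 1 bit per symbol, whereas the pairwise estimate is 0.5 bit per symbol, because only the pairs "aa" and "bb" ever occur.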
For the above reasons, on-line communications applications, in which fast and relatively inexpensive processing is usually required, have most typically employed per-letter EC coding, although various techniques have been proposed to make this per-letter encoding as efficient as possible. (See, e.g., "Lossless Coding for Audio Discs" by P. Craven et al., J. Audio Eng. Soc., Vol. 44 No. 9, pp. 706-720, September 1996.) In some modern audio coders, however, an attempt is made to employ the VEC concept in a simple, restricted way. For example, the quantizer symbols which are generated by the lossy compression portion of the audio coder may be grouped in vectors of various predetermined sizes, and these new "composite" symbols may then, for example, be Huffman coded. (See, e.g., "Noiseless Coding of Quantized Spectral Components in MPEG-2 Advanced Audio Coding" by S. R. Quackenbush et al., IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA '97, Session 3, Paper No. 3, 1997, which groups quantizer symbols in vectors of size 1, 2 or 4, and Huffman codes the new composite symbols.)
It would be advantageous if an enhanced VEC technique were available in which higher compression rates than those of prior art techniques were achieved while maintaining a reasonable level of complexity. In particular, a technique which provided for variable-size vector entropy coding (referred to herein as "VSVEC") in an easy and efficient manner would be highly desirable.
In accordance with the present invention, an illustrative method and apparatus is provided in which the simplicity of radix arithmetic may be advantageously employed to effectuate a low-complexity variable-size vector entropy coding (VSVEC) technique which achieves high compression rates, by coding each (variable-size) vector with use of a calculated "combined" symbol. In particular, the illustrative technique in accordance with the present invention advantageously permits variable-size vectors to be entropy coded based on the particular symbols in the vector, based on a size of a set from which combined symbols are to be selected for coding (e.g., an "alphabet" size), and, in accordance with certain illustrative embodiments, based either on a variable, determined numerical radix value, or, alternatively, on a fixed, predetermined numerical radix value. As such, the vector size may be as small as a single symbol, or may be as large as an entire frame of a source (e.g., speech, audio or video) signal (which may, for example, comprise several hundred or even several thousand symbols).
Specifically, the encoding technique of the present invention uses the numerical values of a subsequence of individual symbols to be coded, together with a size of a set of combined symbols, in order to determine the length (i.e., the number of included symbols) of a first subsequence of symbols, which is then coded with use of a single (a first) combined symbol selected from the set; and uses the numerical values of another subsequence of individual symbols to be coded, together with the size of the set of combined symbols, in order to determine the length (i.e., the number of included symbols) of a second subsequence of symbols, which is then also coded with a single (a second) combined symbol. Moreover, the number of symbols in (i.e., the length of) the first subsequence and the number of symbols in (i.e., the length of) the second subsequence are unequal; that is, the subsequences of symbols which are combined and coded together are of a variable length.
In accordance with certain illustrative embodiments of the present invention, the lengths of the first subsequence of symbols and of the second subsequence of symbols may also be based on a first numerical radix and a second numerical radix, respectively. The first radix and the second radix may be equal and fixed at a predetermined value, or they may each be determined based on the corresponding subsequence of symbols to be coded.
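By way of illustration only, the following Python sketch shows one way in which radix arithmetic might be used to parse a symbol sequence into variable-length subsequences, each represented by one combined symbol. It is a minimal sketch under stated assumptions and is not asserted to be the specific procedure of any embodiment: the function name encode_variable_size, the greedy stopping rule, the least-significant-digit-first packing order, and the use of a fixed radix are all assumptions made for the example, and the subsequent entropy coding of the resulting (length, combined symbol) pairs is omitted.

```python
from typing import List, Tuple


def encode_variable_size(symbols: List[int], set_size: int,
                         radix: int) -> List[Tuple[int, int]]:
    """Greedily parse `symbols` into subsequences and pack each one as a
    base-`radix` number that stays below `set_size`.

    Returns a list of (count, combined) pairs, where `count` is the number
    of symbols in the subsequence and `combined` is its combined symbol.
    """
    if radix < 2 or set_size < radix:
        raise ValueError("need radix >= 2 and set_size >= radix")
    pairs = []
    i = 0
    while i < len(symbols):
        combined, weight, count = 0, 1, 0
        while i < len(symbols):
            s = symbols[i]
            if not 0 <= s < radix:
                raise ValueError("symbol value outside [0, radix)")
            candidate = combined + s * weight
            # Stop growing the subsequence once the combined value would
            # fall outside the set of combined symbols. The first symbol
            # always fits because s < radix <= set_size.
            if candidate >= set_size:
                break
            combined = candidate
            weight *= radix
            count += 1
            i += 1
        pairs.append((count, combined))
    return pairs


# Hypothetical input: symbols in 0..3, fixed radix 4, 64 combined symbols.
print(encode_variable_size([1, 0, 2, 3, 3, 0, 0, 1], set_size=64, radix=4))
# prints [(3, 33), (4, 15), (1, 1)]
```

In this example the three subsequences contain 3, 4 and 1 symbols, respectively, so that the subsequence length depends both on the symbol values and on the size of the set of combined symbols. Small-valued symbols allow a longer subsequence to fit within the same set.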
Correspondingly, the decoding technique of the present invention determines from the bit stream the number of symbols which have been coded with use of a single combined symbol (i.e., the length of a coded subsequence), and based on that number, on the combined symbol itself, and on a given radix (which may be fixed or which may also be determined from the bit stream), determines the values of the individual symbols which were coded together as the combined symbol.
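The recovery of the individual symbol values from a combined symbol can likewise be sketched with radix arithmetic. The following Python fragment is an illustrative assumption rather than a definitive implementation: the function name decode_combined and the least-significant-digit-first ordering are hypothetical, and the number of symbols, the combined symbol, and the radix are assumed to have already been recovered from the bit stream.

```python
from typing import List


def decode_combined(combined: int, count: int, radix: int) -> List[int]:
    """Recover `count` symbol values from one combined symbol by repeated
    division by `radix` (least-significant digit first)."""
    if radix < 2 or count < 0 or combined < 0:
        raise ValueError("invalid combined symbol, count, or radix")
    values = []
    for _ in range(count):
        combined, digit = divmod(combined, radix)
        values.append(digit)
    if combined != 0:
        # More digits remain than the stated number of symbols allows.
        raise ValueError("combined symbol does not fit in `count` digits")
    return values


# Hypothetical decoded pair: 3 symbols, combined symbol 33, radix 4.
print(decode_combined(33, 3, 4))  # prints [1, 0, 2]
```

The number of symbols is needed in addition to the combined symbol because trailing zero-valued symbols do not change the combined value. For example, with a radix of 4, the combined symbol 15 decodes to [3, 3] for a length of 2 and to [3, 3, 0, 0] for a length of 4.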
More specifically, the present invention comprises a method and apparatus for performing entropy coding of a sequence of symbols, each of said symbols having a numerical value associated therewith, the method or apparatus comprising steps or means for identifying a first subsequence of said symbols, the first subsequence of said symbols having a first number of symbols included therein, the first number of symbols being based on the numerical values associated with said symbols included in said first subsequence of said symbols and further based on a size of a set of combined symbols; coding the first subsequence of symbols with use of a first combined symbol representative of said first subsequence of symbols and with use of a symbol representative of the first number of symbols, wherein said first combined symbol is a member of said set of combined symbols; identifying a second subsequence of said symbols, the second subsequence of said symbols having a second number of symbols included therein, the second number of symbols being based on the numerical values associated with said symbols included in said second subsequence of said symbols and further based on said size of said set of combined symbols, wherein said second number of symbols differs from said first number of symbols; and coding the second subsequence of symbols with use of a second combined symbol representative of said second subsequence of symbols and with use of a symbol representative of the second number of symbols, wherein said second combined symbol is a member of said set of combined symbols.
In addition, the present invention comprises a method and apparatus for decoding a bit stream, the bit stream comprising an entropy encoding of an original sequence of symbols, each of said symbols in said original sequence having a numerical value associated therewith, the method or apparatus comprising steps or means for decoding a portion of said bit stream which comprises a coded symbol representative of a number of symbols which are included in a given subsequence of said original sequence and which have been coded with use of a combined symbol, thereby determining said number of symbols which have been combined in said subsequence; decoding a portion of said bit stream which comprises said coded combined symbol representative of said subsequence of symbols in said original sequence, said subsequence containing said determined number of symbols; and determining said numerical values associated with said symbols in said subsequence based on the decoded portion of said bit stream which comprises said coded combined symbol, and further based on said determined number of symbols and on a numerical radix.