The invention relates in general to digital signal processing of audio signals, and more particularly, to efficiently encoding and decoding audio signals intended for human perception.
There is considerable interest in the field of signal processing in discovering methods that minimize the amount of information required to adequately represent a given signal. By reducing the amount of required information, signals may be transmitted over communication channels with lower bandwidth, or stored in less space. The informational requirements of a digital signal processing method, that is, the minimum amount of information it needs, may be defined by the minimum number of binary bits required to represent a signal.
A variety of generally adequate coding techniques for audio signals are known. Two examples of protocols that perform coding for audio signals are AC3 and MP3. Both the AC3 and MP3 protocols utilize a modified discrete cosine transform (MDCT) to encode and decode audio signals. A modified discrete cosine transform is an orthogonal lapped transform, based on the idea of time domain aliasing cancellation (TDAC). One characteristic of an orthogonal transform is that it utilizes the same transform window at both the encoder and the decoder.
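The time domain aliasing cancellation described above can be illustrated with a minimal sketch. It assumes a sine window, 50% overlapped blocks, and an inverse transform scaled by 2/N. None of these choices come from the AC3 or MP3 specifications, and the function names are hypothetical. Each block's inverse transform contains aliasing, which cancels when adjacent blocks are overlap-added using the same window at both ends.

```python
import math

def sine_window(N):
    """Length-2N sine window; satisfies w[n]^2 + w[n+N]^2 = 1 (assumed choice)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, window):
    """Forward MDCT: 2N windowed samples -> N coefficients."""
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [sum(window[n] * block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Inverse MDCT: N coefficients -> 2N windowed, time-aliased samples."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [window[n] * (2.0 / N) *
            sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]

def roundtrip(signal, N):
    """Transform 50%-overlapped blocks, invert them, and overlap-add."""
    w = sine_window(N)  # same window at encoder and decoder (orthogonal case)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        coeffs = mdct(signal[start:start + 2 * N], w)
        for n, v in enumerate(imdct(coeffs, w)):
            out[start + n] += v
    return out

N = 8
signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(6 * N)]
decoded = roundtrip(signal, N)
# Only samples covered by two overlapping blocks are fully reconstructed.
max_err = max(abs(a - b) for a, b in zip(signal[N:-N], decoded[N:-N]))
```

The first and last N samples are covered by only one block, so their aliasing is not cancelled; practical coders pad or prime the signal to handle the edges.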
In a typical design for coding of audio signals, encoders use high precision arithmetic to encode the audio signal. If an MDCT is used to transform the audio signals, high precision arithmetic is also required at the decoder to perform the inverse transform. Utilizing such high precision arithmetic at both the encoder and the decoder generally requires a significant amount of memory storage space and processing power. Another characteristic of the AC3 and MP3 protocols is that they utilize only two different window sizes for all MDCT transforms, which limits their flexibility in coding audio signals and tends to limit both coding performance and sound quality.
A typical design for coding of audio signals, such as MP3 and AC3, may also include a stereo processing subsystem, which attempts to exploit both irrelevancy and redundancy in a stereo signal. One potential shortcoming of MP3 and AC3 is that each uses separate, independent stereo processing subsystems to exploit irrelevancy and to exploit redundancy. Multiple approaches to stereo processing add to decoder complexity. Another limitation of the MP3 and AC3 stereo processing subsystems is that both use a simple sum/difference stereo encoding method to exploit redundancy, which is appropriate only if the dominant direction of the stereo signal is in the center.
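The limitation of sum/difference encoding noted above can be seen in a small sketch. The function names and the scaling by one half are illustrative assumptions, not taken from either protocol. For a centered source the difference channel vanishes and costs almost nothing to code; for a source panned hard to one side the sum and difference channels carry equal energy, so nothing is gained.

```python
import math

def sum_difference(left, right):
    """Convert left/right to sum (mid) and difference (side) channels."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def inverse_sum_difference(mid, side):
    """Recover left/right from the sum and difference channels."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

def energy(x):
    return sum(v * v for v in x)

source = [math.sin(0.2 * i) for i in range(64)]
silence = [0.0] * len(source)

# Centered source: both channels identical, so the difference channel is zero.
center_mid, center_side = sum_difference(source, source)

# Hard-left source: sum and difference channels are equally large.
left_mid, left_side = sum_difference(source, silence)

centered_side_energy = energy(center_side)
panned_energy_gap = abs(energy(left_mid) - energy(left_side))
```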
Furthermore, efficient lossless coding is essential for achieving high compression. AC3 uses a relatively simple coding system based on Pulse Code Modulation (PCM) coding. MP3 uses a more sophisticated system that employs adaptive Huffman coding and a number of codebooks designed to code symbols with different distributions. However, both the AC3 and MP3 lossless coding schemes have limitations which reduce their efficiency.
The invention provides a system and method for efficiently encoding and decoding audio signals intended for human perception that substantially eliminates or reduces certain disadvantages of previous digital signal processing systems and methods. The invention may include transforming the signal into a plurality of bi-orthogonal modified discrete cosine transform (BMDCT) frequency coefficients using a bi-orthogonal modified discrete cosine transform; quantizing the BMDCT frequency coefficients to produce a set of integer numbers which represent the BMDCT frequency coefficients; and encoding the set of integer numbers to lower the number of bits required to represent the BMDCT frequency coefficients.
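The quantizing step above, which maps frequency coefficients to a set of integers, can be sketched with a plain uniform quantizer. This is only an illustration under assumed values: the step size, the coefficient values, and the function names are hypothetical, and the text does not specify the quantizer actually used.

```python
def quantize(coeffs, step):
    """Map each coefficient to the nearest integer multiple of `step`."""
    return [int(round(c / step)) for c in coeffs]

def dequantize(indices, step):
    """Decoder side: rebuild approximate coefficients from the integers."""
    return [q * step for q in indices]

# Hypothetical coefficients: large at low frequencies, small at high ones.
coeffs = [12.7, -3.2, 0.4, 0.05, -0.6, 5.1]
step = 0.5

indices = quantize(coeffs, step)        # [25, -6, 1, 0, -1, 10]
approx = dequantize(indices, step)
max_error = max(abs(c - a) for c, a in zip(coeffs, approx))
```

The reconstruction error of such a quantizer is bounded by half the step size, so a larger step yields smaller integers (fewer bits after lossless coding) at the cost of more distortion.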
The invention may also include generating an inverse BMDCT transform of the audio signal using low precision arithmetic at the decoder. Using this approach, less memory storage is required to represent the inverse BMDCT transform window. As a result, the decoder will operate more efficiently.
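One way a bi-orthogonal transform can tolerate a low precision decoder window is sketched below. It assumes a symmetric sine window rounded to a few fractional bits for the decoder, and an encoder window derived from it so that overlap-add still reconstructs the signal. The derivation rule, the bit depth, and all names are assumptions made for illustration and are not taken from the text. With an inverse MDCT scaled by 2/N, reconstruction requires wa[n]*ws[n] + wa[N+n]*ws[N+n] = 1 and wa[2N-1-n]*ws[N+n] = wa[N-1-n]*ws[n] for n = 0..N-1.

```python
import math

def low_precision_synthesis_window(N, bits):
    """Symmetric length-2N sine window rounded to `bits` fractional bits."""
    scale = 1 << bits
    half = [round(math.sin(math.pi * (n + 0.5) / (2 * N)) * scale) / scale
            for n in range(N)]
    return half + half[::-1]

def matching_analysis_window(ws):
    """Encoder window that compensates for the rounded decoder window."""
    N = len(ws) // 2
    half = [ws[n] / (ws[n] ** 2 + ws[N - 1 - n] ** 2) for n in range(N)]
    return half + half[::-1]

def reconstruction_errors(wa, ws):
    """Largest deviation from the unit-gain and alias-cancellation conditions."""
    N = len(ws) // 2
    gain_err = max(abs(wa[n] * ws[n] + wa[N + n] * ws[N + n] - 1.0)
                   for n in range(N))
    alias_err = max(abs(wa[2 * N - 1 - n] * ws[N + n] - wa[N - 1 - n] * ws[n])
                    for n in range(N))
    return gain_err, alias_err

N, BITS = 8, 4
ws = low_precision_synthesis_window(N, BITS)
wa = matching_analysis_window(ws)

# Using the rounded window at both ends (orthogonal style) breaks unit gain.
same_window_errors = reconstruction_errors(ws, ws)
# Pairing it with the derived encoder window restores both conditions.
biorthogonal_errors = reconstruction_errors(wa, ws)
```

In this arrangement the decoder stores only small integers over a power-of-two scale, while the higher precision window lives in the encoder.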
The invention may further include the use of one of three different window types to perform each BMDCT transform, providing better sound quality on the receiving end of the process.
The invention may further provide a unified approach to exploiting stereo irrelevancy and stereo redundancy. This simplifies the decoder by implementing only one algorithm for stereo signal processing.
The invention may also examine future sub-frames to determine the best window type to utilize for each BMDCT transform.
The invention may also include the use of a very flexible and efficient lossless coding system that uses a combination of Huffman coding and Pulse Code Modulation (PCM) to code different parts of the spectrum. This would minimize the number of bits required to represent the audio signal.
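The idea of combining Huffman coding and PCM across the spectrum can be sketched as a per-band choice between the two. Everything specific here is an assumption for illustration: the single fixed codebook, its symbol weights, the sign-magnitude PCM format, the band contents, and the function names. Side information signalling the choice is ignored.

```python
import heapq
from itertools import count

def huffman_code_lengths(weights):
    """Return {symbol: code length in bits} for a Huffman code over `weights`."""
    tie = count()  # unique tie-breaker so symbol lists are never compared
    heap = [(w, next(tie), [s]) for s, w in weights.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in weights}
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (w1 + w2, next(tie), s1 + s2))
    return lengths

# Hypothetical codebook tuned for bands dominated by zeros and small values.
CODEBOOK = huffman_code_lengths({0: 40, 1: 8, -1: 8, 2: 2, -2: 2, 3: 1, -3: 1})

def huffman_bits(band, lengths):
    """Bits needed with the codebook, or None if a value is not covered."""
    if any(v not in lengths for v in band):
        return None
    return sum(lengths[v] for v in band)

def pcm_bits(band):
    """Bits needed with fixed-width sign-magnitude PCM."""
    width = max(abs(v) for v in band).bit_length() + 1  # magnitude + sign
    return width * len(band)

def choose_coder(band, lengths):
    """Pick whichever coder needs fewer bits for this band."""
    h, p = huffman_bits(band, lengths), pcm_bits(band)
    if h is not None and h <= p:
        return ("huffman", h)
    return ("pcm", p)

low_band = [25, -6, 10, 14]               # large values, outside the codebook
high_band = [0, 1, 0, 0, -1, 0, 0, 0]     # sparse small values
low_choice = choose_coder(low_band, CODEBOOK)
high_choice = choose_coder(high_band, CODEBOOK)
```

Under these assumptions the low band falls back to PCM because its values exceed the codebook range, while the sparse high band is cheaper with Huffman codes.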