The present invention relates generally to signal processing systems, and more specifically to a refined system and method for allocating bits in an audio encoder such as an MPEG encoder.
Implementing an effective and efficient method of encoding audio data is often a significant consideration for designers, manufacturers, and users of contemporary electronic systems. The evolution of modern audio technology has necessitated corresponding improvements in sophisticated, high-performance audio encoding methodologies. For example, the advent of recordable audio compact disc devices typically requires an encoder-decoder (codec) system to receive and encode source audio data into a format (such as MPEG) that may then be recorded onto appropriate media using the compact disc device.
Many portions of the audio encoding processes are subject to strict technological standards that do not permit system designers to vary the data formats or encoding techniques. Other segments of the audio encoding process may not be altered because the encoded audio data must conform to certain specifications so that a standardized decoding device is able to successfully decode the encoded audio data. These foregoing constraints create substantial limitations for system designers who wish to improve the performance of an audio encoding device.
Transparent reproduction of audio data into the appropriate format is the ultimate goal of most audio encoding systems. The main factor that prevents an encoding system from attaining this goal is the introduction of artifacts into the audio data during the encoding process. In other words, an audio decoder must be able to decode the encoded audio data for transparent reproduction by an audio playback system without introducing any sound artifacts created by the encoding and decoding process.
Digital audio encoders typically process and compress sequential units of audio data called “frames.” A particularly objectionable sound artifact called a “discontinuity” may be created when successive frames of audio data are encoded with non-uniform amplitude or frequency components. Each frame contains a large amount of varying audio information. Therefore, treating the varying audio information contained within a frame as one large uniform unit can force some of the subtleties of the audio data to be lost. Additionally, treating each frame as a uniform unit can introduce larger discontinuities between successive frames. The discontinuities become readily apparent to the human ear whenever the encoded audio data is decoded and reproduced by an audio playback system.
Furthermore, to effectively encode audio data, the audio encoder must allocate a finite number of binary digits (bits) to the frequency components of the audio data, so that the encoding process achieves optimal representation of the source audio data. An efficient bit allocation technique which prevents discontinuity artifacts would thus provide significant advantages to an audio decoder device.
A paper entitled “A Real-Time PC-Based High Quality MPEG Layer II Codec” by Laurent Mainard, et al., presented at the 101st Convention of the Audio Engineering Society, Nov. 8-11, 1996, proposed restrictions on the allocated/non-allocated state switching based on the evolution of the scalefactors. However, this article did not account for all audio artifacts which may arise with input audio data.
The present invention relates to a system and method which refine the criteria used to improve the performance of audio signal processing systems. More specifically, the present invention provides a system and method by which the frequency and magnitude of artifacts added to audio signal data in an encoder device can be reduced. The input audio data is filtered into sub-bands. A masking threshold is generated for each sub-band. The bit allocation criteria are applied to each sub-band based on the signal to masking ratios (SMRs) of successive sub-bands. Thus, artifacts which may arise because of discontinuities between successive sub-bands may be prevented.
In the preferred embodiment of the present invention, the encoding device through which the audio signal passes includes a filter bank for filtering source audio data to produce frequency sub-bands, a psycho-acoustic modeler for calculating signal to masking ratios from the frequency sub-bands of the source audio data, and a bit allocator which uses the signal to masking ratios to assign a finite number of bits to represent the frequency sub-bands. In the absence of a significant event, the bit allocator performs a pre-bit allocation procedure to prevent artifacts or discontinuities in the encoded audio data.
In accordance with the present invention, an encoder filter bank initially divides frames of received source audio data into frequency sub-bands. In the preferred embodiment, the filter bank preferably generates thirty-two discrete sub-bands per frame, and then provides the sub-bands to a psycho-acoustic modeler and a bit allocator.
The psycho-acoustic modeler of the preferred embodiment receives the filtered audio data for the frequency sub-bands and uses it to generate signal to masking ratios, and then provides these signal to masking ratios to the bit allocator. Next, the bit allocator identifies the first sub-band of the first frame received from the filter bank, and allocates a finite number of bits to this sub-band using a bit allocation process. The bit allocator then advances to the next successive sub-band, which would be the first sub-band of the second frame of audio data.
The bit allocator then checks the new current sub-band for a significant event. In the preferred embodiment, the bit allocator detects a significant event whenever the difference in signal to masking ratios of successive sub-bands (the current sub-band and the immediately preceding sub-band) exceeds a selectable threshold value. Other criteria for determining a significant event are likewise contemplated for use with the present invention. The bit allocator may also compute a bit release time depending on the absolute value of the difference in signal to masking ratios. To further detect signal perturbations, the difference in signal to masking ratios may be filtered with a low-pass filter.
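By way of illustration only, the significant-event test described above may be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the function names, the 6.0 dB default threshold, the use of the absolute difference in the comparison, and the one-pole form of the low-pass filter with coefficient `alpha` are all hypothetical choices that the description does not specify.

```python
def is_significant_event(smr_current, smr_previous, threshold_db=6.0):
    """Return True when the SMR change between successive sub-bands exceeds the threshold.

    Assumption: the comparison uses the absolute difference, and the
    default threshold of 6.0 dB is an arbitrary illustrative value.
    """
    return abs(smr_current - smr_previous) > threshold_db


def smoothed_differences(smr_values, alpha=0.5):
    """Low-pass filter the successive SMR differences.

    Assumption: a one-pole (exponential) filter is used here; the
    description only states that a low-pass filter may be applied.
    """
    smoothed = []
    state = 0.0
    for previous, current in zip(smr_values, smr_values[1:]):
        difference = current - previous
        # Blend the new difference with the running filter state.
        state = alpha * difference + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed
```

In such a sketch the selectable threshold is exposed as a parameter, so the sensitivity of the event detection can be adjusted without changing the comparison itself.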
If the bit allocator detects a significant event in the current sub-band, then the bit allocator performs the bit allocation procedure referred to above. However, if the bit allocator does not detect a significant event in the current sub-band, then the bit allocator performs a pre-bit allocation procedure. In the preferred embodiment, when no event is detected, the bit allocator assigns to the current sub-band the same bit which was assigned to the immediately preceding sub-band during the bit allocation procedure.
The process of performing either the bit allocation procedure or the pre-bit allocation procedure is continued until no more bits remain which can be assigned to the sub-bands of the audio data. The present invention thus efficiently and effectively refines the criteria by which bits are allocated to audio data and thus further refines a method for preventing artifacts in an audio data encoder device.
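The overall procedure, in which each successive sub-band receives either a fresh allocation or the preceding allocation until the available bits are exhausted, may be illustrated by the following sketch. It rests on several assumptions that are not drawn from the description: `full_allocation` is a hypothetical placeholder for the bit allocation process (roughly one bit per 6.02 dB of SMR, capped at 15 bits), the threshold default is arbitrary, "the same bit" is read as the same number of bits, and the final allocation is simply truncated to whatever remains in the bit pool.

```python
import math


def full_allocation(smr_db, max_bits=15):
    """Hypothetical stand-in for the bit allocation process.

    Assumption: about one bit per 6.02 dB of SMR, clamped to [0, max_bits].
    """
    if smr_db <= 0.0:
        return 0
    return min(max_bits, math.ceil(smr_db / 6.02))


def allocate_bits(smrs_db, bit_pool, threshold_db=6.0):
    """Assign bits to successive sub-bands until the bit pool is exhausted."""
    allocations = []
    for index, smr in enumerate(smrs_db):
        if bit_pool <= 0:
            break  # no more bits remain to be assigned
        first = index == 0
        if first or abs(smr - smrs_db[index - 1]) > threshold_db:
            # Significant event (or first sub-band): bit allocation procedure.
            bits = full_allocation(smr)
        else:
            # No significant event: pre-bit allocation reuses the prior count.
            bits = allocations[-1]
        bits = min(bits, bit_pool)  # never spend more than remains
        allocations.append(bits)
        bit_pool -= bits
    return allocations
```

For example, with SMR values of 12, 13, 30 and 31 dB and a pool of 20 bits, the second and fourth sub-bands reuse the allocations of their predecessors, while the jump from 13 dB to 30 dB triggers a fresh allocation.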