The present invention relates to the field of audio processing, and more particularly to audio compression.
To provide a desired loudness, it may be useful to apply gain to an audio signal if the magnitude of the audio signal falls below a threshold. On the other hand, when it appears that the magnitude may exceed a predetermined clipping level, the magnitude of the output signal may be clipped for safety reasons, such as limiting the sound pressure level. Compression of audio input signals may thus be used to reduce the dynamic range of an audio input signal, while reducing clipping.
Conventional compressors may have a gain adjusting device and a control system which controls the gain as a function of the input signal to reduce the level differences between high and low intensity portions of an audio signal. A time varying gain, that varies depending on the amplitude of the input signal, may then be used to compress the input signal. Therefore, a compressor may be used to reduce the gain in the audio path when the signal level exceeds a predetermined threshold. This can reduce the likelihood that the output signal may exceed a predetermined clipping level.
In compressing the audio signal, however, it may be desirable to avoid hard-clipping or “flat-topping” on peaks of the audio signal. Hard-clipping typically occurs when the magnitude of the output signal exceeds the clipping level. Hard-clipping can, however, undesirably decrease the intelligibility of audio signals. For example, in a noisy environment, if hard-clipping occurs in the receive path, the user may not be able to find a comfortable volume setting since a setting high enough to hear a remote voice over the noise may cause unacceptable distortion. Providing sufficient loudness, while preserving intelligibility, may make it desirable to compress rather than clip a signal to reduce and/or eliminate distortion.
According to conventional compression techniques, input and output levels can be determined based on RMS detection, peak detection, or a combination of RMS and peak detection. In each case, the input and output levels can be based on past measurements of the input. These past measurements of the input can then be used to control the gain of a linear amplifier, thereby providing compression. As a result, conventional compressors may base compression only on past information about the input signal. Moreover, since conventional compression techniques may utilize the RMS value of the input signal to control compression, merely adding noise to a given voice signal may cause the gain to be reduced, making it difficult to hear the voice signal, as may occur, for instance, in the AMPS system.
Compression for instantaneous input samples can be determined, for example, using RMS measurements based on a plurality of past input samples. In particular, a compression can be chosen so that a logarithm of an RMS measurement of expected output samples is a function of a logarithm of RMS measurements of past input samples. For example, given a compression of 0.5, there could be a one decibel change in the output for every two decibel change in the input. Because the compression is determined based on RMS measurements of past input samples, however, clipping may result if instantaneous input values exceed an expected amplitude.
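The decibel relationship described above can be illustrated with a minimal Python sketch. It is not taken from any particular conventional compressor: the function names, the eight-sample history, and the unity-gain fallback for an all-zero history are assumptions made only for illustration. The gain is chosen so that the logarithm of the output level equals the compression (0.5 here) times the logarithm of the RMS level of past input samples.

```python
import math

def rms(samples):
    """Root-mean-square level of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def conventional_gain(past_samples, ratio=0.5):
    """Hypothetical helper: linear gain derived from the RMS of PAST samples.

    The gain is chosen so that log(output level) = ratio * log(input level),
    which gives gain = level ** (ratio - 1).
    """
    level = rms(past_samples)
    if level == 0.0:
        return 1.0  # assumption: unity gain when the history is all zeros
    return level ** (ratio - 1.0)

def level_db(level):
    """Level expressed in decibels relative to a level of 1."""
    return 20.0 * math.log10(level)

# Two illustrative histories whose RMS levels differ by 20 dB.
quiet_history = [10.0] * 8
loud_history = [100.0] * 8

quiet_out = rms(quiet_history) * conventional_gain(quiet_history)
loud_out = rms(loud_history) * conventional_gain(loud_history)

input_change_db = level_db(rms(loud_history)) - level_db(rms(quiet_history))
output_change_db = level_db(loud_out) - level_db(quiet_out)
# A 20 dB change at the input becomes a 10 dB change at the output.
```

Because the gain here depends only on the history, it says nothing about whether the next instantaneous sample will stay under the clipping level.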
Referring to FIG. 1, shown is a graph illustrating a relationship between the logarithm of an instantaneous system input and a logarithm of an instantaneous system output for a conventional compressor as discussed above. According to FIG. 1, compression can be performed in such a way that there is a linear relationship between logarithms of the input (log(abs(x))) and the output (log(abs(y))). The horizontal line (CL) represents the clipping level of the compressor. Any signal exceeding the clipping level may be clipped by the compressor.
The slope of the line (m1) remains one regardless of the magnitude of the audio input signal and clipping level of the compressor. Thus, to reduce clipping of the audio input signal, a compressor operating as illustrated in FIG. 1 may shift the value of y-intercept (b) downward. The value of the y-intercept (b1) is equal to the logarithm of the system gain. Thus, in these conventional compression techniques, the y-intercept (b) of the line can be adjusted to compress the input below a threshold clipping level (CL) while the slope (m1) of the line remains one.
Stated differently, when it appears probable (based on measurement of past inputs) that the product of the instantaneous input and gain may exceed the clipping level (CL), the compressor may reduce the gain to translate the line A downward along the Y axis, thereby mapping line A to line B. In turn, the intersection point between the line B and the line representing the clipping level (CL) changes such that the sum of the logarithm of the maximum input signal and the logarithm of the system gain (i.e., the logarithm of the output) may be less than or equal to the clipping level of the compressor. As a result, the input signal can be compressed such that the logarithm of the output will likely fall below the clipping level, thereby reducing the likelihood of distortion caused by clipping.
As discussed above, conventional compression techniques may determine a shift of the y-intercept (b) (e.g., gain) for the present input based on measurements of previous inputs. As a result, conventional compression techniques may cause clipping on the leading edges of the voice signal when the gain has not yet stabilized. These conventional techniques may also cause a phenomenon known as noise pumping.
According to conventional compression techniques, the uncompressed input/output function may be parallel to the compressed input/output function, and the y-intercept of the compressed input/output function may be merely translated along the y-axis to reduce the probability that the clipping limit of the compressor is exceeded by a particular output sample. However, when an input signal increases rapidly, for example, when the user speaks after a period of silence, then clipping may occur because initial compression for the speech is based on measurements of silence.
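The leading-edge problem can be sketched in Python as follows. This is an illustrative model only, not a description of any specific conventional compressor: the helper names, the 16-bit clipping level of 32767, the system gain of 4, and the sample values are all assumptions. The gain is chosen from past samples alone, so a speech onset that follows near-silence is multiplied by the full system gain and flat-topped.

```python
def past_based_gain(past_samples, system_gain, clip_level):
    """Hypothetical helper: pick a linear gain using only PAST samples."""
    past_peak = max((abs(s) for s in past_samples), default=0)
    if past_peak * system_gain <= clip_level:
        return system_gain            # history suggests no clipping risk
    return clip_level / past_peak     # shift the y-intercept (gain) downward

def hard_clip(value, clip_level):
    """Flat-top any value whose magnitude exceeds the clipping level."""
    return max(-clip_level, min(clip_level, value))

CLIP_LEVEL = 32767                    # assumed 16-bit clipping level
silence = [2, -3, 1, -2]              # past measurements: near-silence
onset = [12000, -15000, 9000]         # speech begins abruptly

# The gain is decided before the onset is seen, so it stays at the full value.
gain = past_based_gain(silence, system_gain=4.0, clip_level=CLIP_LEVEL)

# Every onset sample times the gain exceeds the clipping level and is clipped.
clipped = [hard_clip(s * gain, CLIP_LEVEL) for s in onset]
```

In this sketch all three onset samples end up at plus or minus the clipping level, which is the flat-topping that the background section describes.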
Conventional compression techniques may also provide a hold/release time after a peak where the linear gain remains low. Low-level speech that follows high-level speech may be amplified only as much as the high-level speech, and as a result, the low-level speech may not be sufficiently amplified. In addition, when fluctuating noise is added to the received voice signal, conventional compression techniques may reduce the gain based on the RMS level of the signal when noise is added into the audio signal. However, higher gain may be desired to understand the speech with the added noise.
Methods of compressing an audio signal are provided. According to one of these methods, input samples of the audio signal can be accepted, and these input samples include non-zero input samples. A logarithm of each of the non-zero input samples of the audio signal can be calculated, and compressed output samples for each non-zero input sample can be determined based on the logarithm of each respective non-zero input sample. A linear relationship may exist between logarithms of the non-zero input samples and logarithms of the corresponding compressed output samples. A logarithm of each compressed output sample, corresponding to a non-zero input sample, may be based on a product of a logarithm of each corresponding non-zero input sample and a compression factor.
The logarithm of each compressed output sample corresponding to a non-zero input sample can be based on the product of the logarithm of each corresponding non-zero input sample and the compression factor plus a logarithm of a system gain. The compression factor may be based on at least one of the logarithm of a clipping level, a logarithm of a system gain, and a logarithm of an absolute value of a peak non-zero input sample. More particularly, the compression factor can be based on a difference between the logarithm of the clipping level and the logarithm of the system gain, wherein the difference is divided by the logarithm of the absolute value of the peak input sample.
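The relationship described above can be written as log|y| = m * log|x| + log(G), with compression factor m = (log(CL) - log(G)) / log|x_peak|. The following Python sketch illustrates it under stated assumptions: the function names are hypothetical, the samples are integer-scaled with a peak magnitude greater than one (so that the logarithm of the peak is positive), and the clipping level of 32767 and system gain of 4 are example values only. The sketch does not address the case where the peak times the system gain is already below the clipping level.

```python
import math

def compression_factor(peak, system_gain, clip_level):
    """m = (log(CL) - log(G)) / log(|x_peak|).

    Assumes |peak| > 1 so that log(|peak|) is positive.
    """
    return ((math.log10(clip_level) - math.log10(system_gain))
            / math.log10(abs(peak)))

def compress_sample(x, m, system_gain):
    """log|y| = m * log|x| + log(G); the sign of x is preserved."""
    if x == 0:
        return 0.0  # the logarithm is only taken for non-zero samples
    log_y = m * math.log10(abs(x)) + math.log10(system_gain)
    return math.copysign(10.0 ** log_y, x)

CLIP_LEVEL = 32767.0   # assumed clipping level
SYSTEM_GAIN = 4.0      # assumed system gain
PEAK = 20000           # assumed peak input sample

m = compression_factor(PEAK, SYSTEM_GAIN, CLIP_LEVEL)

# The peak sample lands on the clipping level (up to floating-point rounding).
peak_out = compress_sample(PEAK, m, SYSTEM_GAIN)

# A low-amplitude sample keeps more of the system gain than the peak does.
soft_out = compress_sample(100, m, SYSTEM_GAIN)
```

With these example values m is slightly below one, so the line relating log|x| to log|y| is tilted rather than shifted: the peak maps onto the clipping level while small samples remain close to the full system gain.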
The peak input sample can be either one of the plurality of input samples for which a compressed output sample is calculated, or a peak input sample prior to the input samples for which compressed output samples are determined. The system gain can also be variable for each of the non-zero input samples. A measurement for each of the non-zero samples may be provided, and the step of determining the compression for each of the non-zero input samples may then be further based on the measurement for one of the samples. Preferably, one of the non-zero input samples having a peak absolute value is determined, such that the step of determining the compression for each of the non-zero samples is further based on the peak absolute value. The non-zero samples may be included in a frame of samples.
According to yet another method of compressing an audio signal, a frame including input samples of the audio signal can be accepted wherein the frame of input samples includes non-zero input samples. A measurement of one of the non-zero input samples of the frame can be provided, and compressed output samples can be determined for each non-zero input sample of the frame based on the measurement of one of the non-zero input samples. When the measurement of one of the non-zero input samples of the frame is provided, an absolute value of a peak non-zero input sample of the frame may be determined, and compressed output samples for each non-zero input sample of the frame can then be determined based on the absolute value of the peak non-zero input sample of the frame. A logarithm of the absolute value of the peak non-zero input sample of the frame can then be calculated.
Determining compressed output samples for each non-zero input sample of the frame may comprise determining compressed output samples for each non-zero input sample of the frame based on the logarithm of the absolute value of the peak non-zero input sample of the frame. A linear relationship may exist between logarithms of the non-zero input samples and logarithms of the corresponding compressed output samples. A logarithm of each compressed output sample corresponding to a non-zero input sample may be based on a product of a logarithm of each corresponding non-zero input sample and a compression factor plus a logarithm of the system gain. The compression factor may be based on the logarithm of the peak absolute value of the non-zero input sample of the frame. The system gain can be variable for each non-zero input sample. Providing a measurement of one of the non-zero input samples of the frame can include determining an absolute value of a peak product of each non-zero input sample of the frame and a corresponding variable system gain.
According to other methods of compressing an audio signal, input samples of the audio signal can be accepted wherein the input samples include non-zero input samples. An absolute value of a peak non-zero input sample can then be provided, and compressed output samples for at least a plurality of the non-zero input samples can be determined based on the absolute value of the peak non-zero input sample. The input samples may be included in a frame of input samples, and the absolute value of the peak non-zero input sample of the frame of input samples may be provided. A logarithm of the absolute value of the peak non-zero input sample may be calculated. Preferably, a linear relationship may exist between logarithms of the non-zero input samples and logarithms of the corresponding compressed output samples. A logarithm of each compressed output sample can be based on a product of a logarithm of a corresponding non-zero input sample and a compression factor plus a logarithm of a system gain. Again, the system gain can be variable for each non-zero input sample. The input samples may comprise at least two frames of input samples and the peak non-zero input sample can be from a first frame, while the compressed output samples may correspond to non-zero input samples of a second frame.
As discussed above, embodiments of the invention may include a compressor that looks at a present buffer of input samples and then computes a compression factor based on a peak input sample. The compressor can then compute the log of each input sample and use the peak value to calculate corresponding output samples. Compressors according to the present invention can thus use compression operations based on a buffer of samples to look at the future. In other words, a compression may be applied to each input sample such that the corresponding output samples do not exceed the clipping level.
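The buffer-based operation can be sketched in Python as follows. This is a minimal illustration and not the claimed implementation: the function name, the pass-through for frames whose peak magnitude is at most one, the cap of the compression factor at one (so that frames already below the clipping level receive plain linear gain), and the example values are all assumptions. The peak is taken from the same frame that is being compressed, so the compression factor already accounts for every sample it will be applied to.

```python
import math

def compress_frame(frame, system_gain, clip_level):
    """Hypothetical frame compressor: the peak of the PRESENT frame sets m."""
    peak = max((abs(x) for x in frame), default=0)
    if peak <= 1:
        # Assumption: log(peak) would be zero or undefined, so apply plain gain.
        return [float(x) * system_gain for x in frame]

    m = ((math.log10(clip_level) - math.log10(system_gain))
         / math.log10(peak))
    m = min(m, 1.0)  # assumption: never expand a frame that would not clip

    out = []
    for x in frame:
        if x == 0:
            out.append(0.0)  # zero samples are passed through unchanged
            continue
        log_y = m * math.log10(abs(x)) + math.log10(system_gain)
        out.append(math.copysign(10.0 ** log_y, x))
    return out

CLIP_LEVEL = 32767.0  # assumed clipping level

# Near-silence followed by an abrupt speech onset within one frame.
frame = [2, -3, 0, 12000, -15000, 9000]
compressed = compress_frame(frame, system_gain=4.0, clip_level=CLIP_LEVEL)
```

In this example the largest sample (-15000) is mapped onto the clipping level, up to floating-point rounding, and every other sample stays below it, even though the frame begins in near-silence.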
Compression methods of the present invention may also provide greater intelligibility than conventional compression techniques. One advantage of the logarithmic relationship between the compressed samples and the corresponding input samples is that low amplitude portions of the compressed audio signal may be relatively unaffected by compression, while high amplitude portions of the audio signal may be more significantly affected by compression. Stated differently, in conventional compression techniques, both soft and loud sounds may be attenuated equally regardless of magnitude. Low-level parts of speech may be more important, however, than high-level parts of speech for intelligibility purposes. According to the present invention, loud sounds (i.e., high amplitude portions of the audio signal) and soft sounds (i.e., low amplitude portions of the audio signal) may have different gains applied thereto. As a result, the sound generated may be loud yet clear.
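The amplitude-dependent gain follows from the log-linear relationship: if log|y| = m * log|x| + log(G), then the effective linear gain is |y| / |x| = G * |x|^(m - 1), which decreases as |x| grows whenever m is less than one. A short Python sketch, using a hypothetical function name and assumed example values (m = 0.9, G = 4):

```python
def effective_gain(x, m, system_gain):
    """Effective linear gain |y| / |x| = G * |x| ** (m - 1) for non-zero x."""
    return system_gain * abs(x) ** (m - 1.0)

M = 0.9            # assumed compression factor
SYSTEM_GAIN = 4.0  # assumed system gain

soft_gain = effective_gain(10, M, SYSTEM_GAIN)      # about 3.18
loud_gain = effective_gain(10000, M, SYSTEM_GAIN)   # about 1.59
```

Under these assumptions the soft sample retains most of the system gain while the loud sample is attenuated noticeably more, in contrast to a conventional gain shift that would scale both by the same factor.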
Compression methods according to the present invention may reduce hard-clipping of a peak input sample when a peak follows a long period of no speech or low-level speech. Moreover, pumping of background noise in the audio signal may be reduced. For example, when speech follows a long period of no speech, the background noise need not drop quickly at the onset of the speech and then slowly increase when the speech ends.