In audio signal processing, generating a time-varying measure of the audio signal level is often necessary. (Here the term “level” generically refers to a level measure such as peak level, rms level, loudness level, etc.) For example, a loudness meter may display a time-varying measure of an audio signal's perceptual loudness, where this measure is smoothed significantly in order to indicate the average loudness over the past several seconds. In another example, an Automatic Gain Control (AGC) process may compute a highly smoothed time-varying measure of an audio signal's level and then use the resulting measure to generate a slowly varying gain which, when applied to the audio signal, automatically moves the average level of the audio closer to a desired target level.
In these two of many examples, the smoothed level measure may be computed by applying some form of smoothing filter to a short-term level measure. (“Short-term” means computed over a time interval significantly shorter than the interval over which the subsequent smoothing is effective.) For example, one might compute the rms level of the signal or the perceptual loudness level, as described in the WO 2004/111994 A2 application, over an interval of tens of milliseconds to generate the short-term level. The subsequent smoothing of this short-term level might then involve time constants on the order of several seconds. In the following discussion, this time-varying short-term level measure is represented as the signal L[t], and the subsequently smoothed level measure is represented as L̄[t], where t represents the discrete time index.
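As an illustration of the short-term level L[t], the following minimal sketch computes an rms level in decibels over non-overlapping blocks. The function name, the 20 ms default block length, the non-overlapping blocks, and the -120 dB floor for silent blocks are assumptions made for this example; the text only states that the interval is on the order of tens of milliseconds.

```python
import math


def short_term_rms_db(samples, sample_rate, block_ms=20.0, floor_db=-120.0):
    """Return one rms level (in dB re full scale) per non-overlapping block.

    block_ms and floor_db are illustrative assumptions, not values from the text.
    """
    block_len = max(1, int(round(sample_rate * block_ms / 1000.0)))
    levels = []
    # Step through the signal one complete block at a time; a trailing
    # partial block is ignored.
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        mean_square = sum(x * x for x in block) / block_len
        if mean_square > 0.0:
            level = max(10.0 * math.log10(mean_square), floor_db)
        else:
            level = floor_db  # avoid log10(0) for digital silence
        levels.append(level)
    return levels
```

Each element of the returned list plays the role of one value of L[t], so the time index t advances once per block rather than once per audio sample.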
Many different types of smoothing filters may be applied to L[t] to generate L̄[t]. One might use a finite impulse response (FIR) filter or a multi-pole infinite impulse response (IIR) filter. The particular filter employed is not critical. For illustrative purposes one may consider the commonly used fast-attack/slow-release single-pole IIR smoother. With such a filter, the smoothed level measure L̄[t] may be updated according to the equation:
$$
\bar{L}[t] =
\begin{cases}
\alpha_{attack}\,\bar{L}[t-1] + (1-\alpha_{attack})\,L[t], & L[t]-\bar{L}[t-1] > 0 \\
\alpha_{release}\,\bar{L}[t-1] + (1-\alpha_{release})\,L[t], & L[t]-\bar{L}[t-1] \le 0
\end{cases}
\tag{1}
$$
The smoothing coefficients α_attack and α_release may be chosen such that α_attack < α_release. This means that L̄[t] tracks L[t] more quickly when L[t] is increasing (attack) than when L[t] is decreasing (release). For an AGC, one might, for example, choose α_attack corresponding to a time constant of one second and α_release corresponding to a time constant of four seconds. This way, L̄[t] varies quite slowly over time, and as a result, the corresponding gain which modifies the audio also varies slowly, thereby maintaining the short-term dynamics of the original audio.
Problems may arise, however, when using such large time constants. Suppose that such an AGC is operating on the audio of a television set with the intent of maintaining a consistent average level across programming and across various channels. In such a situation, the content of the audio signal being processed by the AGC may abruptly change, for example when a channel is changed, and the associated average level of the audio signal may therefore also abruptly change. With its large time constants, however, the AGC takes a considerable amount of time to converge to a new level and bring the modified level of the processed audio in line with the desired target level. During such adaptation time, a viewer of the television may perceive the level of the audio to be too loud or too soft. As a result, the viewer may quickly reach for the remote control to adjust the volume, only to find himself or herself fighting the AGC as it converges.
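The update of Equation 1 may be sketched in a few lines. The mapping from a time constant to a coefficient, alpha = exp(-1 / (tau × update rate)), is a common convention assumed here for illustration and is not specified in the text; the function names and the choice to start the smoother at the first input value are likewise assumptions.

```python
import math


def coeff_from_time_constant(tau_seconds, update_rate_hz):
    """Assumed mapping from a time constant to a single-pole coefficient."""
    return math.exp(-1.0 / (tau_seconds * update_rate_hz))


def smooth_attack_release(levels, alpha_attack, alpha_release, initial=None):
    """Fast-attack/slow-release single-pole smoother following Equation 1."""
    if not levels:
        return []
    # Assumption: with no explicit initial state, start at the first input.
    prev = levels[0] if initial is None else initial
    smoothed = []
    for level in levels:
        # Attack when the short-term level exceeds the previous smoothed level.
        alpha = alpha_attack if level - prev > 0 else alpha_release
        prev = alpha * prev + (1.0 - alpha) * level
        smoothed.append(prev)
    return smoothed
```

With short-term levels produced 50 times per second, for example, `coeff_from_time_constant(1.0, 50)` and `coeff_from_time_constant(4.0, 50)` give coefficients for the one-second attack and four-second release mentioned above, and the attack coefficient is the smaller of the two.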
The prior art typically solves the problem just described by using time constants that adapt based on the relationship of the short-term level L[t] to the smoothed level L̄[t]. For example, if the short-term level of the signal is significantly greater or less than the smoothed level as defined by some threshold boundaries around the smoothed level, then the smoothing operation switches to faster attack and/or release time constants, respectively, until the short-term level falls back within the threshold boundaries around the smoothed level. Subsequently, the system switches back to the original slower time constants. Equation 1 may be modified to implement this more sophisticated smoothing technique by including four cases rather than two:
$$
\bar{L}[t] =
\begin{cases}
\alpha_{attackFast}\,\bar{L}[t-1] + (1-\alpha_{attackFast})\,L[t], & L[t]-\bar{L}[t-1] > \Delta L_{fast} \\
\alpha_{attack}\,\bar{L}[t-1] + (1-\alpha_{attack})\,L[t], & 0 < L[t]-\bar{L}[t-1] \le \Delta L_{fast} \\
\alpha_{release}\,\bar{L}[t-1] + (1-\alpha_{release})\,L[t], & -\Delta L_{fast} \le L[t]-\bar{L}[t-1] \le 0 \\
\alpha_{releaseFast}\,\bar{L}[t-1] + (1-\alpha_{releaseFast})\,L[t], & L[t]-\bar{L}[t-1] < -\Delta L_{fast}
\end{cases}
\tag{2}
$$
In Equation 2, α_attackFast < α_attack and α_releaseFast < α_release, meaning that α_attackFast and α_releaseFast correspond to time constants faster than those of α_attack and α_release, respectively. If α_attack and α_release correspond to time constants of 1 and 4 seconds, respectively, then α_attackFast and α_releaseFast may be chosen, for example, corresponding to time constants of 0.1 and 0.4 seconds, respectively (10 times faster). The fast time constant threshold ΔL_fast must be chosen judiciously so that switching to these faster time constants does not occur too often and result in unwanted instability of the smoothed level L̄[t]. If, for example, the level measures L[t] and L̄[t] represent rms level in units of decibels, one might set ΔL_fast at 10 dB, an approximate doubling of perceived loudness.
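The four-case update of Equation 2 may be sketched as follows. The function names, the exponential mapping from time constant to coefficient, and the start-up behavior are assumptions made for this illustration; the case selection itself follows the four conditions of the equation.

```python
import math


def coeff_from_time_constant(tau_seconds, update_rate_hz):
    """Assumed mapping from a time constant to a single-pole coefficient."""
    return math.exp(-1.0 / (tau_seconds * update_rate_hz))


def smooth_fast_slow(levels, alpha_attack, alpha_release,
                     alpha_attack_fast, alpha_release_fast,
                     delta_fast_db, initial=None):
    """Threshold-switched smoother following the four cases of Equation 2."""
    if not levels:
        return []
    # Assumption: with no explicit initial state, start at the first input.
    prev = levels[0] if initial is None else initial
    smoothed = []
    for level in levels:
        diff = level - prev
        if diff > delta_fast_db:
            alpha = alpha_attack_fast      # far above the smoothed level
        elif diff > 0:
            alpha = alpha_attack           # slightly above
        elif diff >= -delta_fast_db:
            alpha = alpha_release          # slightly below (or equal)
        else:
            alpha = alpha_release_fast     # far below
        prev = alpha * prev + (1.0 - alpha) * level
        smoothed.append(prev)
    return smoothed
```

Passing `float("inf")` as `delta_fast_db` disables the two fast cases, which reduces the sketch to the behavior of Equation 1 and gives a convenient baseline for comparison.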
Though an improvement over the smoothing in Equation 1, the smoothing of Equation 2 may still perform sub-optimally for many signals. In general, for any reasonable threshold ΔL_fast, signals may exist for which the original desired dynamics of the short-term level L[t] fluctuate outside of the threshold boundaries around the average level L̄[t], thus causing the smoothing process to falsely switch into the fast attack or release mode.
To better understand cases for which the smoothing of Equation 2 performs as desired and for which it performs inadequately, one can imagine the distribution of the short-term level L[t] over time. One may imagine this distribution as a time-varying probability density which predicts the probability of encountering any particular value of short-term level L within an interval of time around the current time index t. The duration of this interval should be commensurate with the slower set of time constants used in the smoothing filter of Equation 2.
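One simple way to picture such a time-varying distribution is a normalized histogram of recent short-term levels. The sketch below is only an illustration: the function name, the 1 dB bin width, and the use of a trailing window ending at t (rather than a window centered on t) are assumptions, not part of the description above.

```python
import math


def level_histogram(levels, t, window, bin_width_db=1.0):
    """Normalized histogram of levels in the trailing window ending at index t.

    Returns a dict mapping each bin's lower edge (in dB) to the fraction of
    windowed levels falling in that bin.
    """
    start = max(0, t - window + 1)
    recent = levels[start:t + 1]
    counts = {}
    for level in recent:
        edge = math.floor(level / bin_width_db) * bin_width_db
        counts[edge] = counts.get(edge, 0) + 1
    total = len(recent)
    return {edge: n / total for edge, n in sorted(counts.items())}
```

For a window length matching the slower time constants (for example, a few seconds' worth of level values), a signal that jumps from one average level to another produces two separate clusters of bins while the window straddles the jump.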
Now consider the behavior of this probability density for the television-channel-change example described earlier. Assuming that the dynamic range of the short-term level for a given channel is somewhat limited, the probability density function of the short-term level L[t] takes the form of a fairly narrow hump positioned around the smoothed level L̄[t]. When the channel changes, and assuming that the average level of the new channel is significantly higher than the original, the probability density function will begin to change: the original hump decreases as a new hump grows positioned around the higher average level of the new channel.
FIG. 1 depicts such a probability density function at the beginning of the described transition. In this figure, the horizontal axis represents level and the vertical axis represents probability. The solid line represents the probability density of the short-term level at the beginning of the transition. One notes the decreasing hump on the left, representing the decreasing probability associated with the old channel selection, and the increasing hump on the right, representing the increasing probability associated with the new channel selection. At the beginning of the transition, the smoothed level L̄[t−1] still falls within the hump of the old channel selection, while the short-term level L[t] falls within the hump of the new channel. In this diagram, the short-term level L[t] is larger than L̄[t−1] by an amount greater than ΔL_fast, and therefore, according to Equation 2, a fast time constant is used to update L̄[t] towards L[t]. This is the desired effect: the smoothed level L̄[t] adapts quickly to the higher level of the newly selected channel, traveling quickly across the gap dividing the two humps of the probability density.
FIG. 2 depicts the probability density of the short-term level for a very different audio signal. In this case, the original dynamics of the signal are comparatively large, and therefore the hump of the probability density is spread quite wide. Such dynamics may be typical in a high-quality recording of jazz or classical music. Also in FIG. 2, the relationship between L̄[t−1] and L[t] is exactly the same as in FIG. 1, but now both values lie within the main hump of the probability density. Therefore, the switch to a fast time constant is not desired because this relation between L̄[t−1] and L[t] is part of the typical dynamics of the signal. In this case, the smoothing described by Equation 2 is not appropriate.
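The contrast between the two figures can be illustrated numerically by counting how often the fast cases of Equation 2 are selected for a signal with narrow dynamics and for one with wide dynamics, neither of which changes its average level. The function name, the coefficient values (roughly matching time constants of 1, 4, 0.1, and 0.4 seconds at an assumed 50 level updates per second), and the synthetic test signals are all assumptions made for this sketch.

```python
def fast_mode_fraction(levels, initial, delta_fast_db,
                       alpha_slow=(0.98, 0.995), alpha_fast=(0.82, 0.95)):
    """Fraction of updates for which Equation 2 selects a fast coefficient.

    alpha_slow and alpha_fast are (attack, release) pairs; the default
    values are illustrative assumptions.
    """
    if not levels:
        return 0.0
    prev = initial
    fast_updates = 0
    for level in levels:
        diff = level - prev
        is_fast = abs(diff) > delta_fast_db   # outside the threshold boundaries
        pair = alpha_fast if is_fast else alpha_slow
        alpha = pair[0] if diff > 0 else pair[1]
        prev = alpha * prev + (1.0 - alpha) * level
        fast_updates += 1 if is_fast else 0
    return fast_updates / len(levels)


# Synthetic short-term levels (dB), both centered on -25 dB:
# narrow dynamics swing by +/-3 dB, wide dynamics by +/-12 dB.
narrow = ([-22.0] * 25 + [-28.0] * 25) * 10
wide = ([-13.0] * 25 + [-37.0] * 25) * 10
```

With a 10 dB threshold and an initial smoothed level of -25 dB, the narrow signal never triggers the fast cases, whereas the wide signal does so even though its average level is unchanged, which is the false switching described for FIG. 2.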