An object of audio dynamics processing is to alter the level or dynamics of an audio signal to be within some desired limits. This is generally achieved by creating a time-varying measure of an audio signal's level (rms level or peak level, for example) and then computing and applying a time-varying signal modification (a gain change, for example) that is a function of the level estimate. Dynamics processors employing such a mode of operation include automatic gain controls (AGCs), dynamic range controls (DRCs), expanders, limiters, noise gates, etc. Various types of signal dynamics processing are set forth in International Patent Application PCT/US 2005/038579 of Alan Jeffrey Seefeldt, published as WO 2006/047600 on May 4, 2006. The application designates the United States among other entities. The application is hereby incorporated by reference in its entirety.
FIG. 1 depicts a high level block diagram of a generic audio dynamics processor. The processor may be considered to have two paths, an upper “signal” path 2 and a lower “control” path 4. On the lower path, a dynamics control process or controller (“Dynamics Control”) 6 measures the level of the audio signal and generates one or more time-varying modification parameters as a function of the level measure. As shown, the modification parameters are derived from the input audio signal. Alternatively, the modification parameters may be derived from the processed (output) audio or from a combination of the input and output audio signals. On the upper audio path 2, the modification parameters generated by the Dynamics Control 6 are applied to the audio to generate the processed audio. The application of modification parameters to an audio signal may be accomplished in many known ways and is shown generically by the multiplier symbol 8. For example, in the case of a simple automatic gain control device or process, there may be a single wideband gain modification parameter that controls the gain of a variable gain/loss device or process in the main path. In practice, the audio may also be delayed prior to the application of the modification parameters in order to compensate for any delay associated with the computation of the modification parameters in the dynamics control process. For simplicity in presentation, a delay is not shown in FIG. 1 or other figures herein.
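The two-path arrangement of FIG. 1 may be illustrated by the following minimal sketch. It assumes a block-based rms level measure and a single wideband gain; the function names, the target level, the gain limit, and the block size are hypothetical and chosen only for illustration. Smoothing of the gain across time and any compensating delay are deliberately omitted.

```python
import math


def rms_level_db(block, floor_db=-120.0):
    """Return the rms level of a block of samples in dB re full scale."""
    if not block:
        return floor_db
    mean_square = sum(x * x for x in block) / len(block)
    if mean_square <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(mean_square))


def dynamics_control(block, target_db=-20.0, max_gain_db=20.0):
    """Control path: measure the level and derive one wideband linear gain.

    target_db and max_gain_db are illustrative assumptions, not values
    taken from the text.
    """
    level_db = rms_level_db(block)
    gain_db = target_db - level_db
    # Limit the gain so that silence is not boosted without bound.
    gain_db = max(-max_gain_db, min(max_gain_db, gain_db))
    return 10.0 ** (gain_db / 20.0)


def process(samples, block_size=256):
    """Signal path: apply the control-path gain to each block (multiplier 8)."""
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        gain = dynamics_control(block)
        out.extend(gain * x for x in block)
    return out
```

In this sketch the modification parameter is derived from the input audio, corresponding to the arrangement shown in FIG. 1; deriving it from the output audio instead would require the control path to read the previously processed block.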
In a dynamics control process, it is typical that both the signal level measure and the resulting modification parameters are computed continuously over time. In addition, either or both of the signal level measure and the modification parameters are usually smoothed across time to minimize the introduction of perceptible artifacts into the processed audio. The smoothing is most often performed using a “fast attack” and a “slow release”, meaning that the modification parameters change relatively quickly in response to an increase in the signal level and respond more slowly as the signal level decreases. Such smoothing is in accordance with the dynamics of natural sounds and the way in which humans perceive changes in loudness over time. Consequently, such time smoothing is nearly universal in audio dynamics processors.
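One common way to realize fast-attack, slow-release smoothing is a one-pole recursive filter whose coefficient depends on whether the input is rising or falling. The sketch below assumes that form; the function names and the default time constants are hypothetical, and the text does not prescribe any particular filter.

```python
import math


def smoothing_coeff(time_constant_s, sample_rate_hz):
    """One-pole coefficient for a given time constant (assumed filter form)."""
    return math.exp(-1.0 / (time_constant_s * sample_rate_hz))


def smooth_level(levels, sample_rate_hz, attack_s=0.01, release_s=0.5):
    """Smooth a sequence of level values with fast attack and slow release.

    attack_s and release_s are illustrative defaults only.
    """
    if not levels:
        return []
    a_attack = smoothing_coeff(attack_s, sample_rate_hz)
    a_release = smoothing_coeff(release_s, sample_rate_hz)
    state = levels[0]
    smoothed = []
    for x in levels:
        # Rising input uses the short (attack) time constant,
        # falling input uses the long (release) time constant.
        a = a_attack if x > state else a_release
        state = a * state + (1.0 - a) * x
        smoothed.append(state)
    return smoothed
```

With these assumed defaults, the smoothed value follows an upward step within a few hundredths of a second but decays over roughly half a second when the input drops.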
For some dynamics processing applications, the time constants associated with such smoothing may be quite large, on the order of one or more seconds. An AGC, for instance, may compute an estimate of the long-term average level of a signal using large time constants and then use the resulting estimate to generate slowly varying modification parameters that move the average level of the audio closer to a desired target level. In this case, large time constants may be desirable in order to preserve the short-term dynamics of the audio signal. Suppose that such an AGC is operating on the audio of a television set with the intent of maintaining a consistent average level across programming and across various channels. In such a situation, the content of the audio signal being processed by the AGC may abruptly change or have a discontinuity, when a channel is changed, for example, and the associated average level of the audio signal may therefore also abruptly change or have a discontinuity. With its large time constants, however, the AGC takes a considerable amount of time to converge to a new level and bring the modified level of the processed audio in line with the desired target level. During such adaptation time, a viewer of the television may perceive the level of the audio to be too loud or too soft. As a result, the viewer may quickly reach for the remote control to adjust the volume, only to find himself or herself fighting the AGC as it converges.
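The length of the adaptation time can be estimated under a simplifying assumption that is not stated in the text: the AGC smooths its level estimate in the dB domain with a single one-pole filter of time constant tau. After an abrupt level change, the remaining error then decays exponentially, and the time to come within a given tolerance follows in closed form. The function names and the numeric values below are hypothetical.

```python
import math


def settling_time_s(time_constant_s, step_db, tolerance_db):
    """Closed-form time for an exponentially decaying error to reach tolerance.

    Assumes error(t) = step_db * exp(-t / time_constant_s).
    """
    if abs(step_db) <= tolerance_db:
        return 0.0
    return time_constant_s * math.log(abs(step_db) / tolerance_db)


def simulate_settling_time_s(time_constant_s, step_db, tolerance_db,
                             update_rate_hz=100.0):
    """Count one-pole smoother updates until the error is within tolerance."""
    a = math.exp(-1.0 / (time_constant_s * update_rate_hz))
    error_db = abs(step_db)
    updates = 0
    while error_db > tolerance_db:
        error_db *= a
        updates += 1
    return updates / update_rate_hz
```

For example, with an assumed time constant of 2 seconds and a 12 dB level difference between channels, `settling_time_s(2.0, 12.0, 1.0)` evaluates to 2 ln(12), or roughly 5 seconds, before the level estimate is within 1 dB of the new average level.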
A typical prior art solution to the problem just described involves using time constants that adapt based on the dynamics of the signal. For example, if the short-term level of the signal is significantly greater or less than the smoothed level, as defined by some threshold boundaries around the smoothed level, then the smoothing operation switches to faster attack and/or release time constants, respectively, until the short-term level falls back within the threshold boundaries around the smoothed level. Subsequently, the system switches back to the original slower time constants. Such a system may reduce the adaptation time of the AGC, but the thresholds and shorter time constants must be chosen carefully. In general, for any reasonable thresholds, signals may exist in which the original desired signal dynamics fluctuate outside of the threshold boundaries around the average level, thus causing the smoothing process to falsely switch into the fast attack or release mode. Because such false switching may occur frequently, the fast attack and release mode time constants must not be chosen too short, in order to avoid instability of the AGC during normal program material. As a result, the convergence of the AGC during abrupt transitions or discontinuities in the audio content may still not be as fast as desired.
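The threshold-switched smoothing described above may be sketched as follows. The sketch assumes a one-pole smoother operating on short-term levels expressed in dB, with a symmetric threshold around the smoothed level; the function names, the threshold, and all time constants are hypothetical values chosen for illustration.

```python
import math


def coeff(time_constant_s, rate_hz):
    """One-pole coefficient for a given time constant (assumed filter form)."""
    return math.exp(-1.0 / (time_constant_s * rate_hz))


def adaptive_smooth(levels_db, rate_hz, threshold_db=10.0,
                    slow=(1.0, 4.0), fast=(0.1, 0.4)):
    """Smooth levels, switching to faster time constants outside a threshold.

    slow and fast are (attack_s, release_s) pairs; all values are
    illustrative assumptions.
    """
    if not levels_db:
        return []
    slow_a = [coeff(t, rate_hz) for t in slow]
    fast_a = [coeff(t, rate_hz) for t in fast]
    state = levels_db[0]
    out = []
    for x in levels_db:
        # Fast mode applies only while the short-term level lies outside
        # the threshold boundaries around the smoothed level.
        outside = abs(x - state) > threshold_db
        attack_a, release_a = fast_a if outside else slow_a
        a = attack_a if x > state else release_a
        state = a * state + (1.0 - a) * x
        out.append(state)
    return out
```

In this sketch the smoother returns to the slow time constants as soon as the error falls back inside the threshold, so the final portion of the convergence still proceeds at the slow rate. Program material whose normal dynamics exceed `threshold_db` would also trigger the fast mode, which is the false switching noted above.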
It is therefore the object of the present invention to provide a better solution to the problem of dynamics processing adaptation time during audio content changes.