In recent years the topic of “loudness” has increasingly drawn the attention of broadcasting corporations.
Heretofore, the acoustical parameter that has been most commonly used to describe the volume of an audio signal has been the signal's maximum or peak audio level (maximum program peak). The maximum (peak) amplitude of a signal has been a popular measure because it is proportional to the maximum sound pressure level when the signal is played back. Those of ordinary skill in the art will appreciate, though, that such a measure may not properly characterize the perceived volume of the associated audio signal.
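By way of illustration only, the following minimal sketch shows why a peak measure can fail to characterize perceived volume. It compares two hypothetical test signals that share the same peak level but differ greatly in average energy. The RMS level is used here merely as a crude stand-in for loudness and is not the ITU-R BS.1770 measurement; the function names, sample rate and signals are assumptions chosen for the example.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """RMS level in dB; a crude proxy for loudness, not BS.1770."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

SAMPLE_RATE = 48000  # assumed sample rate for the example

# One second of a full-scale 1 kHz sine tone.
sine = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(SAMPLE_RATE)]
# One second of silence containing a single full-scale click.
click = [1.0 if i == 0 else 0.0 for i in range(SAMPLE_RATE)]

for name, signal in (("sine", sine), ("click", click)):
    print(f"{name}: peak {peak_db(signal):.1f} dB, rms {rms_db(signal):.1f} dB")
```

Both signals reach the same peak, yet their average levels differ by more than 40 dB, which is consistent with the observation that peak level alone is a poor indicator of perceived volume.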
The fact that, psycho-acoustically, music that is perceived to be louder gets more attention has been, and continues to be, increasingly exploited. This has predictably led to instances where music providers, radio stations, and producers seek to push their own musical content more into the foreground by increasing its volume. Thus, one negative side effect has been increased competition among radio stations, individual music titles, commercials, etc., to secure the highest loudness, which has led to degradation in the overall audio quality. This is because, among other things, the resulting strong fluctuations in loudness are unpleasant to a listener. Additionally, an increase in loudness at a constant maximum peak level demands dynamics compression, which, when used intensely, leads to pumping noise (e.g., gain pumping can occur after a regularly occurring high amplitude transient such as a kick drum), distortion and other sound artifacts.
Because of this, the ITU (International Telecommunication Union), the EBU (European Broadcasting Union) and the ATSC (Advanced Television Systems Committee, Inc.; ATSC A/85 is incorporated by reference herein) have published guidelines that are directed to standardizing methods for the determination of loudness and which provide the parameters/values for the distribution of audio programs for broadcast (ITU-R BS.1770 and EBU R128, attached hereto as Appendices A and B, respectively, and incorporated fully herein by reference as if set out at this point). In the USA the harmonization of loudness measurements has additionally been mandated by legislation (CALM Act) and via implementation of standard A/85. In coming years it is to be expected that these standards will be enforced worldwide and that commercial audio content will either be produced according to these standards or, in the case of existing audio content, be adapted to comply with same.
The change in the leveling paradigm from program peak normalization to program loudness normalization affects all stages of an audio broadcast signal, from production to distribution and transmission. The ultimate goal is to harmonize audio loudness levels to achieve an equal universal loudness level for the benefit of the listener. Program loudness normalization shall ensure that the average loudness of the whole program is the same for all programs, with the peaks varying depending on the content as well as on the artistic and technical needs. This does not mean that the loudness level must be constant and uniform at all times within a program. Nor does it mean that individual components of a program all have to be at the same loudness level. Instead, the average, integrated loudness of the whole program is normalized.
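The normalization described above can be sketched as a single static gain applied to the whole program. The example below is a minimal illustration only: it assumes that the integrated program loudness has already been measured elsewhere, and it assumes a target of -23 LUFS, which is the target level recommended in EBU R128. The function names are hypothetical.

```python
def normalization_gain_db(measured_lufs, target_lufs=-23.0):
    """Static gain (in dB) that moves the integrated loudness to the target."""
    return target_lufs - measured_lufs

def apply_gain(samples, gain_db):
    """Scale every sample by the same linear factor; dynamics are untouched."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

# A program measured at -17 LUFS must be attenuated by 6 dB.
gain = normalization_gain_db(-17.0)
quieter = apply_gain([0.5, -0.25, 0.1], gain)
```

Because one gain value is applied to the entire program, the loudness variation within the program is preserved and only its average level is shifted.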
Because of the required constant loudness, the typical reason for dynamics compression no longer applies. To control the dynamics scope in EBU R128, dynamics are specified in parallel with the loudness value via the measure Loudness Range, which is abbreviated as “LRA”. Loudness Range measures the variation of loudness on a macroscopic scale in units of “LU” (Loudness Units). Loudness Range quantifies the variation in a time-varying loudness measurement and is supplementary to the main audio measure Program Loudness of EBU R128. The computation of Loudness Range is based on a measurement of loudness level as specified in ITU-R BS.1770. The measure Loudness Range is used to help decide if and how much dynamic compression is needed (dependent on genre, target audience and transmission platform). In discussions about loudness, the impression is often created that the greatest possible dynamics automatically leads to better audio quality. However, the appropriate dynamics depend heavily on the listening situation and environment.
The computation of Loudness Range is based on the statistical distribution of measured program loudness, so that short, but very loud, events will not affect the Loudness Range of a longer program. The range of the distribution of loudness levels is determined by estimating the difference between a low and a high percentile of the loudness distribution. Loudness Range additionally employs a cascaded gating method to take into account types of programs that may be, overall, very consistent in loudness, but that have some sections with very low loudness. Without gating, such programs would incorrectly get quite a high Loudness Range measurement, due to the relatively large difference in loudness between the regions of background and those of normal foreground loudness.
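The computation outlined above can be sketched as follows. This is a simplified illustration and not a conformant meter: it assumes that a list of short-term loudness values (in LUFS) is already available, and it uses the gate thresholds and percentiles published in EBU Tech 3342 (absolute gate at -70 LUFS, relative gate 20 LU below the absolute-gated mean, 10th and 95th percentiles), none of which are stated in the text above. The nearest-rank percentile and all function names are assumptions made for the example.

```python
import math

ABS_GATE_LUFS = -70.0      # absolute gate (assumed, per EBU Tech 3342)
REL_GATE_LU = -20.0        # relative gate offset (assumed, per EBU Tech 3342)
LOW_PCT, HIGH_PCT = 10, 95  # percentiles bounding the range (assumed)

def energy_mean(levels):
    """Average loudness levels in the power domain and return LUFS."""
    power = sum(10 ** (level / 10) for level in levels) / len(levels)
    return 10 * math.log10(power)

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    index = round((len(sorted_values) - 1) * pct / 100)
    return sorted_values[index]

def loudness_range(short_term_lufs):
    """Estimate the Loudness Range (in LU) with cascaded gating."""
    # Stage 1: the absolute gate discards silence and near-silence.
    abs_gated = [l for l in short_term_lufs if l >= ABS_GATE_LUFS]
    if not abs_gated:
        return 0.0
    # Stage 2: the relative gate discards background-level passages.
    threshold = energy_mean(abs_gated) + REL_GATE_LU
    gated = sorted(l for l in abs_gated if l >= threshold)
    if not gated:
        return 0.0
    # The range is the spread between a high and a low percentile.
    return percentile(gated, HIGH_PCT) - percentile(gated, LOW_PCT)

# A consistent program with some very quiet sections.
consistent = [-23.0] * 80 + [-60.0] * 20
# A consistent program with one short, very loud event.
loud_event = [-23.0] * 99 + [-5.0]
print(loudness_range(consistent), loudness_range(loud_event))
```

In the first example the quiet sections fall below the relative gate and are excluded, so the range is not inflated by the difference between background and foreground levels. In the second, the single loud event lies above the upper percentile and therefore does not affect the result.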
The situations and environments in which audio is being listened to can vary widely. For example, consider the differences in listening environments such as the movie theatre, the home theatre, the living room, the kitchen, late at night at home, walking along a street, in the car and in an airplane. As can be readily seen by reference to the foregoing, the locations that involve the most challenging listening environments (e.g., via mp3 player, in a car, train, airplane, stores, etc.) are also the locations where audio tends to be regularly consumed. This is reasonable because in these sorts of locations often the user is occupied with visual tasks (e.g., driving) so the use of audio alone may be preferred. Therefore, in such situations it would be important to adapt audio content to the lower dynamic complexity in these environments. On the other hand it is not desirable to completely eliminate dynamics and the associated higher sound quality in quieter environments.
The EBU is proposing the use of a compressor (EBU R128 3343) to limit dynamics. A compressor in general, however, has the disadvantage that its time constants are not definable independent from the content of the program. Furthermore, its functionality can only be utilized in a delayed fashion in systems with low latency (e.g. radio). The time constants may need to be relatively long (e.g., in the case of a leveler rather than a compressor). Both approaches can lead to a result which in itself is not optimal, because artifacts created by the compressor are noticeable to the listener.
Furthermore, the Dolby AC-3 codec includes compression presets that cause the encoder to generate different gain control words that are sent in the bitstream to the consumer's decoder: e.g., Film Standard, Film Light, Music Standard, Music Light, Speech and None. The transmission of gain-words is applied to reduce the dynamic range of the signal either by default or after user activation. This approach provides generic dynamic range compression curves that are to be applied to individual audio programs, without consideration of the audio program itself and its listening environment.
Thus, what is needed is a method of adjusting the loudness range of an audio program in response to environmental/background noise level changes that provides less “pumping”, offline processing, an option to adapt to the surrounding noise levels, scalability, an approach that is readily and easily implementable, and an approach that is directly tuned to the particulars of the audio content of the program that is being heard.
Heretofore, as is well known in the media editing industry, there has been a need for an invention to address and solve the above-described problems. Accordingly it should now be recognized, as was recognized by the present inventors, that there exists, and has existed for some time, a very real need for a system and method that would address and solve the above-described problems.
Before proceeding to a description of the present invention, however, it should be noted and remembered that the description of the invention which follows, together with the accompanying drawings, should not be construed as limiting the invention to the examples (or preferred embodiments) shown and described. This is so because those skilled in the art to which the invention pertains will be able to devise other forms of the invention within the ambit of the appended claims.