Programs, such as those intended for television broadcast, are in many cases intentionally produced with variable loudness and wide dynamic range to convey emotion or a level of excitement in a given scene. For example, a movie may include one scene with the subtle chirping of a cricket and another with the blast of a cannon firing. Interstitial material such as commercial advertisements, on the other hand, is very often intended to convey a coherent message and is thus often produced at a constant loudness, with a narrow dynamic range, or both. Annoying loudness disturbances commonly occur at the point of transition between the programming and the interstitial material. Thus the problem is commonly known as the “loud commercial problem.” Loudness annoyances, however, are not limited to the transition between programming and interstitial material; they are also pervasive within the programming and the interstitial material themselves.
Intelligibility issues arise when a component of the audio that is important for comprehension of the programming, also known as an anchor, is made inaudible or is overpowered by another component of the audio. Dialog is arguably the most common program anchor. An example is the television broadcast of a tennis match. A commentator narrates the action on the court while, at the same time, noise from the crowd and the competitors may be heard. If the crowd noise overpowers the narrator's voice, the narration may be rendered unintelligible.
Processes addressing the loud commercial problem and intelligibility issues generally attempt to measure loudness and use this measurement to adjust audio signals accordingly to improve loudness and intelligibility. Conventional techniques for measuring loudness, however, may be unsatisfactory.
One technique for measuring loudness, disclosed in U.S. Pat. No. 7,454,331 to Vinton et al., which is incorporated by reference herein in its entirety, measures the speech component of the audio exclusively to determine program loudness. This technique, however, may provide insufficient loudness measurement for programming that includes only minimal speech components. For programming that includes no speech components at all, loudness may remain unmeasured and thus unimproved.
Another conventional technique, in essence, measures loudness by measuring whatever component of the audio is the loudest for the longest period of time. This technique, however, may provide measurements that deviate from the intent of the programming or from human perception of loudness. This may be particularly true for programming that has wide dynamic range. For example, this technique may erroneously judge a scene that contains the roaring sound of a jet flying overhead to be too loud. This measurement may result in processing or adjustment of the audio program that, for example, lowers speech components of the audio to unintelligible levels.
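The jet example can be made concrete with a small numerical sketch. It does not reproduce any particular conventional technique; it assumes, purely for illustration, a duration-weighted energy average as the loudness measurement, and the segment durations, RMS amplitudes, and target level are all hypothetical values chosen for the example.

```python
import math

def level_db(rms):
    """Convert a linear RMS amplitude to decibels relative to full scale."""
    return 20.0 * math.log10(rms)

def energy_average_db(segments):
    """Duration-weighted mean-square level, in dB, over (duration_s, rms) pairs.

    This is an assumed stand-in for a measurement that is dominated by
    the loudest sustained component of the audio.
    """
    total_duration = sum(duration for duration, _ in segments)
    mean_square = sum(duration * rms * rms for duration, rms in segments) / total_duration
    return 10.0 * math.log10(mean_square)

# Hypothetical scene: 8 s of quiet dialog followed by 2 s of a jet overhead.
DIALOG_RMS = 0.05
JET_RMS = 0.5
scene = [(8.0, DIALOG_RMS), (2.0, JET_RMS)]

TARGET_DB = -24.0  # hypothetical target level for the whole scene

measured_db = energy_average_db(scene)   # dominated by the jet segment
gain_db = TARGET_DB - measured_db        # static gain applied to the whole scene

dialog_before_db = level_db(DIALOG_RMS)
dialog_after_db = dialog_before_db + gain_db

print(f"measured scene level: {measured_db:.2f} dB")
print(f"applied gain:         {gain_db:.2f} dB")
print(f"dialog level before:  {dialog_before_db:.2f} dB")
print(f"dialog level after:   {dialog_after_db:.2f} dB")
```

Under these assumed numbers the dialog starts within a few decibels of the target, yet the jet dominates the measurement and the resulting gain pushes the dialog down by about 11 dB, which is the kind of outcome the paragraph above describes.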