Audiovisual entertainment has evolved into a fast-paced sequence of dialog, narrative, music, and effects. The high realism achievable with modern entertainment audio technologies and production methods has encouraged the use of conversational speaking styles on television that differ substantially from the clearly enunciated, stage-like presentation of the past. This situation poses a problem not only for the growing population of elderly viewers who, faced with diminished sensory and language processing abilities, must strain to follow the programming, but also for persons with normal hearing, for example when listening at low acoustic levels.
How well speech is understood depends on several factors. Examples are the care with which the speech is produced (clear or conversational speech), the speaking rate, and the audibility of the speech. Spoken language is remarkably robust and can be understood under less-than-ideal conditions. For example, hearing-impaired listeners typically can follow clear speech even when they cannot hear parts of the speech due to diminished hearing acuity. However, as the speaking rate increases and speech production becomes less accurate, listening and comprehending require increasing effort, particularly if parts of the speech spectrum are inaudible.
Because television audiences can do nothing to affect the clarity of the broadcast speech, hearing-impaired listeners may try to compensate for inadequate audibility by increasing the listening volume. Aside from being objectionable to normal-hearing people in the same room or to neighbors, this approach is only partially effective. This is so because most hearing losses are non-uniform across frequency; they affect high frequencies more than low- and mid-frequencies. For example, a typical 70-year-old male's ability to hear sounds at 6 kHz is about 50 dB worse than that of a young person, but at frequencies below 1 kHz the older person's hearing disadvantage is less than 10 dB (ISO 7029, Acoustics—Statistical distribution of hearing thresholds as a function of age). Increasing the volume makes low- and mid-frequency sounds louder without significantly increasing their contribution to intelligibility because for those frequencies audibility is already adequate. Increasing the volume also does little to overcome the significant hearing loss at high frequencies. A more appropriate correction is a tone control, such as that provided by a graphic equalizer.
Although a better option than simply increasing the volume, a tone control is still insufficient for most hearing losses. The large high-frequency gain required to make soft passages audible to the hearing-impaired listener is likely to be uncomfortably loud during high-level passages and may even overload the audio reproduction chain. A better solution is to amplify depending on the level of the signal, providing larger gains to low-level signal portions and smaller gains (or no gain at all) to high-level portions. Such systems, known as automatic gain controls (AGC) or dynamic range compressors (DRC), are used in hearing aids, and their use to improve intelligibility for the hearing impaired in telecommunication systems has been proposed (e.g., U.S. Pat. No. 5,388,185, U.S. Pat. No. 5,539,806, and U.S. Pat. No. 6,061,431).
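The level-dependent amplification described above can be illustrated with a minimal Python sketch. It is not taken from the cited patents. The function names, the block-wise RMS level estimate, and the parameter values (30 dB maximum gain, a knee at -60 dB, a 2:1 ratio) are assumptions chosen only for illustration. A practical system would typically also smooth the gain over time and might apply different gains in different frequency bands.

```python
import math


def level_dependent_gain_db(level_db, max_gain_db=30.0, knee_db=-60.0, ratio=2.0):
    """Return a gain in dB that shrinks as the input level rises.

    All parameter values are illustrative assumptions, not values from the source.
    At or below knee_db the full max_gain_db is applied. Above the knee the gain
    falls by (1 - 1/ratio) dB per dB of input level and never drops below 0 dB.
    """
    if level_db <= knee_db:
        return max_gain_db
    gain_db = max_gain_db - (level_db - knee_db) * (1.0 - 1.0 / ratio)
    return max(gain_db, 0.0)


def rms_level_db(block, floor_db=-120.0):
    """Estimate the level of a block of samples as RMS power in dB re full scale."""
    if not block:
        return floor_db
    mean_square = sum(x * x for x in block) / len(block)
    if mean_square <= 0.0:
        return floor_db
    return max(10.0 * math.log10(mean_square), floor_db)


def compress_block(block, **params):
    """Apply the level-dependent gain to one block of samples."""
    gain_db = level_dependent_gain_db(rms_level_db(block), **params)
    factor = 10.0 ** (gain_db / 20.0)  # convert dB to a linear amplitude factor
    return [x * factor for x in block]
```

With these assumed settings, a block at -60 dB receives the full 30 dB of gain, a block at -30 dB receives 15 dB, and a full-scale block passes through unchanged, so a 60 dB input range is mapped onto a 30 dB output range.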
Because hearing loss generally develops gradually, most listeners with hearing difficulties have grown accustomed to their losses. As a result, they often object to the sound quality of entertainment audio when it is processed to compensate for their hearing impairment. Hearing-impaired audiences are more likely to accept the sound quality of compensated audio when it provides a tangible benefit to them, such as when it increases the intelligibility of dialog and narrative or reduces the mental effort required for comprehension. Therefore, it is advantageous to limit the application of hearing loss compensation to those parts of the audio program that are dominated by speech. Doing so optimizes the tradeoff between potentially objectionable sound quality modifications of music and ambient sounds on the one hand and the desirable intelligibility benefits on the other.
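One way to picture limiting compensation to speech-dominated passages is the following Python sketch. The text above does not specify how speech is detected or how the compensation is gated, so everything here is an assumption: the per-block speech confidence in the range 0 to 1 is taken as given from some unspecified classifier, and the function names and the linear scaling of the gain in dB are hypothetical choices.

```python
def gated_gain_db(compensation_gain_db, speech_confidence):
    """Scale a compensation gain (in dB) by a speech confidence in [0, 1].

    Hypothetical gating rule: full compensation when the block is judged to be
    speech (confidence 1), none when it is judged to be music or ambience
    (confidence 0), and a proportional amount in between.
    """
    confidence = min(max(speech_confidence, 0.0), 1.0)  # clamp to [0, 1]
    return confidence * compensation_gain_db


def apply_gated_compensation(blocks, gains_db, speech_confidences):
    """Apply the gated gain to each block of samples.

    blocks, gains_db, and speech_confidences are parallel sequences, one entry
    per block. The confidences are assumed inputs, not computed here.
    """
    processed = []
    for block, gain_db, confidence in zip(blocks, gains_db, speech_confidences):
        factor = 10.0 ** (gated_gain_db(gain_db, confidence) / 20.0)
        processed.append([x * factor for x in block])
    return processed
```

Under this rule, music and ambient sounds with low speech confidence are left close to their original levels, while dialog and narrative receive most or all of the compensation.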