1. Field of the Invention
The present invention generally relates to communication systems used to transmit speech signals. More particularly, the present invention relates to methods for enhancing the intelligibility of speech signals received over a communication network from a far-end telephony terminal for playback at a near-end telephony terminal.
2. Background
Various background concepts will now be discussed in reference to an example conventional communication system 100 shown in FIG. 1. Communication system 100 includes a first telephony terminal 102 and a second telephony terminal 104 that are communicatively connected to each other via one or more communication network(s) 106. For the purposes of this example, first telephony terminal 102 will be referred to as the “near end” of the network connection and second telephony terminal 104 will be referred to as the “far end” of the network connection. Each telephony terminal may comprise a telephony device, such as a corded telephone, cordless telephone, cellular telephone or Bluetooth® headset.
First telephony terminal 102 operates in a well-known manner to pick up speech signals representing the voice of a near-end user 108 via a microphone 114 and to transmit such speech signals over network(s) 106 to second telephony terminal 104. Second telephony terminal 104 operates in a well-known manner to play back the received speech signals to a far-end user 110 via a loudspeaker 118. Conversely, second telephony terminal 104 operates in a well-known manner to pick up speech signals representing the voice of far-end user 110 via a microphone 116 and to transmit such speech signals over network(s) 106 to first telephony terminal 102. First telephony terminal 102 operates in a well-known manner to play back the received speech signals to near-end user 108 via a loudspeaker 112.
As further shown in FIG. 1, near-end user 108 is using first telephony terminal 102 in an environment that is subject to acoustic background noise. When this acoustic background noise becomes too loud, near-end user 108 may find the voice of far-end user 110 difficult to understand. This is because such loud acoustic background noise will tend to mask or drown out the voice of far-end user 110 that is being played back through loudspeaker 112 of first telephony terminal 102. When this occurs, the natural response of near-end user 108 may be to adjust the volume of loudspeaker 112 (assuming that first telephony terminal 102 includes a volume control button or some other volume control means) so that the volume of the voice of far-end user 110 is increased. However, it is inconvenient for near-end user 108 to have to manually adjust the volume in this manner; it would be far more convenient if first telephony terminal 102 could automatically adjust the volume to the appropriate level in response to an increase in acoustic background noise.
Furthermore, although near-end user 108 may increase the volume of loudspeaker 112, there is typically a limit on how much amplification can be applied to the speech signal received from far-end user 110 before that signal is subject to digital saturation or clipping. Additionally, even when the speech signal received from far-end user 110 has been amplified to a level immediately below that at which clipping occurs, or to a level at which slight clipping occurs, the speech signal may still not be loud enough to be intelligible over the acoustic background noise.
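By way of illustration only, the amplification limit described above can be sketched as follows. The sketch assumes the received speech signal is represented as 16-bit signed PCM samples, which is an assumption made for this example and is not stated above. The function names and sample values are hypothetical.

```python
import math

# Assumed sample format: 16-bit signed PCM (an assumption for this sketch).
INT16_MAX = 32767
INT16_MIN = -32768


def max_gain_before_clipping(samples):
    """Return the largest linear gain that keeps every sample in range."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("inf")  # a silent signal can never clip
    return INT16_MAX / peak


def apply_gain_with_saturation(samples, gain):
    """Scale samples by gain, saturating at the 16-bit limits.

    Returns the scaled samples and the number of samples that were clipped.
    """
    out = []
    clipped = 0
    for s in samples:
        v = int(round(s * gain))
        if v > INT16_MAX:
            v = INT16_MAX
            clipped += 1
        elif v < INT16_MIN:
            v = INT16_MIN
            clipped += 1
        out.append(v)
    return out, clipped


# Hypothetical received samples with a peak magnitude of 16000.
received = [1000, -8000, 16000]
headroom = max_gain_before_clipping(received)
headroom_db = 20.0 * math.log10(headroom)  # roughly 6 dB of headroom

safe, n_safe = apply_gain_with_saturation(received, 2.0)  # within headroom
loud, n_loud = apply_gain_with_saturation(received, 4.0)  # exceeds headroom
```

In this sketch, a gain of 2.0 leaves the waveform intact, whereas a gain of 4.0 forces the largest sample to saturate. Once the peak of the signal reaches full scale, a simple volume control cannot add loudness without clipping.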
Various techniques, such as amplitude compression, have been described in the literature that can be used to increase the loudness of a speech signal subject to a magnitude limit or to make the speech signal more intelligible. However, many of these techniques distort the speech signal.
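As a rough illustration of how amplitude compression can raise loudness under a magnitude limit while altering the waveform, consider the following minimal sketch. It implements a simple static compression curve with make-up gain. It is not drawn from any particular technique in the literature, and the threshold, ratio, limit, and function name are hypothetical values chosen for this example.

```python
def compress(samples, threshold=8000, ratio=4.0, limit=32767):
    """Apply a simple static compression curve followed by make-up gain.

    Magnitudes above `threshold` grow at 1/ratio of their input rate. The
    make-up gain is chosen so that a full-scale input maps back to `limit`.
    All parameter values are illustrative assumptions.
    """
    def curve(x):
        mag = abs(x)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        return mag if x >= 0 else -mag

    makeup = limit / curve(limit)
    return [int(round(curve(s) * makeup)) for s in samples]


# Hypothetical input: one quiet sample, one moderate, one at full scale.
original = [4000, -12000, 32767]
compressed = compress(original)

# Effective gain applied to each sample.
gains = [c / o for c, o in zip(compressed, original)]
```

In this sketch the quiet sample receives a gain of more than 2, while the full-scale sample receives a gain of 1. The overall loudness therefore rises without exceeding the magnitude limit. Because the gain depends on the instantaneous amplitude, however, the shape of the waveform changes, which is one form of the distortion noted above.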
What is needed, therefore, is a speech intelligibility enhancement (SIE) system and method that improves the intelligibility of a speech signal received over a communication network from a far-end telephony terminal for playback at a near-end telephony terminal when the near-end terminal is located in an environment with loud acoustic background noise. The desired SIE system and method should function automatically without any user input and also achieve improved intelligibility while minimizing distortion to the received speech signal.