1. Technical Field of the Invention
The present invention relates in general to the mobile communications field and, in particular, to a method and system for improving voice signals in a mobile station receiver.
2. Description of Related Art
In a cellular telephone communications link, voice signals are compressed in order to improve the intelligibility of the signals being received in the presence of high local background noise. In conventional analog cellular communications systems, such as the Advanced Mobile Phone System (AMPS), compression is used for speech transmissions over the radio link. In conventional digital cellular communications systems, such as the Digital-AMPS (D-AMPS), compression is not used for speech transmissions over the radio link.
FIG. 1 is a simplified block diagram that illustrates speech compression in a conventional analog cellular communications system. For example, the system shown in FIG. 1 can represent a base transceiver station (BTS) and a mobile station in the AMPS. As shown, on the transmitter side 10 of the BTS, a 2:1 compressor 12 (referred to as a compander in AMPS) produces a 1 dB increase at its output for each 2 dB increase of the speech signal at its input. As such, speech companding on the transmitter side limits the frequency deviation of the transmitted carrier, constrains the energy to a finite channel bandwidth, and creates a quieting effect during a speech burst. On the receiver side 20 of the mobile station, an expander 26 decompresses the received signal, and the input speech signal is restored with a minimum of distortion.
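The 2:1 level mapping described above can be sketched in the decibel domain as follows. This is only an illustrative sketch, not the AMPS compander circuit: the function names, the `ratio` parameter, and the unaffected reference level `ref_db` (the level that passes through unchanged) are assumptions introduced for the example.

```python
def compress_db(level_db, ratio=2.0, ref_db=0.0):
    """Compress a level given in dB.

    Each `ratio` dB change at the input yields a 1 dB change at the
    output, measured about the (assumed) unaffected level ref_db.
    """
    return ref_db + (level_db - ref_db) / ratio


def expand_db(level_db, ratio=2.0, ref_db=0.0):
    """Inverse of compress_db: each 1 dB change at the input yields
    a `ratio` dB change at the output."""
    return ref_db + (level_db - ref_db) * ratio


if __name__ == "__main__":
    # Levels are relative to the assumed reference (0 dB), not dB SPL.
    for level in (-40.0, -20.0, 0.0):
        sent = compress_db(level)      # level on the radio link
        restored = expand_db(sent)     # level after the receiver expander
        print(f"in {level:6.1f} dB -> link {sent:6.1f} dB -> out {restored:6.1f} dB")
```

In this idealized model the expander exactly undoes the compressor, which corresponds to the restoration "with a minimum of distortion" noted above; a real compander also has attack and release time constants that are not modeled here.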
A significant problem addressed by the present invention is related to the use of speech compression in communications systems, and the intelligibility of received speech signals in noisy environments. For example, cellular radio mobile stations are commonly operated in relatively high ambient noise level environments. Typical operating environments can include the interior of a moving automobile or a congested sidewalk along a busy thoroughfare. The ambient noise levels experienced in such environments easily range from 60 to 80 dB SPL (sound pressure level referenced to 20×10^-6 Pascals). Such noise levels can mask the desired speech content in the transmitted signals. Telephone usability studies disclose that people consider ambient noise levels of 60 dB SPL and higher to cause a substantial amount of speech interference. For the same people, ambient noise levels of 80 dB SPL and higher make telephone usage impossible. By compressing the dynamic range of speech signals in a mobile station's receiver, all useful speech information can be maintained above a much higher noise floor without exceeding maximum usable limits.
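For orientation, the dB SPL figures above can be related to sound pressure using the standard definition of sound pressure level, 20·log10(p/p_ref), with the 20 micropascal reference stated in the text. The helper names below are assumptions introduced for this sketch.

```python
import math

P_REF_PA = 20e-6  # reference RMS pressure in Pascals (20 micropascals)


def pascals_to_db_spl(pressure_pa):
    """Convert an RMS sound pressure in Pascals to dB SPL."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


def db_spl_to_pascals(level_db_spl):
    """Convert a level in dB SPL back to RMS sound pressure in Pascals."""
    return P_REF_PA * 10.0 ** (level_db_spl / 20.0)


if __name__ == "__main__":
    # The ambient noise range and the overload limit discussed in the text.
    for level in (60.0, 80.0, 95.0):
        print(f"{level:5.1f} dB SPL = {db_spl_to_pascals(level):.4f} Pa")
```

Because the scale is logarithmic, the 20 dB spread between 60 and 80 dB SPL corresponds to a tenfold increase in sound pressure.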
Typically, as noise levels increase, the volume of the received speech signals can be increased to maintain the signal intelligence above the noise interference level. With the high ambient noise levels frequently encountered in the operating environments of mobile stations, the conventional solution has been to increase the average or nominal receive speech levels of the mobile stations accordingly. However, there are significant limitations with this approach. For example, distortion due to limiting is introduced into the receive signals at levels of about 120 dB SPL. This non-linear distortion sounds quite unpleasant to a user, and it also diminishes the effectiveness of certain signal processing techniques that rely on linear signals, such as echo cancellation. Additionally, there is a 95 dB SPL hearing overload limit across the frequency spectrum. This overload limit is the sound level above which (at each frequency) a listener's hearing no longer responds to any further increase in level.
In a specific frequency band, a speech signal has a 30 dB dynamic range within the full intelligible speech range. In this 30 dB dynamic range, the signal peaks are about 12 dB above the average level, with information extending to 18 dB below the average level. Considering the frequency content of speech, the dynamic range of speech with full intelligibility is about 45 dB. The problem with increasing the receive volume as the ambient noise levels increase is that a point can be reached where the speech intelligibility is diminished due to losses in the dynamic range at the upper end. Assuming (for simplicity) that the frequency content of speech and noise is flat, the ability to maintain maximum intelligibility begins to decrease at an ambient noise level of about 65 dB SPL. That is, the 95 dB SPL overload point minus the 30 dB speech dynamic range gives a speech noise interference floor of 65 dB SPL.
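The level arithmetic above can be worked through with a short sketch. It uses the flat-spectrum simplification from the text and additionally assumes that the receive volume is raised just far enough to place the lowest speech information at the ambient noise level. The function and constant names are hypothetical.

```python
OVERLOAD_DB_SPL = 95.0     # hearing overload limit from the text
PEAK_ABOVE_AVG_DB = 12.0   # speech peaks relative to the average level
INFO_BELOW_AVG_DB = 18.0   # lowest speech information relative to average


def level_budget(noise_db_spl):
    """Return (average level, peak level, dB lost above overload).

    Assumes the volume is set so the lowest speech information sits
    exactly at the ambient noise level (an assumption of this sketch).
    """
    average = noise_db_spl + INFO_BELOW_AVG_DB
    peak = average + PEAK_ABOVE_AVG_DB
    lost_at_top = max(0.0, peak - OVERLOAD_DB_SPL)
    return average, peak, lost_at_top


if __name__ == "__main__":
    for noise in (60.0, 65.0, 70.0, 80.0):
        avg, peak, lost = level_budget(noise)
        print(f"noise {noise:4.1f}: average {avg:5.1f}, peak {peak:5.1f}, "
              f"lost above overload {lost:4.1f} dB")
```

Under these assumptions, the speech peaks just reach the 95 dB SPL overload point at a 65 dB SPL noise level; at higher noise levels, each additional dB of noise pushes one more dB of the 30 dB range above the overload point.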
Speech content in a signal that is useful for intelligibility purposes is the speech information maintained between the noise floor and the overload point. The amount of useful speech information can be approximated by using the arithmetic average of the speech energy in the three octave bands of 600-1200, 1200-2400, and 2400-4800 Hz. The speech articulation index is the useful speech content expressed as a percentage of the total possible speech information that is of importance for intelligibility. The speech articulation index can be approximated by dividing the useful speech content (in dB) by 30 dB, and multiplying by 100. If the ambient noise level is low enough to allow all speech information to be kept above the noise floor and below the 95 dB SPL overload point, the articulation index is 100%. However, if (because of a high ambient noise level) the useful speech content is contained in a 15 dB range, the articulation index is (15 dB / 30 dB) × 100, which is 50%. For proper intelligibility of a speech signal, the articulation index should be at least 30% and preferably above 60%. Using the speech frequency spectrum for an average male user (see FIG. 2), these percentages correspond to approximately 86 dB SPL and 75 dB SPL noise levels, respectively. At a 65 dB SPL noise level, which is just about where speech interference has been reported to begin, the best expected articulation index is approximately 84%.
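The articulation index approximation described above (useful speech content divided by 30 dB, times 100) can be sketched as follows. The sketch assumes a flat speech and noise spectrum and a volume setting that places the speech peaks at the overload point, so its results will not match the figures quoted above that were derived from the average male speech spectrum of FIG. 2. The function name and parameters are hypothetical.

```python
def articulation_index(noise_db_spl, overload_db_spl=95.0, speech_range_db=30.0):
    """Approximate articulation index in percent.

    Useful speech content is the part of the speech dynamic range that
    fits between the noise floor and the overload point (flat-spectrum
    assumption; peaks are assumed to sit at the overload point).
    """
    window_db = overload_db_spl - noise_db_spl
    useful_db = min(speech_range_db, max(0.0, window_db))
    return useful_db * 100.0 / speech_range_db


if __name__ == "__main__":
    for noise in (60.0, 65.0, 80.0, 86.0, 95.0):
        print(f"noise {noise:4.1f} dB SPL -> AI {articulation_index(noise):5.1f}%")
```

With these simplifications, a noise level of 80 dB SPL leaves a 15 dB window and gives the 50% value used in the example above. The 30% threshold falls at 86 dB SPL, in line with the text, whereas the 60% threshold falls at 77 dB SPL rather than the approximately 75 dB SPL obtained from the FIG. 2 spectrum.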
Sound engineers have long confronted the problem of limited dynamic range for speech. The broadcasting, sound reinforcement, and recording environments all have limited usable dynamic ranges when compared to the source content. A great deal of sound information would be lost if this limited dynamic range problem were left unresolved. The conventional solution of sound engineers to this problem is to compress the audio signal. This compression allows the wide dynamic range source signal to fit within the more limited dynamic range of the transmission or storage medium being used.