1. Field of the Invention
The present invention relates to a received voice processing apparatus. More particularly, the present invention relates to a received voice processing apparatus for clarifying received voice in a cellular phone.
2. Description of the Related Art
In recent years, cellular phones have become widespread. FIG. 1 is a block diagram of an example of a receiving part of a conventional cellular phone. A signal received by an antenna 10 is tuned by an RF transmit/receive part 12. After that, a baseband signal processing part 14 converts the signal into a baseband signal. Then, a voice decoding part 16 decodes the signal into a received voice signal, and an amplifier 18 amplifies the signal so that voice is reproduced from a speaker 20.
As the voice decoding part 16, a device that efficiently compresses and decompresses a voice signal by using digital signal processing can be used. For example, a CS-ACELP (Conjugate Structure-Algebraic CELP) decoder can be used. Alternatively, a VSELP (Vector Sum Excited Linear Prediction) decoder, an ADPCM decoder, a PCM decoder, or the like can be used.
The cellular phone is often used outdoors. Thus, there are many cases in which received voice cannot be heard well because the level of surrounding noise, such as traffic noise, is high. This phenomenon occurs due to a masking effect of the surrounding noise. That is, low-level voice cannot be heard well, and clearness of voice decreases due to the masking effect.
On the voice sending side, a noise canceler is implemented for removing the surrounding noise. However, as for the received voice, no effective measure is taken. Thus, a user of the cellular phone cannot hear the voice of the party on the other end well in a noisy environment. Conventionally, to hear the voice well, the user adjusts the volume of the received voice manually.
Some methods have been devised for automatically adjusting the received voice according to surrounding noise, so that it is not necessary for the user to change the volume of the received voice. For example, Japanese laid-open patent application No. 9-130453 discloses a method for adjusting the volume of the received voice according to surrounding noise, and also discloses a method relating to the speed at which the volume of the voice is increased or decreased.
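The general idea of adjusting the received volume according to surrounding noise, with a limit on how fast the volume is increased or decreased, can be illustrated by the following minimal sketch. It is not the method of the cited application; the function names, the noise thresholds (50 dB and 80 dB), the maximum boost (12 dB), and the per-frame step sizes are all assumptions chosen only for illustration.

```python
def target_gain_db(noise_db, quiet_db=50.0, loud_db=80.0, max_boost_db=12.0):
    """Map a measured noise level to a desired volume boost in dB.

    All thresholds are hypothetical values, not taken from the cited application.
    """
    if noise_db <= quiet_db:
        return 0.0
    if noise_db >= loud_db:
        return max_boost_db
    # Linear interpolation between "no boost" and "maximum boost".
    return max_boost_db * (noise_db - quiet_db) / (loud_db - quiet_db)


def step_gain(current_db, target_db, up_step_db=0.5, down_step_db=0.25):
    """Move the current gain toward the target by at most one step per frame."""
    if target_db > current_db:
        return min(current_db + up_step_db, target_db)
    return max(current_db - down_step_db, target_db)


def track_gain(noise_levels_db):
    """Return the gain applied in each frame for a sequence of noise levels."""
    gain = 0.0
    history = []
    for noise_db in noise_levels_db:
        gain = step_gain(gain, target_gain_db(noise_db))
        history.append(gain)
    return history
```

In this sketch the gain rises faster than it falls, which is one possible design choice for avoiding abrupt drops in volume when the noise briefly subsides.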
In a method disclosed in Japanese laid-open patent application No. 8-163227, to prevent the level of voice from being erroneously measured due to voice input from the microphone, a means for discriminating between voice and non-voice is provided, so that the accuracy of level measurement is increased. However, only the volume of the received voice is adjusted in this method, and frequency characteristics of voice are not considered.
In Japanese laid-open patent applications No. 5-284200 and No. 8-265075, the tone of the received voice is changed according to surrounding noise, and the range of voice that is reproduced is adjusted. In addition, in Japanese laid-open patent application No. 2000-349893, a masking amount of voice is calculated from surrounding noise, and then a voice emphasizing process is performed.
However, the above-mentioned methods have the following problems.
As for Japanese laid-open patent applications No. 9-130453 and No. 8-163227, in which only automatic adjustment of the volume of the received voice is performed, distortion can be expected to occur when the voice is greatly amplified, which causes user discomfort. In addition, clearness is not improved to a sufficient degree.
As for Japanese laid-open patent applications No. 5-284200 and No. 8-265075, in which the tone is changed and the voice range is restricted, the voice quality is changed, so the user may find that the voice sounds unnatural. Thus, clearness is not improved to a sufficient degree.
Japanese laid-open patent application No. 2000-349893 deals with voice recorded in a recording medium, and does not deal with real-time processing. In addition, since the voice emphasizing processing is conventional band-division type dynamic range compression processing, there is a problem that accompanies band division. That is, a different compression process is performed on each band of the voice signal, and the compressed band signals are then expanded and synthesized. Thus, the user may find that the voice sounds unnatural due to discontinuity between bands.
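Band-division type dynamic range compression in general can be illustrated by the following minimal sketch, which splits a signal into two bands, applies a different compression gain to each band, and synthesizes the result by summation. It is not the processing of the cited application; the one-pole band-splitting filter, the threshold, the compression ratios, and all function names are assumptions chosen only for illustration.

```python
import math


def split_two_bands(x, alpha=0.2):
    """Split x into a low band (one-pole low-pass) and the complementary high band."""
    low = []
    state = 0.0
    for s in x:
        state += alpha * (s - state)
        low.append(state)
    # The high band is the remainder, so low + high reproduces x.
    high = [s - l for s, l in zip(x, low)]
    return low, high


def compress_band(band, threshold=0.1, ratio=2.0):
    """Apply one static gain to a band, derived from the band's RMS level."""
    rms = math.sqrt(sum(s * s for s in band) / len(band)) if band else 0.0
    if rms <= threshold:
        return list(band), 1.0
    # Above the threshold, the level in dB grows only 1/ratio as fast.
    target = threshold * (rms / threshold) ** (1.0 / ratio)
    gain = target / rms
    return [s * gain for s in band], gain


def band_division_drc(x, ratios=(2.0, 4.0)):
    """Compress each band with its own ratio, then sum the bands."""
    low, high = split_two_bands(x)
    low_c, gain_low = compress_band(low, ratio=ratios[0])
    high_c, gain_high = compress_band(high, ratio=ratios[1])
    y = [a + b for a, b in zip(low_c, high_c)]
    return y, (gain_low, gain_high)
```

Because each band receives its own gain, the spectral balance of the synthesized signal differs from that of the input whenever the two gains differ, which is the kind of band-to-band discontinuity described above.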