1. Field of the Invention
The present invention relates to a speech signal processing apparatus for improving the intelligibility of a speech signal, and to a feature extracting circuit used in the apparatus.
2. Description of the Related Art
FIG. 9 shows a basic configuration of a conventional speech signal processing apparatus. The speech signal processing apparatus includes an amplifier 101 for amplifying a speech signal, a gap detector 102 for detecting a silence component, an envelope follower 103 for following an envelope of the speech signal, a zero crossing detector 104 for determining the zero crossing frequency of the speech signal, and a differentiator 105 for determining the rate of change in the envelope of the speech signal. The speech signal processing apparatus further includes a one-shot mono/multivibrator 106 which generates a pulse on the basis of the outputs from the gap detector 102, the differentiator 105, and the zero crossing detector 104 so as to control the amplifier 101.
The operation of such a conventional speech signal processing apparatus will be described with reference to FIGS. 10A to 10C. FIG. 10A shows a waveform of an input speech signal. The input speech signal is sent to the amplifier 101, the gap detector 102, the envelope follower 103, and the zero crossing detector 104. The gap detector 102 detects a silence component of the received speech signal and outputs the result to the one-shot mono/multivibrator 106. The envelope follower 103 follows an envelope of the received speech signal and outputs the result to the differentiator 105. The differentiator 105 determines the rate of change in the envelope and outputs the result to the one-shot mono/multivibrator 106. The zero crossing detector 104 determines the zero crossing frequency of the received speech signal and outputs the result to the one-shot mono/multivibrator 106. Based on the outputs from the gap detector 102, the differentiator 105, and the zero crossing detector 104, the one-shot mono/multivibrator 106 generates a pulse having a waveform as shown in FIG. 10B. The pulse is generated when a silence component of the speech signal shifts to a sound component thereof and lasts until both the zero crossing frequency and the rate of change in the envelope become sufficiently high. The pulse generated by the one-shot mono/multivibrator 106 is sent to the amplifier 101. On receipt of the pulse, the amplifier 101 amplifies the input speech signal with a predetermined amount of gain, and outputs an amplified speech signal having a waveform as shown in FIG. 10C. When no pulse is sent to the amplifier 101, the original speech signal input to the amplifier 101 is output therefrom with a gain of 1, i.e., without any amplification.
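For illustration only, the signal flow described above may be sketched in software as follows. This is a minimal frame-based approximation and not the circuit of FIG. 9. The function names, the frame length, all threshold values, and the gain value are hypothetical, since the description gives no numerical values. Ending the pulse when the signal returns to silence is an additional assumption not stated in the description.

```python
def frame_features(frame, prev_envelope):
    """Return (envelope, envelope_slope, zero_crossings) for one frame.

    envelope       : peak magnitude, standing in for the envelope follower 103
    envelope_slope : frame-to-frame change, standing in for the differentiator 105
    zero_crossings : sign changes, standing in for the zero crossing detector 104
    """
    envelope = max(abs(x) for x in frame)
    envelope_slope = envelope - prev_envelope
    zero_crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )
    return envelope, envelope_slope, zero_crossings


def process(samples, frame_len=4, silence_level=0.05,
            zc_min=2, slope_min=0.1, gain=2.0):
    """Amplify frames while the control pulse is active (all values assumed)."""
    out = []
    pulse = False        # state of the one-shot output (multivibrator 106)
    prev_silent = True   # previous gap detector 102 decision
    prev_env = 0.0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        env, slope, zc = frame_features(frame, prev_env)
        silent = env < silence_level
        if prev_silent and not silent:
            # A silence component shifts to a sound component: start the pulse.
            pulse = True
        elif pulse and (silent or (zc >= zc_min and slope >= slope_min)):
            # Both the zero crossing count and the envelope slope are high
            # enough (or, by assumption, the signal fell silent): end the pulse.
            pulse = False
        # Amplifier 101: predetermined gain during the pulse, gain of 1 otherwise.
        out.extend(x * (gain if pulse else 1.0) for x in frame)
        prev_silent, prev_env = silent, env
    return out


if __name__ == "__main__":
    signal = [0.0] * 4 + [0.1] * 4 + [0.5, -0.5, 0.5, -0.5]
    print(process(signal))
```

In this sketch only the frame at the onset of sound is amplified; once the zero crossing count and the envelope slope both reach their assumed thresholds, the signal passes through with a gain of 1.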
Such a conventional speech signal processing apparatus can detect fricatives, but has difficulty detecting consonants with a short burst and a small amplitude, such as plosives. Further, plosives have voice onset times (VOTs) which differ from one another. Such VOTs cannot be detected by the conventional speech signal processing apparatus. As a result, the amplifier 101 cannot amplify each consonant for its specific duration, because the amplification time cannot be correctly controlled to correspond to the duration of the consonant. Furthermore, when a fricative is only partially amplified, a sound different from the original may be produced.