1. Field of the Invention
The present invention relates to a vocoder, and more particularly, to a pitch estimation method in a multiband excitation (MBE) vocoder.
2. Description of the Related Art
A vocoder, or voice encoder, is a device for compressing a voice signal in a communications network. Speech quality therefore depends considerably on the performance of the voice encoder.
The speech quality is determined by two elements. One is the restored tone quality of the voice encoder and the other is the delay time required to restore that tone quality. In particular, when the delay time is long, echoes are generated and speech is not smooth. Therefore, the voice encoder is required to restore the tone quality with a low delay.
Recently, the MBE method has been widely used in voice encoders with a low transmission rate (in general, 1 through 4 kbit/s). The MBE method is widely known to reproduce high tone quality at a low transmission rate. However, outside of satellite communications, its long delay time makes the MBE method difficult to use, for example in a terrestrial cellular network. The long delay time in the MBE method is caused by the pitch estimation process.
In general, two kinds of errors are considered in the process of estimating the pitch of the voice signal: a gross pitch error and a fine pitch error. The gross pitch error is generated when the difference between the original pitch and the estimated pitch is considerably large. Such is the case when the estimated pitch is double the original pitch (pitch doubling) or half the original pitch (pitch halving). The fine pitch error is generated by the limited resolution of the pitch estimate.
In the conventional MBE vocoder, the problem of the fine pitch error is solved by searching for a fractional pitch using spectral analysis-by-synthesis.
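The distinction between the two kinds of error can be illustrated with a minimal sketch. The function name `classify_pitch_error` and the 20% tolerance are assumptions chosen for illustration only; the text above does not specify a numerical boundary between a gross and a fine pitch error.

```python
def classify_pitch_error(original_pitch: float, estimated_pitch: float,
                         tolerance: float = 0.2) -> str:
    """Label an estimate relative to the original pitch (both in samples).

    The tolerance is a hypothetical relative threshold, not a value taken
    from the text.
    """
    if original_pitch <= 0 or estimated_pitch <= 0:
        raise ValueError("pitch values must be positive")
    ratio = estimated_pitch / original_pitch
    if abs(ratio - 1.0) <= tolerance:
        # Close to the original pitch: only a resolution-limited error remains.
        return "fine"
    if abs(ratio / 2.0 - 1.0) <= tolerance:
        # Estimate is roughly twice the original pitch.
        return "gross (pitch doubling)"
    if abs(ratio * 2.0 - 1.0) <= tolerance:
        # Estimate is roughly half the original pitch.
        return "gross (pitch halving)"
    return "gross"


print(classify_pitch_error(40.0, 40.5))  # fine
print(classify_pitch_error(40.0, 80.0))  # gross (pitch doubling)
print(classify_pitch_error(40.0, 20.0))  # gross (pitch halving)
```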
In pitch estimation by spectral analysis-by-synthesis, the estimated pitch T* is obtained by minimizing the error amount ζ(T) with respect to a given magnitude spectrum |S(ω)|:
[EQUATION 1]
T* = arg min {ζ(T)} [EQUATION 2]
wherein |S(ω,T)| and B(T) are, respectively, the magnitude spectrum synthesized from each pitch candidate T in a predetermined pitch range and a biasing value of the error amount.
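The search over pitch candidates can be sketched as follows. This is a toy illustration under stated assumptions, not the vocoder's actual procedure: the synthesis step (keeping the magnitude only at harmonic bins), the normalized squared error, and the linear penalty `bias_weight * pitch` standing in for B(T) are all hypothetical, as are every function name and the FFT size.

```python
from typing import Iterable, List, Sequence


def harmonic_bins(pitch: float, fft_size: int) -> List[int]:
    """Spectrum bins nearest to the harmonics of a pitch period in samples."""
    spacing = fft_size / pitch  # fundamental frequency expressed in bins
    bins = []
    k = 1
    while int(k * spacing + 0.5) <= fft_size // 2:
        bins.append(int(k * spacing + 0.5))
        k += 1
    return bins


def synthesize(magnitude: Sequence[float], pitch: float, fft_size: int) -> List[float]:
    """Toy synthesis (assumption): keep harmonic bins, zero everything else."""
    keep = set(harmonic_bins(pitch, fft_size))
    return [m if b in keep else 0.0 for b, m in enumerate(magnitude)]


def spectral_error(magnitude: Sequence[float], pitch: float, fft_size: int) -> float:
    """Normalized squared difference between the given and synthesized spectra."""
    synth = synthesize(magnitude, pitch, fft_size)
    total = sum(m * m for m in magnitude)
    residual = sum((m - s) ** 2 for m, s in zip(magnitude, synth))
    return residual / total


def estimate_pitch(magnitude: Sequence[float], candidates: Iterable[int],
                   fft_size: int, bias_weight: float = 0.001) -> int:
    """Return the candidate that minimizes the biased error amount.

    The additive linear bias is a hypothetical stand-in for B(T).
    """
    def biased_error(pitch: int) -> float:
        return spectral_error(magnitude, pitch, fft_size) + bias_weight * pitch

    return min(candidates, key=biased_error)


FFT_SIZE = 240      # assumed analysis size
TRUE_PITCH = 40     # pitch period in samples used to build the test spectrum

# Idealized magnitude spectrum: unit peaks at the harmonics of TRUE_PITCH.
magnitude = [0.0] * (FFT_SIZE // 2 + 1)
for b in harmonic_bins(TRUE_PITCH, FFT_SIZE):
    magnitude[b] = 1.0

print(estimate_pitch(magnitude, range(20, 121), FFT_SIZE))  # 40
```

In this toy setup the unbiased error is zero not only at a period of 40 samples but also at 80, because the harmonics of the longer period include every harmonic of the shorter one. The bias term is what separates the two candidates here.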
According to spectral analysis-by-synthesis, a correct pitch estimation can be performed as shown in FIG. 2B with respect to an input voice having a long pitch section as shown in FIG. 2A (the circled portion indicates the position of the estimated pitch). However, as shown in FIG. 3B, it is difficult to correctly estimate the pitch of a voice having a short pitch section and a considerably high period, as shown in FIG. 3A, since the error amounts are similar at integer multiples of the pitch. Therefore, pitch estimation by conventional spectral analysis-by-synthesis is very likely to cause the gross pitch error and to deteriorate the quality of the restored voice.
In order to overcome this problem, a pitch tracking method is used in the MBE vocoder employing conventional spectral analysis-by-synthesis. However, since the pitch tracking method requires a long look-ahead (in general, 80 ms), it is difficult to use the conventional MBE vocoder as a low-delay encoder.
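To give a sense of scale for the look-ahead, the following sketch converts it into samples and adds it to a frame length. The 8 kHz sampling rate and the 20 ms frame are assumptions chosen for illustration and are not stated in the text; only the 80 ms look-ahead comes from the description above.

```python
def look_ahead_samples(look_ahead_ms: int, sample_rate_hz: int) -> int:
    """Number of future samples the encoder must buffer before it can proceed."""
    return look_ahead_ms * sample_rate_hz // 1000


def algorithmic_delay_ms(frame_ms: int, look_ahead_ms: int) -> int:
    """Buffering delay of the encoder: one full frame plus the look-ahead."""
    return frame_ms + look_ahead_ms


SAMPLE_RATE_HZ = 8000   # assumed telephone-band sampling rate
FRAME_MS = 20           # assumed frame length
LOOK_AHEAD_MS = 80      # look-ahead of the pitch tracking method

print(look_ahead_samples(LOOK_AHEAD_MS, SAMPLE_RATE_HZ))  # 640
print(algorithmic_delay_ms(FRAME_MS, LOOK_AHEAD_MS))      # 100
```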