1. Field of the Invention
This invention relates to a speech encoding method and apparatus in which an input speech signal is split on the time axis in terms of a pre-set block as an encoding unit and encoded from one such encoding unit to another. The invention also relates to a pitch detection method employing the speech encoding method and apparatus.
2. Description of the Related Art
Up to now, there are known a variety of encoding methods for performing signal compression by exploiting statistic properties in the time domain and frequency domain of audio signals, inclusive of speech and acoustic signals, and psychoacoustic properties of the human being. These encoding methods are roughly classified into encoding in the time domain, encoding in the frequency domain and analysis-synthesis encoding.
Among the techniques for high-efficiency encoding of speech signals, there are known sinusoidal analysis encoding, such as harmonic encoding or multi-band excitation (MBE) encoding, sub-band coding (SBC), linear predictive coding (LPC), discrete cosine transform (DCT), modified DCT (MDCT) and fast Fourier transform (FFT).
Meanwhile, in the sinusoidal synthetic encoding generating excitation signals using the pitch of an input speech signal as a parameter, pitch detection plays an important role. With a pitch detection method employing an autocorrelation method used in a conventional speech signal encoding circuit and which is seasoned with a fractional search with the sample shifting of not more than one sample for improving pitch detection accuracy, if the half-pitch or the double pitch exhibits stronger correlation than the pitch desired to be detected in the speech signal, these half-pitch or the double pitch tend to be detected by error. Also, since there is no significant pitch in the unvoiced portion of the speech signal, the results of pitch detection of the unvoiced portion occasionally leads to failure in pitch detection.