1. Field of the Invention
This invention relates to a method and apparatus for reproducing speech signals at a controlled speed, to a method and apparatus for decoding speech, and to a method and apparatus for synthesizing speech, whereby pitch conversion can be realized with a simplified structure. The present invention also relates to a portable radio terminal device for transmitting and receiving pitch-converted speech signals.
2. Description of the Related Art
There are a variety of encoding methods that compress an audio signal (including speech and other acoustic signals) by exploiting statistical properties of the signal in the time domain and in the frequency domain, as well as the psychoacoustic characteristics of the human ear. The encoding methods may roughly be classified into time-domain encoding, frequency-domain encoding, and analysis/synthesis encoding.
Examples of high-efficiency encoding of speech signals include sinusoidal analysis encoding, such as harmonic encoding or multi-band excitation (MBE) encoding, sub-band coding (SBC), linear predictive coding (LPC), discrete cosine transform (DCT), modified DCT (MDCT), and fast Fourier transform (FFT).
High-efficiency speech encoding by time-axis processing, for example code excited linear prediction (CELP) encoding, however, involves difficulties in real-time conversation. Voluminous processing operations must be performed to decode the signal for output. Moreover, since speed control is performed in the time domain subsequent to decoding, that method cannot be used for bit rate conversion.
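By way of illustration only, speed control applied in the time domain to an already-decoded signal may be sketched as a plain overlap-add (OLA) procedure. The sketch below is an assumption made for explanation and is not taken from any particular decoder; the names `hann` and `ola_time_scale` and the frame length of 256 samples are hypothetical. Plain OLA does not align waveforms between frames and may therefore introduce audible artifacts.

```python
import math

def hann(n):
    """Periodic Hann window of length n (sums to 1 at 50% overlap)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def ola_time_scale(samples, speed, frame=256):
    """Change the playback speed of decoded PCM samples by overlap-add.

    speed > 1 shortens the signal, speed < 1 lengthens it.
    Frames are read every speed * hop samples and written every hop samples.
    Hypothetical helper for illustration only.
    """
    hop = frame // 2                       # synthesis hop, 50% overlap
    window = hann(frame)
    n_frames = max(1, int((len(samples) - frame) / (speed * hop)) + 1)
    out = [0.0] * ((n_frames - 1) * hop + frame)
    for k in range(n_frames):
        src = int(round(k * speed * hop))  # read position in the input
        dst = k * hop                      # write position in the output
        for i in range(frame):
            if src + i < len(samples):
                out[dst + i] += samples[src + i] * window[i]
    return out
```

Every output sample is computed from decoded samples, so this processing is carried out in addition to the decoding itself and leaves the encoded bit stream unchanged.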
Also, many applications require that decoded speech signals be varied only in pitch, while the phonemes of the signal remain unchanged. With the usual speech decoding methods, the decoded speech has to be pitch-converted using pitch control, thus complicating the design of the encoding/decoding means and raising their cost.
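The difficulty of varying only the pitch of a decoded signal may be illustrated by the simplest time-domain approach, namely resampling. The following sketch is an assumption made for explanation; `resample_linear` and `estimate_pitch_hz` are hypothetical helpers. Reading the decoded samples faster raises the pitch, but it shortens the signal by the same factor, so the speech is also played back faster unless further processing is added.

```python
import math

def resample_linear(samples, factor):
    """Read decoded samples `factor` times faster, with linear interpolation.

    At an unchanged output sample rate the pitch is scaled by `factor`
    and the duration by 1 / factor. Hypothetical helper for illustration.
    """
    if not samples:
        return []
    n_out = int((len(samples) - 1) / factor) + 1
    out = []
    for j in range(n_out):
        pos = j * factor
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append((1.0 - frac) * samples[i] + frac * nxt)
    return out

def estimate_pitch_hz(samples, rate):
    """Rough pitch estimate: upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0.0 <= b)
    return crossings * rate / len(samples)

# Usage: a 200 Hz tone sampled at 8 kHz, shifted by a factor of 1.5.
rate = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * n / rate + 0.3) for n in range(rate)]
shifted = resample_linear(tone, 1.5)
pitch_ratio = estimate_pitch_hz(shifted, rate) / estimate_pitch_hz(tone, rate)
duration_ratio = len(shifted) / len(tone)
```

In this usage example `pitch_ratio` comes out close to 1.5 and `duration_ratio` close to 1/1.5, showing that pitch and duration change together under plain resampling.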