This invention relates to a speech encoding method and a device therefor. The speech encoding method or technique is for encoding an input speech signal into an output encoded speech signal. The output encoded speech signal is either for transmission through a transmission channel or for storage in a storage medium.
This invention also relates to a method of decoding the output encoded speech signal into an output speech signal, namely, into a replica of the input speech signal, and to a decoder for use in carrying out the decoding method. The output encoded speech signal is supplied to the decoder as an input encoded speech signal and is decoded into the output speech signal by synthesis.
Speech encoding by adaptive transform coding (ATC) is well known in the art. Adaptive transform coding is described, for example, by N. S. Jayant et al. in the book "DIGITAL CODING OF WAVEFORMS, Principles and Applications to Speech and Video", 1984, PRENTICE-HALL, INC., U.S.A., pages 563-576 of Chapter 12, under the title "12.7 Adaptive Transform Coding of Speech and Images". In the adaptive transform coding of speech, an input speech signal is partitioned or divided into data blocks by using a time window such as a rectangular window. Each of the data blocks is decomposed into a plurality of frequency components by means of an orthogonal transformation such as the Discrete Fourier Transform (DFT), the Discrete Walsh-Hadamard Transform (DWHT), the Discrete Cosine Transform (DCT), the Karhunen-Loeve Transform (KLT), or the like. The frequency components are adaptively quantized or encoded on the basis of the intensity of a spectral envelope of the data block in question, with a quantization bit number (which determines the number of quantization levels) selectively assigned to each frequency component.
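The partitioning and transformation steps described above can be sketched as follows. This is only an illustrative sketch: the function names `split_blocks` and `dct`, the block length, and the choice of the DCT from among the listed transformations are assumptions made here and are not taken from the cited reference.

```python
import math


def split_blocks(signal, block_len):
    """Partition a signal into non-overlapping data blocks.

    Taking consecutive samples unchanged corresponds to a rectangular
    time window. Trailing samples that do not fill a block are dropped.
    """
    return [signal[i:i + block_len]
            for i in range(0, len(signal) - block_len + 1, block_len)]


def dct(block):
    """Decompose one data block into frequency components.

    Uses the orthonormal DCT-II, one of the orthogonal transformations
    named in the text (assumed here for illustration).
    """
    n = len(block)
    coeffs = []
    for k in range(n):
        acc = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, x in enumerate(block))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * acc)
    return coeffs


if __name__ == "__main__":
    speech = [0.0, 0.4, 0.9, 0.3, -0.2, -0.8, -0.5, 0.1]  # made-up samples
    for block in split_blocks(speech, 4):
        print(dct(block))
```

Because the transformation is orthonormal, the energy of each data block equals the energy of its frequency components, which is what allows bits to be assigned per component on the basis of the spectral envelope.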
On the other hand, in decoding, the encoded speech signal is first converted back into the frequency components. The frequency components are then successively composed into the data blocks by the inverse of the orthogonal transformation, and the data blocks are coupled together to produce a replica of the input speech signal.
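The decoding path can be sketched in the same way. The fragment below assumes, for illustration only, that the orthogonal transformation is an orthonormal DCT and omits quantization, so that the replica is exact; the names `dct`, `idct` and `join_blocks` are hypothetical.

```python
import math


def _scale(k, n):
    """Normalisation factor of the orthonormal DCT pair."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(block):
    """Forward orthonormal DCT-II (encoder side, shown for the round trip)."""
    n = len(block)
    return [_scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block))
            for k in range(n)]


def idct(coeffs):
    """Compose frequency components back into one data block (DCT-III)."""
    n = len(coeffs)
    return [sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]


def join_blocks(blocks):
    """Couple consecutive data blocks into one output signal."""
    return [sample for block in blocks for sample in block]


if __name__ == "__main__":
    speech = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, -0.25]  # made-up samples
    encoded = [dct(speech[i:i + 4]) for i in range(0, len(speech), 4)]
    replica = join_blocks([idct(c) for c in encoded])
    print(replica)
```

In an actual coder the frequency components reaching `idct` would be quantized values, so the replica would differ from the input by the quantization error.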
In this connection, a frequency component having a relatively high intensity of the spectral envelope is assigned a quantization bit number indicating many bits, while a frequency component having a relatively low intensity of the spectral envelope is assigned a quantization bit number indicating few bits. It is to be noted that, in a conventional encoder, each frequency component always carries phase information as well as amplitude information. Under the circumstances, when the encoder has a low encoding speed (that is, a low bit rate), too few bits are assigned to the frequency components having a relatively low intensity of the spectral envelope. As a result, a conventional decoder decodes the encoded speech signal produced by the conventional encoder into a replica of the input speech signal that sounds unnatural. This results in degradation of speech quality.
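The shortage of bits for weak frequency components can be illustrated with a simple allocation rule. The sketch below is not the allocation rule of any particular conventional encoder: it assumes a greedy scheme in which each bit goes to the component with the largest remaining envelope power and each assigned bit reduces that power by a factor of four (about 6 dB per bit). The function name `allocate_bits` and the envelope values are hypothetical.

```python
def allocate_bits(envelope_power, total_bits):
    """Greedily assign quantization bits according to spectral envelope power.

    Assumption for illustration: one extra bit lowers the quantization
    noise of a component by a factor of 4, so after each assignment the
    remaining power of that component is divided by 4.
    """
    remaining = list(envelope_power)
    bits = [0] * len(remaining)
    for _ in range(total_bits):
        # Pick the component that currently needs a bit the most.
        k = max(range(len(remaining)), key=lambda i: remaining[i])
        bits[k] += 1
        remaining[k] /= 4.0
    return bits


if __name__ == "__main__":
    envelope = [100.0, 10.0, 1.0, 0.1]  # made-up envelope powers
    print(allocate_bits(envelope, 12))  # larger bit budget per block
    print(allocate_bits(envelope, 4))   # small bit budget per block
```

With the small budget the two weakest components receive no bits at all, so they are not represented in the decoded signal, which corresponds to the insufficient bit assignment described above.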