The present invention relates to a speech signal coding and/or decoding system and, more particularly, to a speech signal coding and/or decoding system using a pattern matching based on LSP (i.e., Line Spectrum Pair) parameters.
In the coded transmission of speech signals, reducing the transmission data bit rate is an important factor in making effective use of transmission lines. A system, in which speech signals are transmitted while being separated into segments of spectral and excitation source information so that the original speech is reproducible on the basis of those segments of information, is frequently used to lower the bit rate of transmission. In a vocoder, for example, LPC, LSP and PARCOR coefficients are adopted as the spectral information of the speech signals whereas voiced/unvoiced discrimination, pitch and residual information are adopted as excitation source information. According to the vocoder, the transmission bit rate of the speech signal can go as low as 4.8 kb/sec, but the reproduced sound quality is not always satisfactory. Essentially, this is because the vocoder does not code the input speech waveform. In order to improve the reproduced speech quality, there has been proposed a multi-pulse type speech signal coding technique which codes and transmits the position and amplitude of a plurality of pulses as speech waveform information. The multi-pulse type speech signal coding technique is disclosed, for example, in B. S. Atal et al., "A New Model of LPC Excitation for Producing Natural Sounding Speech at Low Bit Rates", Proc. ICASSP 82, pp. 614-617 (1982) or in United States Patent Application Ser. No. 565,804, filed Dec. 27, 1983, by Kazunori Ozawa et al. for assignment to the present assignee.
Although the multi-pulse coding technique described above improves the reproduced speech quality, the bit rate required for coding the multi-pulses is usually as high as 9.6 kb/sec.
A pattern matching method has been proposed to make possible a drastic reduction in the data bit rate while improving the reproduced speech quality. In this pattern matching method, each of multiple kinds of reference spectral envelope information (i.e., the reference patterns) prepared in advance is labeled. Pattern matching is then conducted between the spectral information obtained by analyzing an input speech signal (i.e., the input pattern) and each reference pattern to compute the distance between the two, and the label of the reference pattern which is closest to (i.e., at the minimum distance from) the input pattern is coded and transmitted.
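The label selection described above can be illustrated by the following minimal sketch. The function name nearest_label, the three-entry table of reference patterns and the plain squared distance are assumptions made for illustration only; they are not taken from the system described here.

```python
def nearest_label(input_pattern, reference_patterns):
    """Return (label, distance) of the reference pattern closest to the input.

    reference_patterns maps a label to a reference pattern of the same
    length as input_pattern. A plain squared distance is assumed here.
    """
    best_label, best_dist = None, float("inf")
    for label, ref in reference_patterns.items():
        dist = sum((x - r) ** 2 for x, r in zip(input_pattern, ref))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical reference patterns (4 LSP-like frequencies each, in radians).
reference_patterns = {
    0: [0.30, 0.60, 1.10, 1.60],
    1: [0.25, 0.80, 1.40, 2.00],
    2: [0.40, 0.90, 1.70, 2.40],
}

# Only the selected label needs to be coded and transmitted.
label, dist = nearest_label([0.27, 0.78, 1.35, 2.05], reference_patterns)
```

In this sketch only the integer label is sent in place of the full parameter set, which is where the reduction in bits for the spectral information comes from.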
If the pattern matching system described above is used, the number of bits required for transmitting spectral information can be drastically reduced. Despite this fact, however, the pattern matching system has the following problems.
More specifically, in this pattern matching system, the principal parameters used as spectral information are the LSP parameters, which have relatively little pattern matching distortion. The distance between the LSP parameter pattern of the input speech (i.e., the input pattern) and the reference pattern is computed according to an approximate equation using the spectral sensitivity of the LSP parameters (which is defined as the distortion of the spectral envelope when minute changes are independently given to the respective elements of the LSP parameters). It has been experimentally confirmed that the smaller the frequency interval Δω between the respective elements of the LSP parameters becomes, the more inaccurate the spectral sensitivity value becomes. In other words, for a smaller interval Δω, minute changes in the respective elements of the LSP parameters greatly influence the overall spectral envelope, thereby making it difficult to match patterns precisely. This problem arises frequently in practice, because the LSP frequency intervals Δω obtained by the LSP analysis take small values more often than large values.
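The following is a minimal sketch of a sensitivity-weighted distance and of the intervals between adjacent LSP elements. The approximate equation itself is not given in this description, so the weighted squared form below is an assumption, and the names lsp_intervals and weighted_distance are hypothetical.

```python
def lsp_intervals(lsp):
    """Frequency intervals between adjacent elements of an ordered LSP vector."""
    return [b - a for a, b in zip(lsp, lsp[1:])]


def weighted_distance(input_pattern, reference_pattern, sensitivity):
    """Assumed form: sum of squared element differences, each weighted by
    the spectral sensitivity of the corresponding LSP element."""
    return sum(
        w * (x - r) ** 2
        for w, x, r in zip(sensitivity, input_pattern, reference_pattern)
    )


# Hypothetical values: the first two elements are close together, so the
# first interval is small and its sensitivity weights are the least reliable.
intervals = lsp_intervals([0.25, 0.5, 1.0])
d = weighted_distance([1.0, 2.0], [0.0, 0.0], [1.0, 0.5])
```

Under this assumed form, an error in the sensitivity weights of closely spaced elements carries directly into the computed distance, which is consistent with the matching difficulty described above.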