Linear predictive speech encoders which operate in accordance with the aforesaid multipulse principle are known to the art; see, for instance, U.S. Pat. No. 3,624,302 which describes linear predictive encoding of speech signals, and U.S. Pat. No. 3,740,476 which describes how predictive parameters and prediction residual signals can be formed in such a speech encoder.
When an artificial speech signal is formed by means of linear predictive coding, a plurality of predictive parameters (a.sub.k) which characterize the artificial speech signal are generated from the original signal. From these parameters there can thus be formed a speech signal which does not contain the redundancy that is normally included in natural speech and which it is unnecessary to convey in speech transmission between, e.g., a mobile and a base station in a mobile radio system. From the aspect of bandwidth, it is more suitable to transmit solely the predictive parameters than the original speech signal, which requires a much higher bandwidth.
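By way of illustration only, the formation of predictive parameters a.sub.k from one frame of speech samples can be sketched as follows. The sketch assumes the autocorrelation method with the Levinson-Durbin recursion, which is one common way of obtaining such parameters and is not necessarily the method of the cited patents; the function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor s[n] ~= sum_k a[k] * s[n-k].

    Returns (a, error), where a[0] corresponds to a_1 and error is the
    remaining prediction error energy.
    """
    a = [0.0] * (order + 1)          # a[0] is unused; a[k] holds a_k
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:             # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error              # reflection coefficient of stage i
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error


# Usage: a frame that decays by one half per sample is predicted well by a_1 = 0.5.
frame = [0.5 ** n for n in range(40)]
coefficients, residual_energy = levinson_durbin(autocorrelation(frame, 1), 1)
```

Only the few coefficients per frame, rather than every sample of the frame, would then need to be transmitted.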
However, the speech signal thus regenerated in a receiver and constituting a synthetic speech signal may be difficult to understand as a result of a lack of agreement between the speech pattern of the original signal and the synthetic signal regenerated by means of the predictive parameters. These deficiencies have been described in detail in U.S. Pat. No. 4,472,832 (SE-B-456618) and can be alleviated to some extent by introducing so-called excitation pulses (multipulses) when constructing the synthetic speech replica. This is effected by partitioning the original speech input pattern into frame intervals. There is formed within each such interval a determined number of pulses of varying amplitude and phase position (time position) in accordance with the predictive parameters a.sub.k and also in accordance with the prediction residual d.sub.k between the speech input pattern and the speech replica. Each of the pulses is able to influence the speech pattern replica such as to obtain the smallest possible prediction residual. The generated excitation pulses have a relatively low bit rate and can therefore be encoded and transmitted on a narrow band, similar to the predictive parameters. This improves the quality of the regenerated speech signal.
In the aforesaid known method, the excitation pulses are generated within each frame interval of the speech input pattern by weighting the residual signal d.sub.k in one predictive filter and by feeding back the generated values for the excitation pulses and weighting them in another predictive filter. A correlation is then effected between the output signals of the two filters and the correlation is maximized for a number of signal elements from the correlated signal, such as to form the parameters (amplitude and phase position) of the excitation pulses. The advantage afforded by this multipulse algorithm for generating the excitation pulses is that different types of sound can be generated with a small number of pulses (for instance eight pulses/frame interval). The pulse-searching algorithm is general with respect to the pulse positions within the frame. It is possible to regenerate both unvoiced sounds (consonants), which generally require randomly placed pulses, and voiced sounds (vowels), which require the pulses to be positioned more closely together.
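The correlation-maximizing pulse search can be illustrated with the following minimal sketch. It is a simplified greedy search and not the exact algorithm of the cited patent: the weighting filters are omitted, `target` stands for the signal to be matched within one frame, and `h` is an assumed impulse response of the synthesis filter. All names are hypothetical.

```python
def multipulse_search(target, h, num_pulses):
    """Greedy search for pulse positions and amplitudes within one frame.

    In each pass the position whose filter response correlates most strongly
    with the remaining error is selected, and its contribution is subtracted.
    Returns (pulses, remaining), where pulses is a list of (position, amplitude).
    """
    n = len(target)
    # Response to a unit pulse at position m, truncated at the frame boundary.
    responses = [[h[i - m] if 0 <= i - m < len(h) else 0.0 for i in range(n)]
                 for m in range(n)]
    energy = [sum(v * v for v in resp) for resp in responses]
    remaining = list(target)
    pulses = []
    for _ in range(num_pulses):
        best_pos, best_amp, best_score = None, 0.0, 0.0
        for m in range(n):
            if energy[m] == 0.0:
                continue
            corr = sum(remaining[i] * responses[m][i] for i in range(n))
            score = corr * corr / energy[m]   # error reduction if m is chosen
            if score > best_score:
                best_pos, best_amp, best_score = m, corr / energy[m], score
        if best_pos is None:                  # nothing left to explain
            break
        pulses.append((best_pos, best_amp))
        remaining = [remaining[i] - best_amp * responses[best_pos][i]
                     for i in range(n)]
    return pulses, remaining
```

Because every position in the frame is a candidate in every pass, the search places no constraint on where the pulses fall, which is what makes it general with respect to pulse position.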
These known methods calculate the correct phase positions of the excitation pulses within a frame and subsequent frames of the speech signal. Positioning of the pulses, so-called pulse placement, is effected solely in dependence on complex processing of the speech signal parameters (prediction residuals, residual signal and the excitation pulse parameters in preceding frames).
One drawback with the original pulse placement methods according to the aforesaid U.S. patent is that encoding, which is effected subsequent to calculating the pulse positions, is complex with regard to calculations and storage. The encoding also requires a large number of bits for each pulse position within the frame interval. Furthermore, the bits in the code words obtained from the optimal combinatory pulse encoding algorithms are sensitive to bit errors. A bit error in the code word during transmission from the transmitter to the receiver can have disastrous consequences with regard to pulse positioning when decoding in the receiver.
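The bit-error sensitivity of combinatorial position encoding can be illustrated with the sketch below. It assumes that the set of pulse positions is encoded as its rank in the combinatorial number system, which is one way of realizing an optimal combinatory code and is not necessarily the scheme of the cited patent; the function names are hypothetical.

```python
from math import comb


def encode_positions(positions):
    """Rank of a set of distinct pulse positions (combinatorial number system)."""
    return sum(comb(p, i + 1) for i, p in enumerate(sorted(positions)))


def decode_positions(index, num_pulses):
    """Inverse of encode_positions for a known number of pulses."""
    positions = []
    for i in range(num_pulses, 0, -1):
        p = i - 1
        while comb(p + 1, i) <= index:   # largest p with comb(p, i) <= index
            p += 1
        index -= comb(p, i)
        positions.append(p)
    return sorted(positions)


def bits_needed(frame_length, num_pulses):
    """Bits required to index every possible set of pulse positions."""
    return (comb(frame_length, num_pulses) - 1).bit_length()


# Usage: one flipped bit in the code word can yield a different set of positions.
sent = [3, 9, 14, 20, 22, 27, 31, 38]
code_word = encode_positions(sent)
received = decode_positions(code_word ^ (1 << 20), len(sent))
```

With an assumed frame of 40 positions and eight pulses, `bits_needed(40, 8)` gives 27 bits. Since each bit of the code word contributes to the joint rank rather than to a single pulse, a single bit error can displace several pulses at once.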
This can be alleviated by restricting the number of excitation pulses that need be set out in each speech frame. This is made possible by the fact that the number of pulse positions for the excitation pulses within a frame interval is so large as to enable precise positioning of one or more excitation pulses within the frame to be ignored while nevertheless obtaining a regenerated speech signal of acceptable quality after encoding and transmission.
Accordingly, there has been proposed a method (see U.S. Pat. No. 5,193,140) in which certain phase position limitations are introduced when setting out the pulses, by denying a certain number of already determined phase positions to those pulses which succeed an already calculated excitation pulse. When the position of a first pulse in the frame has been calculated and the pulse has been placed in its calculated phase position, this phase position is denied to subsequent pulses in the frame. This rule will preferably apply to all pulse positions in the frame. When commencing the localization of pulses in a new following frame, all positions in the frame are free.
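The phase position limitation can be sketched as follows. The sketch assumes that the phase of a position is its index modulo an assumed number of phases, which is an illustrative convention and is not taken from the cited patent. As a further simplification, the per-position `scores` stand in for whatever search criterion is used and are not recomputed after each pulse. All names are hypothetical.

```python
def place_pulses(scores, num_pulses, num_phases):
    """Select pulse positions, denying each used phase to later pulses."""
    free_phases = set(range(num_phases))      # every phase is free at frame start
    chosen = []
    for _ in range(num_pulses):
        candidates = [m for m in range(len(scores))
                      if m % num_phases in free_phases]
        if not candidates:                    # all phases have been used
            break
        best = max(candidates, key=lambda m: scores[m])
        chosen.append(best)
        free_phases.discard(best % num_phases)
    return chosen


# Usage: position 5 has the second-highest score but shares its phase with
# position 1, so it is skipped once position 1 has been chosen.
scores = [0.1, 0.9, 0.2, 0.3, 0.4, 0.8, 0.7, 0.5]
positions = place_pulses(scores, num_pulses=3, num_phases=4)
```

Each call starts with all phases free, corresponding to the rule that all positions are free when pulse localization commences in a new frame.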
The use of so-called code books in speech encoders when generating the synthetic speech signal has been proposed in recent times; see, for instance, U.S. Pat. No. 4,701,954. This code book stores a number of speech signal code words that are used when creating the synthetic speech replica. The code book may be fixed, i.e. contain permanent code words, or may be adaptive, i.e. can be updated as the speech replica is formed. A combination of a fixed and an adaptive code book may also be used.
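The use of a code book can be illustrated with the following minimal sketch. It assumes that the best code word is the one which, after scaling by a gain, lies closest to a target signal in the squared-error sense, and that an adaptive code book is updated by appending the most recent excitation and discarding the oldest entry. Filtering of the code words through a synthesis filter is omitted, and all names are hypothetical rather than taken from the cited patent.

```python
def search_codebook(target, codebook):
    """Return (index, gain) of the code word that best matches the target."""
    best_index, best_gain, best_score = -1, 0.0, -1.0
    for index, word in enumerate(codebook):
        energy = sum(v * v for v in word)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, word))
        score = corr * corr / energy          # error reduction for this word
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain


def update_adaptive(codebook, new_word, max_size):
    """Return a copy of the code book with new_word added and the oldest dropped.

    Assumes max_size >= 1.
    """
    updated = codebook + [list(new_word)]
    return updated[-max_size:]


# Usage: only the index and the gain of the selected code word are passed on.
fixed = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0], [1.0, -1.0, 1.0, -1.0]]
index, gain = search_codebook([0.0, 2.0, 2.0, 0.0], fixed)
```

A fixed code book would only ever use `search_codebook`, whereas an adaptive code book would additionally call `update_adaptive` as the speech replica is formed.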