The present application is related to U.S. Pat. No. 5,920,832 issued Jul. 6, 1999 and to U.S. Pat. No. 6,014,619 issued Jan. 11, 2000.
The invention is related to a transmission system comprising a transmitter for transmitting an input signal to a receiver via a transmission channel, the transmitter comprising an encoder with an excitation sequence generator for generating a plurality of excitation sequences, and selection means for selecting from the plurality of excitation sequences an excitation sequence resulting in a minimum error between a synthetic signal derived from said excitation sequence and a target signal derived from the input signal, the transmitter being arranged for transmitting a signal representing the selected excitation sequence to the receiver, the receiver comprising a decoder with an excitation sequence generator for deriving the selected excitation sequence from the signal representing the selected excitation sequence, and a synthesis filter for deriving a synthetic signal from the excitation sequence.
The present invention is also related to a transmitter, an encoder, a transmission method and an encoding method.
A transmission system according to the preamble is known from the paper “Codebook searching for 4.8 kbps CELP speech coder” by W. Grieder et al. in Communications, Computers and Power in the Modern Environment Conference proceeding, Saskatoon, Canada, May 17-18, 1993, pp. 397-406, IEEE Wescanex 1993.
Such transmission systems can be used for transmission of speech signals via a transmission medium such as a radio channel, a coaxial cable or an optical fibre. Such transmission systems can also be used for recording of speech signals on a recording medium such as a magnetic tape or disc. Possible applications are automatic answering machines or dictating machines.
In modern speech transmission systems, the speech signals to be transmitted are often coded using the analysis-by-synthesis technique. In this technique, a synthesis filter is excited by each of a plurality of excitation sequences in turn. For each excitation sequence, the resulting synthetic speech signal is determined, together with an error signal representing the error between that synthetic signal and a target signal derived from the input signal. The excitation sequence resulting in the smallest error is selected and transmitted in coded form to the receiver.
In the receiver, the excitation sequence is recovered, and a synthetic signal is generated by applying the excitation sequence to a synthesis filter. This synthetic signal is a replica of the input signal of the transmitter.
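The analysis-by-synthesis loop described above can be sketched as follows. This is a minimal illustration, not the coder of the cited paper: it assumes an all-pole synthesis filter 1/A(z) with A(z) = 1 + a[0]z^-1 + a[1]z^-2 + ..., a plain squared-error measure, and no gain term or perceptual weighting. The names `synthesize`, `encode` and `decode` are hypothetical.

```python
def synthesize(excitation, a):
    """All-pole synthesis filter 1/A(z): y[n] = x[n] - sum_k a[k] * y[n-1-k]."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y


def squared_error(u, v):
    """Sum of squared sample differences between two equal-length signals."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def encode(target, codebook, a):
    """Exhaustive search: index of the excitation whose synthetic signal
    has the smallest squared error with respect to the target."""
    best_index, best_err = None, float("inf")
    for i, excitation in enumerate(codebook):
        err = squared_error(synthesize(excitation, a), target)
        if err < best_err:
            best_index, best_err = i, err
    return best_index


def decode(index, codebook, a):
    """Receiver side: recover the excitation and apply the same synthesis filter."""
    return synthesize(codebook[index], a)
```

Only the index is transmitted in this sketch; the receiver holds the same codebook and filter parameters and regenerates the synthetic signal from them. Note that `encode` runs the synthesis filter once per candidate, which is the cost the invention sets out to reduce.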
In order to obtain a good quality of signal transmission, a large number (e.g. 1024) of excitation sequences are involved in the selection. In the case of speech coding, an excitation sequence is in general a segment with a duration of 2-5 ms. At a sample frequency of 16 kHz, this corresponds to 32-80 samples. The parameters of the synthesis filter are in general derived from analysis parameters which represent characteristic properties of the input signal. In speech coding, the analysis parameters most commonly used are so-called prediction parameters. The number of prediction parameters can vary from 10 to 50, and the order of the synthesis filter varies accordingly.
Having to compute the synthetic speech signal for all excitation sequences results in a substantial computational burden.
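The figures quoted above can be turned into a rough operation count. The sketch below is only an order-of-magnitude estimate under a stated assumption: one multiplication per filter coefficient per output sample for every candidate sequence, ignoring the error computation and any implementation shortcuts. The function names are hypothetical.

```python
SAMPLE_RATE_HZ = 16_000   # sample frequency quoted in the text
NUM_SEQUENCES = 1024      # example codebook size quoted in the text


def samples_per_segment(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples in a segment of the given duration."""
    return sample_rate_hz * duration_ms // 1000


def multiplies_per_segment(num_sequences, segment_len, filter_order):
    """Assumed cost model: filter_order multiplications per output sample,
    repeated for every candidate excitation sequence."""
    return num_sequences * segment_len * filter_order


short_segment = samples_per_segment(2)   # 32 samples
long_segment = samples_per_segment(5)    # 80 samples
low = multiplies_per_segment(NUM_SEQUENCES, short_segment, 10)   # 327,680
high = multiplies_per_segment(NUM_SEQUENCES, long_segment, 50)   # 4,096,000
```

Under this cost model, a full search needs between roughly 0.3 and 4 million multiplications for every 2-5 ms segment.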
The object of the present invention is to provide a transmission system according to the preamble in which the computational burden is substantially reduced.
Therefore the transmission system according to the invention is characterised in that the encoder comprises an analysis filter for deriving a residual sequence from the input signal, and in that the encoder comprises excitation sequence selection means for selecting, from a larger set of excitation sequences, the plurality of excitation sequences having the largest resemblance to the residual sequence.
The invention is based on the recognition that the complexity of the transmission system can be substantially reduced by performing a preselection of the possible excitation sequences using a filtered target signal or residual signal. The excitation sequences selected are those that most resemble the filtered target signal (or residual signal). Experiments have shown that it is possible to reduce the complexity of the coder by a factor varying from 20 to 180 without affecting the quality of the selection procedure.
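The preselection step can be illustrated with a small sketch. It assumes an analysis filter A(z) = 1 + a[0]z^-1 + a[1]z^-2 + ... applied to the input signal, and it assumes that "resemblance" is measured by a plain inner product between a candidate and the residual; the text does not fix the measure, so this choice, like the names `analysis_filter`, `resemblance` and `preselect`, is an assumption made for illustration.

```python
def analysis_filter(signal, a):
    """FIR analysis filter A(z): r[n] = s[n] + sum_k a[k] * s[n-1-k]."""
    residual = []
    for n, s in enumerate(signal):
        acc = s
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * signal[n - 1 - k]
        residual.append(acc)
    return residual


def resemblance(excitation, residual):
    """Assumed similarity measure: inner product with the residual."""
    return sum(e * r for e, r in zip(excitation, residual))


def preselect(codebook, residual, keep):
    """Indices of the `keep` candidates that most resemble the residual."""
    ranked = sorted(
        range(len(codebook)),
        key=lambda i: resemblance(codebook[i], residual),
        reverse=True,
    )
    return ranked[:keep]
```

Ranking the candidates needs no synthesis filtering at all; only the `keep` surviving sequences would then be passed to the full error-minimising search. If the candidates differ in energy, a normalised correlation may be a better measure than the raw inner product used here.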
It is observed that the article “Binary pulse excitation: a novel approach to low complexity CELP coding” by R. A. Salami in the book “Advances in Speech Coding” edited by B. Atal, V. Cupermann and A. Gersho, pp. 145-156, Kluwer Academic Publishers, ISBN 0-7923-9091-1 discloses the construction of a local codebook from a larger codebook. However, this document does not disclose that the excitation sequences are selected in view of their resemblance to the residual signal; instead, they are derived from one selected excitation sequence which is regarded as nearly optimal.
An embodiment of the invention is characterised in that the excitation sequences comprise non-zero sample values separated by a predetermined number of zero sample values, and in that the excitation sequence selecting means are arranged for determining from the residual signal the position of the non-zero sample values in the plurality of excitation sequences.
Using equidistant pulses separated by a predetermined number of zero values results in a reduced computational complexity for filtering the excitation sequences. By first selecting the position of the non-zero samples in the excitation sequences to be considered for further selection, the number of excitation sequences involved in the further selection is reduced substantially. This leads to a substantial decrease in the required computational complexity.
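A possible way of determining the pulse position from the residual signal is sketched below. The text only states that the position is determined from the residual; choosing the grid offset whose residual samples carry the most energy is an assumption made here for illustration, and the names `best_grid_offset` and `sequences_on_grid` are hypothetical. The parameter `spacing` is the distance between successive non-zero samples, i.e. the number of intervening zeros plus one.

```python
def best_grid_offset(residual, spacing):
    """Offset (0 .. spacing-1) of the equidistant grid on which the
    residual has the largest energy (assumed selection criterion)."""
    def energy(offset):
        return sum(x * x for x in residual[offset::spacing])
    return max(range(spacing), key=energy)


def sequences_on_grid(codebook, offset, spacing):
    """Keep only the sequences whose non-zero samples all lie on the grid."""
    return [
        seq for seq in codebook
        if all(v == 0 or (n - offset) % spacing == 0 for n, v in enumerate(seq))
    ]
```

Fixing the grid first removes every sequence with pulses at the other offsets from the subsequent search, so only a fraction of the larger set remains to be examined.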
A further embodiment of the invention is characterised in that the excitation sequences comprise ternary excitation samples, and in that the excitation sequence selecting means are arranged for selecting the excitation sequences in which the sign of the signal samples does not differ from the sign of the corresponding samples in the residual sequence.
Using ternary sample values results in a low computational complexity, because filtering a ternary signal involves only multiplications by +1, 0 or −1, which can easily be performed.
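Both points can be illustrated with a short sketch. It assumes that zero-valued excitation samples place no constraint on the sign test, and that the filter is represented by a truncated impulse response; neither detail is specified in the text, and the names `sign_compatible` and `filter_ternary` are hypothetical.

```python
def sign(x):
    """Return +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_compatible(excitation, residual):
    """True if every non-zero excitation sample has the same sign as the
    residual sample at the same position (zeros are unconstrained)."""
    return all(e == 0 or sign(e) == sign(r) for e, r in zip(excitation, residual))


def filter_ternary(excitation, impulse_response):
    """Filter a ternary (+1/0/-1) sequence using additions and subtractions
    only; the output is truncated to the length of the excitation."""
    n_out = len(excitation)
    out = [0.0] * n_out
    for n, e in enumerate(excitation):
        if e == 0:
            continue                      # a zero sample contributes nothing
        for k, h in enumerate(impulse_response):
            if n + k >= n_out:
                break
            if e > 0:
                out[n + k] += h           # multiplication by +1 is an addition
            else:
                out[n + k] -= h           # multiplication by -1 is a subtraction
    return out
```

For example, `filter_ternary([1, 0, -1, 0], [1.0, 0.5, 0.25])` adds the impulse response at position 0 and subtracts it at position 2, without performing a single multiplication.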