The present invention generally relates to digital cellular communication systems, and more particularly, to a method and apparatus for determining the excitation signal in vector sum excited linear prediction (VSELP) coders used in such systems.
The present invention addresses the code search process that is the heart of all voice coders based upon CELP (code excited linear prediction) processing, and in particular a subgroup of the CELP coder known as a VSELP (vector sum excited linear prediction) coder. The voice coder selected recently as the standard for the digital cellular telecommunication (IS-54) specification is based upon this VSELP process. The IS-54 standard is officially known as the EIA/TIA Interim Standard, "Cellular System Dual-Mode Mobile Station--Base Station Compatibility Standard," published by the Electronic Industries Association.
The only known search method employing VSELP coding is based upon a Motorola code search routine, as stated in the IS-54 standard for the dual-mode digital cellular communication system. The disadvantage of this method is its extensive computation time, which requires a fast, relatively expensive processor to implement.
The computational power needed to implement a conventional coder is about 25 MIPS for the transmitter. This is mainly due to the conventional code search process, which takes up about 47% of the computation time. The main goal in this search is to derive a signal that is a linear combination of a set of basis signals. In order to find the optimal weighting of the basis signals, the conventional search process scans all possible weightings and selects the linear combination that satisfies a given criterion.
More particularly, speech is modeled as the output of a periodic signal (pitch) that excites a cascade of filters that shape the spectrum. This model is the basis of the coding algorithm, which consists of three analysis stages. In the first stage, a model of the current speech frame is derived. This model is based upon the common linear prediction method, wherein a set of parameters is derived to minimize the error between the model and the signal. In the second stage, the pitch period (or lag) is estimated. A residual signal, which is the error between the model and the real signal, is then derived. The residual signal serves as the input to the third stage, wherein an analysis-by-synthesis approach is used to select, from a given codebook of residuals, the one that best matches that residual signal. The index of the selected residual is then transmitted along with the linear prediction parameters and the pitch lag. Since both the transmitter and the receiver use an identical codebook, the receiver reconstructs the residual and uses it to excite a cascade of synthesis filters whose parameters are the linear prediction coefficients. The output of the filters is the reconstructed speech.
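The first stage and the derivation of the residual can be illustrated with a minimal Python sketch. The text above only refers to "the common linear prediction method"; the autocorrelation method with the Levinson-Durbin recursion used here is one standard way to do this and is an assumption, as are the function names and the choice of predictor order. Pitch estimation and any windowing are omitted.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of x[n] * x[n - k]."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the linear prediction normal equations.

    Returns (coeffs, err), where coeffs[k - 1] multiplies x[n - k] in the
    prediction of x[n] and err is the remaining prediction error energy.
    Assumes r[0] > 0 (the frame is not all zeros).
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= (1.0 - k * k)
    return a[1:], err


def residual(x, coeffs):
    """Error between the signal and its linear prediction (zero history)."""
    out = []
    for n in range(len(x)):
        pred = sum(c * x[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(x[n] - pred)
    return out
```

For a decaying exponential such as `x = [0.9 ** n for n in range(200)]`, a first-order predictor recovers a coefficient very close to 0.9, and the residual is essentially a single impulse at the first sample.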
The standard approach assumes that all possible excitation signals (residuals) are derived by combining two signals f_1(n) and f_2(n). Each of these signals is a linear combination of 7 basis signals, where the coefficients of the linear combination are constrained to be +1 or -1. The two signals excite the synthesis filters, resulting in an output voice signal that is intended to be the best replica of the original voice signal. Here, "best" means that no audible degradation is noticed. This is accomplished by weighting the error to be minimized with a weighting filter w(z) that takes into account the perceptual mechanism of hearing. Assuming a subframe N samples long, the general form of the error to be minimized in order to find f_1(n) and f_2(n) is: ##EQU1## and the signals q_m(n) are the basis signals V_m(n) and γ is a gain factor. In addition, the signals are decorrelated. In every subframe, the optimization of the equation for E is done twice, since two sets of basis signals are selected. Consequently, two sets of basis signals (each set consisting of 7 signals, 40 samples long) are convolved with a recursive filter h(n) having length 10. This imposes a heavy load on the processor.
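The vector-sum construction described above can be sketched as follows. This is only an illustration: the function names, the mapping of codeword bits to the +1/-1 coefficients (bit set means +1), and the representation of h(n) as a short list of impulse-response samples are all assumptions, not details taken from the standard.

```python
def codeword_to_signs(codeword, num_basis=7):
    """Map bit m of the codeword to a coefficient: 1 -> +1, 0 -> -1.

    The bit-to-sign convention is an assumption made for illustration.
    """
    return [1 if (codeword >> m) & 1 else -1 for m in range(num_basis)]


def vector_sum(basis, signs):
    """Form the excitation as the signed sum of the basis signals."""
    length = len(basis[0])
    return [sum(s * q[n] for s, q in zip(signs, basis))
            for n in range(length)]


def zero_state_filter(v, h):
    """Convolve v with impulse response h, truncated to len(v) samples."""
    return [sum(h[k] * v[n - k] for k in range(min(n, len(h) - 1) + 1))
            for n in range(len(v))]
```

Because filtering is linear, filtering each basis signal once and then forming the signed sum gives the same result as filtering every candidate excitation separately. Even so, each subframe still requires the two sets of 7 basis signals to be convolved with h(n), which is the load the paragraph above refers to. Complementing every bit of a codeword simply negates the resulting excitation.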
In order to find the optimal signal f_I(n), all 2^7 combinations of θ_m are computed and the best one is selected. Since each 7-bit word also has its own optimal gain term γ, the search procedure requires additional computational resources.
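A brute-force search of this kind can be sketched in Python as below. It is a simplified illustration under stated assumptions, not the routine from the standard: `target` is a hypothetical name for the perceptually weighted signal to be matched, `basis` is assumed to hold the already filtered basis signals, the error is taken to be a plain sum of squares, and the bit-to-sign convention (bit set means +1) is arbitrary. For a fixed candidate f, the gain minimizing the squared error is the least-squares value, the correlation of target and f divided by the energy of f.

```python
def exhaustive_search(target, basis):
    """Try every +1/-1 weighting; return (codeword, gain, error) of the best.

    target: list of samples to match (hypothetical weighted target).
    basis:  list of basis signals, each as long as target.
    """
    num_basis = len(basis)
    best = None
    for codeword in range(1 << num_basis):       # 2**7 = 128 for 7 signals
        signs = [1 if (codeword >> m) & 1 else -1 for m in range(num_basis)]
        f = [sum(s * q[n] for s, q in zip(signs, basis))
             for n in range(len(target))]
        energy = sum(x * x for x in f)
        if energy == 0:
            continue                             # no usable gain for a zero vector
        corr = sum(t * x for t, x in zip(target, f))
        gain = corr / energy                     # least-squares optimal gain
        error = sum((t - gain * x) ** 2 for t, x in zip(target, f))
        if best is None or error < best[2]:
            best = (codeword, gain, error)
    return best
```

Every candidate costs a full signed sum, an energy, and a correlation, and the whole loop is repeated for each of the two sets of basis signals in every subframe. Note that a codeword and its bitwise complement produce the same error, with gains of opposite sign, so this naive loop evaluates each distinct candidate twice.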
Therefore, it is an objective of the present invention to provide a processing apparatus and method which reduces the complexity of conventional VSELP coders while maintaining voice quality, and thus improves the processing performance of such VSELP coders.