A fundamental issue in the wireless transmission of digitised speech signals is the minimisation of the bit-rate required to transmit an individual speech signal. By minimising the bit-rate, the number of communications which can be carried by a transmission channel, for a given channel bandwidth, is increased. All of the recognised standards for digital cellular telephony therefore specify some kind of speech codec to compress speech data to a greater or lesser extent. More particularly, these speech codecs rely upon the removal of redundant information present in the speech signal being coded.
In Europe, the accepted standard for digital cellular telephony is known under the acronym GSM (Global System for Mobile communications). GSM includes the specification of a CELP speech encoder (Technical Specification GSM 06.60). A very general illustration of the structure of a CELP encoder is shown in FIG. 1. A sampled speech signal is divided into 20 ms frames, defined by a vector x(j), of 160 sample points, j=0 to 159. The frames are encoded in turn by first applying them to a linear predictive coder (LPC) 1 which generates for each frame x(j) a set of LPC coefficients a(i), i=0 to n, which are representative of the short term redundancy in the frame. In GSM, n is predefined as ten.
The output from the LPC comprises this set of LPC coefficients a(i) and a residual signal r(j) produced by removing the short term redundancy from the input speech frame using an LPC analysis filter. The residual signal is then provided to a long term predictor (LTP) 2 which generates a set of LTP parameters b which are representative of the long term redundancy in the residual signal. In practice, long term prediction is a two stage process, involving a first open loop estimate of the LTP parameters and a second closed loop refinement of the estimated parameters.
An excitation codebook 3 is provided which contains a large number of excitation codes. For each frame, each of these codes is provided in turn, via a scaling unit 4, to an LTP synthesis filter 5. This filter 5 receives the LTP parameters from the LTP 2 and introduces into the code the long term redundancy predicted by the LTP parameters. The resulting frame is then provided to an LPC synthesis filter 6 which receives the LPC coefficients and introduces the predicted short term redundancy into the code. The predicted frame x_pred(j) is compared with the actual frame x(j) at a comparator 7, to generate an error signal e(j) for the frame. The code c(j) which produces the smallest error signal, after processing by a weighting filter 8, is selected by a codebook search unit 9. A vector u(j) identifying the selected code is transmitted over the transmission channel 10 to the receiver. The LPC coefficients and the LTP parameters are also transmitted but, prior to transmission, they themselves are encoded to minimise still further the transmission bit-rate.
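The search loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the GSM 06.60 procedure: the LTP synthesis filter 5 and the weighting filter 8 are omitted, the gain applied by the scaling unit 4 is assumed to be the least-squares optimum for each code, filter memory from earlier frames is taken as zero, and the names `synthesise` and `search_codebook` are hypothetical.

```python
def synthesise(code, a):
    """Pass an excitation code through the LPC synthesis filter 1/A(z).

    a = [1, a(1), ..., a(n)]; samples before the frame are assumed to be zero.
    """
    out = []
    for j, c in enumerate(code):
        acc = c
        for i in range(1, min(j, len(a) - 1) + 1):
            acc -= a[i] * out[j - i]
        out.append(acc)
    return out


def search_codebook(target, codebook, a):
    """Return (index, gain, squared_error) of the code that best matches target.

    Assumption: the gain is chosen per code by least squares, and the error
    is the plain (unweighted) squared error.
    """
    best = None
    for index, code in enumerate(codebook):
        y = synthesise(code, a)
        energy = sum(v * v for v in y)
        gain = sum(t * v for t, v in zip(target, y)) / energy if energy > 0 else 0.0
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (index, gain, err)
    return best
```

Only the returned index (together with the quantised gain) would need to be transmitted, since the receiver is assumed to hold the same codebook.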
The LPC analysis filter (which removes redundancy from the input signal to provide the residual signal r(j)) is shown schematically in FIG. 2. The input code c(j) (as modified by the LTP synthesis filter) is combined with delayed versions of itself c(j-i), the LPC coefficients a(i) providing the gain factors for respective delayed versions and with a(0)=1. The filter can be defined by the expression:
$$A(z) = 1 + a(1)z^{-1} + \dots + a(n)z^{-n}$$
where z^{-1} represents a delay of one sample.
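Read as a difference equation, A(z) maps an input frame x(j) to the residual r(j) = a(0)x(j) + a(1)x(j-1) + ... + a(n)x(j-n). A minimal sketch of that filter follows, assuming that samples before the start of the frame are zero (a practical coder would carry filter memory across frames); the function name `lpc_analysis` is hypothetical.

```python
def lpc_analysis(x, a):
    """Apply the analysis filter A(z) to frame x.

    a = [a(0), a(1), ..., a(n)] with a(0) = 1.
    Samples before the start of the frame are assumed to be zero.
    """
    n = len(a) - 1
    return [
        sum(a[i] * x[j - i] for i in range(min(j, n) + 1))
        for j in range(len(x))
    ]
```

For example, with a = [1.0, -0.5] the frame [1.0, 0.5, 0.25, 0.125], in which each sample is half the previous one, reduces to the residual [1.0, 0.0, 0.0, 0.0]: the short term redundancy has been removed.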
The LPC coefficients are converted into a corresponding number of line spectral pair (LSP) coefficients, which are the roots of the two polynomials given by:
$$P(z) = A(z) + z^{-(n+1)} A(z^{-1})$$
and
$$Q(z) = A(z) - z^{-(n+1)} A(z^{-1})$$
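As an illustration of this conversion, the sketch below builds the coefficient lists of P(z) and Q(z) from a(i) and then locates their roots as angles in the open interval (0, pi). It assumes that A(z) is minimum phase, so that the non-trivial roots lie on the unit circle, and it ignores the trivial roots at z = 1 and z = -1. The grid scan with bisection is a simple stand-in for a production root finder and could miss very closely spaced roots; all function names are hypothetical.

```python
import cmath
import math


def lsp_polynomials(a):
    """Return coefficient lists of P(z) and Q(z) in powers of z^-1.

    a = [1, a(1), ..., a(n)]; both results have n + 2 entries.
    """
    ext = list(a) + [0.0]            # A(z), padded to degree n + 1
    rev = [0.0] + list(reversed(a))  # z^-(n+1) * A(z^-1)
    p = [e + r for e, r in zip(ext, rev)]
    q = [e - r for e, r in zip(ext, rev)]
    return p, q


def _on_circle(coeffs, w, antisymmetric):
    """Real-valued form of the polynomial at z = exp(j*w).

    Multiplying by exp(j*w*m/2) makes a symmetric polynomial purely real
    and an antisymmetric one purely imaginary.
    """
    m = len(coeffs) - 1
    v = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(coeffs))
    v *= cmath.exp(1j * w * m / 2)
    return v.imag if antisymmetric else v.real


def lsp_frequencies(a, grid=512, iters=50):
    """Return the LSP angles (radians, ascending) for a = [1, a(1), ..., a(n)]."""
    p, q = lsp_polynomials(a)
    freqs = []
    for coeffs, anti in ((p, False), (q, True)):
        prev_w = math.pi / grid
        prev_v = _on_circle(coeffs, prev_w, anti)
        for k in range(2, grid):
            w = math.pi * k / grid
            v = _on_circle(coeffs, w, anti)
            if prev_v * v < 0.0:
                # Sign change between grid points: refine by bisection.
                lo, hi, f_lo = prev_w, w, prev_v
                for _ in range(iters):
                    mid = 0.5 * (lo + hi)
                    f_mid = _on_circle(coeffs, mid, anti)
                    if f_lo * f_mid <= 0.0:
                        hi = mid
                    else:
                        lo, f_lo = mid, f_mid
                freqs.append(0.5 * (lo + hi))
            prev_w, prev_v = w, v
    return sorted(freqs)
```

For a second order example A(z) = 1 - z^{-1} + 0.5z^{-2}, this yields two angles, arccos(0.75) from P(z) and arccos(0.25) from Q(z), so that n LPC coefficients give n LSP coefficients.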
Typically, the LSP coefficients of the current frame are quantised using moving average (MA) predictive quantisation. A predetermined average set of LSP coefficients is subtracted from the current frame LSP coefficients to give a set of mean-removed LSP coefficients. The LSP coefficients of the preceding frame are multiplied by respective (previously determined) prediction factors to provide a set of predicted LSP coefficients. A set of residual LSP coefficients is then obtained by subtracting the mean-removed LSP coefficients from the predicted LSP coefficients. The LSP coefficients tend to vary little from frame to frame, as compared to the LPC coefficients, and the resulting set of residual coefficients lends itself well to subsequent quantisation ("Efficient Vector Quantization of LPC Parameters at 24 Bits/Frame", K. K. Paliwal and B. S. Atal, IEEE Trans. Speech and Audio Processing, Vol 1, No 1, January 1993).
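The residual computation described above can be sketched as follows. The sketch follows the subtraction order stated in the text (predicted minus mean-removed) and takes the preceding frame's LSP coefficients exactly as supplied by the caller; the function name `lsp_residual` and the sample values are hypothetical. The quantisation of the residual itself is not shown.

```python
def lsp_residual(current, previous, mean, factors):
    """Residual LSP coefficients for predictive quantisation.

    current  : LSP coefficients of the current frame
    previous : LSP coefficients of the preceding frame
    mean     : predetermined average set of LSP coefficients
    factors  : predetermined prediction factors, one per coefficient
    """
    if not (len(current) == len(previous) == len(mean) == len(factors)):
        # The element-wise scheme is only defined for a fixed order.
        raise ValueError("all coefficient sets must have the same order")
    mean_removed = [c - m for c, m in zip(current, mean)]
    predicted = [f * p for f, p in zip(factors, previous)]
    return [p - z for p, z in zip(predicted, mean_removed)]
```

Because every step operates element by element, the scheme presupposes that the current frame, the preceding frame, the average set and the prediction factors all contain the same number of coefficients.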
The number of LPC coefficients (and consequently the number of LSP coefficients) determines the accuracy of the LPC. However, for any given frame, there exists an optimal number of LPC coefficients which is a trade-off between encoding accuracy and compression ratio. As already noted, in the current GSM standard, the order of the LPC is fixed at n=10, a number which is high enough to encode all expected speech frames with sufficient accuracy. Whilst this simplifies the LPC, reducing computational requirements, it does result in the "over-coding" of many frames which could be coded with fewer LPC coefficients than are specified by this fixed rate.
Variable rate LPCs have been proposed, where the number of LPC coefficients varies from frame to frame, being optimised individually for each frame. Variable rate LPCs are ideally suited to CDMA networks, the proposed GSM phase 2 standard, and the future third generation standard (UMTS). These networks use, or propose the use of, "packet switched" transmission to transfer data in packets (or bursts). This compares to the existing GSM standard which uses "circuit switched" transmission, where a sequence of fixed length time frames is reserved on a given channel for the duration of a telephone call.
Despite the advantages, a number of technical problems must be overcome before a variable rate LPC can be satisfactorily implemented. In particular, and as has been recognised by the inventors of the invention to be described below, a variable rate LPC is incompatible with the LSP coefficient quantisation scheme described above. That is to say that it is not possible to directly generate a predictive, quantised LSP coefficient signal when the number of LSP coefficients is varying from frame to frame. Furthermore, it is not possible to interpolate LPC (or LSP) coefficients between frames in order to smooth the transition between frame boundaries.