An important aspect in wireless communications and cellular mobile radio is spectral efficiency, i.e., the user density of the allocated spectrum. Several factors play a role in determining the system's spectral efficiency, including cell size, method of multiple access, and modulation technique. As speech transmissions represent the most-used form of communications, the bit rate of the speech codec plays a significant role in determining the system's spectral efficiency. Therefore, the need for a low bit rate speech codec is of great importance, particularly when considering future generations of personal communications systems (PCS).
Selecting a speech codec for PCS is not a trivial task, since most existing low bit rate speech coders are highly complex and require computational capabilities in mobile stations that can present a significant drain on power. Advances in speech coding algorithms and low-power integrated circuits have provided some improvement at the cost of speech quality. However, performance issues remain in the presence of heavy background noise, such as noise from a car or a crowd, or nonspeech sounds, such as music. With the increased usage of wireless communications systems, the demands of wireless subscribers for speech quality comparable to that of land-based networks have similarly increased. In addition, speech coders must be robust, able to withstand high bit-error rates and burst errors without becoming unstable or subjecting the user to annoying effects. In radio channels, occasional long error bursts occur during deep fades, resulting in correlated speech frame erasures. The codec should be able to estimate the lost speech frames with minimal loss in speech quality. This is particularly important in PCS systems, where the percentage of frame erasures is a measured system parameter. The ability of the codec to tolerate higher frame erasure rates has a significant impact on the efficiency of such systems.
Code excited linear predictive (CELP) coding has been extensively investigated as a promising algorithm to provide good quality at low bit rates. CELP coding is based on vector quantization and the fact that positions on the spectral "grid" of speech are redundant. The most likely positions on the grid are represented by a vector, and all of the vectors are stored in a codebook at both the analyzer and synthesizer. In accordance with this method, the speech signal is sampled and converted into successive blocks of a predetermined number of samples. Each block of samples is synthesized by filtering an appropriate innovation sequence from the codebook, scaled by a gain factor, through two filters having time-varying transfer functions. The first filter is a Long Term Predictor (LTP) filter, or pitch filter, which models the pseudo-periodicity of speech due to pitch. The second filter is a Short Term Predictor (STP) filter, which models the spectral characteristics of the speech signal. The encoding procedure used to determine the pitch and excitation codebook parameters is an Analysis-by-Synthesis (AbS) technique. AbS codecs work by splitting the speech to be coded into frames, typically about 20 ms long. For each frame, parameters are determined for a synthesis filter, and then the excitation for this filter is determined. This is done by finding the excitation signal which, when passed through the given synthesis filter, minimizes the error between the input speech and the reconstructed speech. The synthetic output is computed for all candidate innovation sequences from the codebook. The retained codeword is the one whose synthetic output has the lowest error relative to the original speech signal according to a perceptually weighted distortion measure. The index of this codeword is then transmitted to the receiver as part of the encoded speech signal, along with a gain term.
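The analysis-by-synthesis search described above can be illustrated with a minimal sketch. It is not taken from any of the cited codecs: the function names `synthesize` and `search_codebook` are hypothetical, the long-term predictor and the perceptual weighting are omitted, and a plain squared error stands in for the weighted distortion measure. Each candidate innovation sequence is filtered through the synthesis filter (represented here by its impulse response `h`), the best gain for that candidate is computed in closed form, and the candidate with the lowest error is retained.

```python
def synthesize(codeword, h):
    """Zero-state filtering: convolve the codeword with the impulse
    response h, truncated to the block length."""
    n = len(codeword)
    return [
        sum(codeword[k] * h[i - k] for k in range(i + 1) if i - k < len(h))
        for i in range(n)
    ]


def search_codebook(target, codebook, h):
    """Exhaustive analysis-by-synthesis search (illustrative only).

    Returns (index, gain, error) of the codeword whose scaled synthetic
    output is closest to the target in the squared-error sense.
    """
    best_index, best_gain, best_err = -1, 0.0, float("inf")
    for index, codeword in enumerate(codebook):
        y = synthesize(codeword, h)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero output cannot match any target
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy  # least-squares optimal gain for this codeword
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if err < best_err:
            best_index, best_gain, best_err = index, gain, err
    return best_index, best_gain, best_err


if __name__ == "__main__":
    h = [1.0, 0.5]                       # toy synthesis-filter impulse response
    codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    target = [0.0, 2.0, 1.0, 0.0]        # equals 2 * synthesize(codebook[1], h)
    print(search_codebook(target, codebook, h))
```

Because every candidate is filtered and compared, the cost of this exhaustive form grows with both the codebook size and the block length, which is the computational burden that structured codebooks are intended to reduce.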
Typically, the CELP codebook searches are computationally intensive and require a significant amount of memory storage capacity. This problem is particularly troublesome in wideband applications where larger frame sizes and, thus, larger codebooks, are needed.
There are a number of variations on CELP techniques, each providing a different algorithm for establishing a pre-defined structure directed toward reducing the number of computations required for the codebook search process. One such CELP method, Algebraic CELP (ACELP), uses a sparse algebraic code and a focused search approach in order to reduce the number of computational steps. This technique is described by J-P. Adoul and C. LaFlamme in U.S. Pat. No. 5,444,816 and is further detailed in an article co-authored by the same inventors entitled "A Toll Quality 8Kb/s Speech Codec for the Personal Communications System (PCS)", IEEE Trans. On Veh. Tech., Vol. 43, No. 3, August 1994, pp. 808-816. Both disclosures are incorporated herein by reference.
Variations of ACELP codecs of the Enhanced Full Rate (EFR)-ACELP type have been adopted for use in PCS and GSM networks. One such codec is described in ANSI J-STD 007 Air Interface Volume 3, "Enhanced Full Rate Codec". Another ACELP codec is described in Telecommunications Industry Association/Electronics Industries Association Interim Standard 641 (TIA/EIA/IS-641), "TDMA Cellular/PCS--Radio Interface--Enhanced Full-Rate Speech Codec". A low-level description of the PCS-1900 enhanced GSM full-rate ACELP (EFR-ACELP) codec operating at 13 kb/s is provided in a Draft Recommendation dated April 1995 (Version 1.1), which has been distributed to the industry for comment and voting. Both standards and the Draft Recommendation are incorporated herein by reference.
In the EFR-ACELP codec, the codebook is in the form of matrices containing the correlation coefficients, i.e., the indices of codewords, for synthesizing the speech vectors to obtain the excitation. The size of the matrix is determined by the length of the vectors stored therein. In the wideband applications of PCS, the weighted synthesis filter impulse response and the sample sign are each length-40 vectors, which results in a 40×40 autocorrelation matrix. The correlation coefficients are computed recursively, starting at the lower right corner of the matrix (39,39) and proceeding along the diagonals. This matrix, which is symmetric about its main diagonal, represents one of the largest dynamic variables in an EFR-ACELP codec implementation. While the full matrix enables simple access to individual elements, it uses a significant amount of memory (1600 words) in devices where memory space on the digital signal processor (DSP) is limited. Alternative storage schemes, such as storing only one-half of the matrix, would require complex addressing schemes to access individual elements of the matrix.
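The trade-off described above can be made concrete with a short sketch. It rests on assumptions not stated in the text: the coefficient at (i, j) is taken to be the usual impulse-response correlation, the sum of h[n-i]*h[n-j] for n from max(i, j) to 39, the sign weighting is left out, and the names `autocorr_matrix`, `pack_upper` and `packed_index` are hypothetical. Under that definition each diagonal can be filled recursively from its lower-right end, one product per element, and the packed half-matrix shows the kind of index arithmetic that every element access would then require.

```python
L = 40  # vector length stated in the text


def autocorr_matrix(h):
    """Fill the full symmetric matrix diagonal by diagonal, starting
    from the lower-right end of each diagonal and accumulating one
    product per step (assumed definition, sign weighting omitted)."""
    n = len(h)
    phi = [[0.0] * n for _ in range(n)]
    for d in range(n):                 # d = j - i, offset from the main diagonal
        acc = 0.0
        for k in range(n - d):         # walk toward the upper-left corner
            i = n - 1 - d - k
            j = n - 1 - k
            acc += h[k] * h[k + d]     # recursive update along the diagonal
            phi[i][j] = acc
            phi[j][i] = acc            # symmetry about the main diagonal
    return phi


def pack_upper(phi):
    """Store only the upper triangle, row by row, in a flat list."""
    n = len(phi)
    return [phi[i][j] for i in range(n) for j in range(i, n)]


def packed_index(i, j, n):
    """Address of element (i, j) inside the packed upper triangle."""
    if i > j:
        i, j = j, i                    # use symmetry to stay in the upper half
    return i * n - i * (i - 1) // 2 + (j - i)


if __name__ == "__main__":
    h = [0.9 ** k for k in range(L)]   # toy impulse response, for illustration
    phi = autocorr_matrix(h)
    packed = pack_upper(phi)
    print(L * L, len(packed))          # 1600 words versus 820 words
    print(packed[packed_index(12, 5, L)] == phi[12][5])
```

With the full matrix, an element is read as `phi[i][j]` directly. With the packed form, storage drops from 1600 to 820 words, but each access costs a comparison, a multiplication and several additions, which illustrates why half-matrix storage is described as needing a more complex addressing scheme.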
Accordingly, a need remains for an effective implementation of EFR-ACELP that retains the advantageous search capabilities of established ACELP techniques while reducing demands on the storage capacity of the DSP that performs the encoding/decoding. The invention described herein addresses this need.