The present application relates to bandwidth reduction of speech signals and, more particularly, to a residual-excited linear predictive vocoder in which a novel method for pitch-aligned regeneration of high-frequency signal portions reduces the overall level of speech quality defects in the reconstituted speech signal.
Present-day radio communication requires that minimum bandwidth be utilized for signal transmission. In the transmission of human speech signals, bandwidth compression by digital encoding and decoding often utilizes linear predictive coding (LPC) of speech. One desirable form of the LPC vocoder is the residual-excited type. This residual-excited linear-predictive-coding (RELP) vocoder often suffers from a variety of speech quality defects, with perhaps the most noticeable problem resulting from tonal noises due to the misalignment of pitch harmonics during high frequency regeneration (HFR) in the receiver-decoder. The HFR problem in RELP vocoders has been widely discussed in the literature, and many proposed solutions, spanning a large range of complexity, have been identified. Simple HFR techniques include: (1) spectral folding, or up-sampling, in which the baseband is periodically duplicated in frequency to produce a total of P copies, where P is the integer decimation ratio, and which is relatively easy to implement, as only simple up-sampling, without an interpolation filter, is required; or (2) instantaneous non-linearities, as produced, for example, by rectification and the like. Because of the simple folding aspect of the spectral folding method, the apparent pitch "harmonics" of reconstituted voiced speech do not necessarily fall in a normal harmonic sequence, so that spectral lines and holes appear at improper frequencies and produce annoying tonal noises; this effect is perhaps most pronounced for female speakers. The non-linearity methods, while producing correctly-aligned pitch harmonics, add a somewhat harsh and rough quality to the speech. Both methods result in the greatest quality degradation for voiced speech.
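The harmonic misalignment produced by spectral folding can be illustrated numerically. The following is a minimal sketch only, not the method of the present application: it synthesizes a baseband of pitch harmonics, up-samples it by zero insertion with no interpolation filter, and lists the resulting spectral lines. The pitch of 210 Hz, the baseband sampling rate of 2000 Hz, the ratio P = 4, and all function names are assumptions chosen for illustration, so that every spectral line falls exactly on a discrete Fourier transform bin.

```python
import cmath
import math


def synth_baseband(f0, fs_base, n, harmonics):
    """Sum of unit-amplitude cosines at the given multiples of f0, sampled at fs_base."""
    return [
        sum(math.cos(2 * math.pi * h * f0 * i / fs_base) for h in harmonics)
        for i in range(n)
    ]


def zero_insert_upsample(x, p):
    """Spectral folding: place p - 1 zeros after each sample, with no interpolation filter."""
    y = [0.0] * (len(x) * p)
    y[::p] = x
    return y


def peak_frequencies(signal, fs, rel_threshold=0.5):
    """Frequencies (Hz), up to fs / 2, of DFT bins whose magnitude exceeds rel_threshold * max."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(signal))
        mags.append(abs(acc))
    top = max(mags)
    return [k * fs / n for k, m in enumerate(mags) if m > rel_threshold * top]


def misaligned_lines(peaks, f0, band_edge, tol=1e-6):
    """Spectral lines above band_edge that are not integer multiples of f0."""
    return [f for f in peaks if f > band_edge and abs(f / f0 - round(f / f0)) > tol]


if __name__ == "__main__":
    # Illustrative (assumed) values: 210 Hz pitch, 2000 Hz baseband rate, P = 4.
    F0, FS_BASE, P = 210.0, 2000.0, 4
    base = synth_baseband(F0, FS_BASE, 200, harmonics=(1, 2, 3, 4))
    full = zero_insert_upsample(base, P)
    peaks = peak_frequencies(full, FS_BASE * P)
    print("spectral lines (Hz):", peaks)
    print("off the pitch-harmonic grid:", misaligned_lines(peaks, F0, FS_BASE / 2))
```

Under these assumed values, the baseband lines at 210, 420, 630 and 840 Hz are mirrored about 1000 Hz, so that the first regenerated lines fall at 1160, 1370, 1580 and 1790 Hz instead of at the true harmonics of 1050, 1260, 1470 and 1680 Hz. The regenerated lines would align only if the baseband sampling rate happened to be an integer multiple of the pitch frequency.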
Of the more complex schemes which have hitherto been designed to alleviate the HFR problem, typical examples are the use of: fast Fourier transformation and pitch detection to transmit a variable-width baseband in order to produce aligned pitch harmonics; fast Fourier transformation and subsequent computation of correlation coefficients between the baseband and the high frequency bands for proper high frequency regeneration; or full-band pitch prediction, to effectively remove the pitch information before decimation and to restore the pitch information after up-sampling. These, and other, relatively complex methods provide very good recovered speech quality, but require relatively large amounts of digital signal processing speed, memory and other resources, which precludes implementation in a single digital signal processor (DSP) integrated circuit, such as the NEC 7720 or the TI TMS320 integrated circuits and the like. It is therefore highly desirable to provide a relatively low complexity method for providing a true alignment solution to the high frequency regeneration (HFR) problem, which HFR method can be implemented in a single DSP integrated circuit, preferably in the receiver stage, and preferably without requiring a change in either the vocoder transmitter stage or the bit rate overhead.