1. Technical Field
The present invention relates generally to speech coding; and, more particularly, it relates to hybrid extraction of linear prediction coefficients as a function of frequency within speech data.
2. Related Art
Conventional speech coding systems that employ linear prediction speech coding, such as code-excited linear prediction speech coding, use methods, such as the auto-correlation method, that minimize the prediction error energy associated with the linear prediction coefficients (LPCs) generated during the encoding of a speech signal. Such conventional methods are inherently energy driven. For the typical broadband signals that are frequently present within speech coding systems, the linear prediction coefficients (LPCs) are very representative of the speech signal; for speech signals having a widely dispersed power spectral density, however, the spectral information in one portion of the speech signal is commonly under-represented by the linear prediction coefficients (LPCs) and their associated parameters. This under-representation results in undesirably poor speech quality when the speech signal is later reproduced in the speech coding system.
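For illustration only, the auto-correlation method referred to above can be sketched as follows. The sketch assumes a frame of samples held in a Python list, omits the windowing that a practical coder would typically apply, and uses the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p. The function names are hypothetical and are not taken from any particular coder.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the short-term auto-correlation of the frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the LPCs that minimize prediction error energy.

    Returns (a, err), where a = [1, a1, ..., a_order] and err is the
    residual (prediction error) energy.
    """
    if r[0] <= 0.0:
        raise ValueError("frame has no energy")
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # frame is already perfectly predicted; leave higher terms at zero
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection coefficient for stage m
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k                  # error energy shrinks at every stage
    return a, err


# Hypothetical test frame: a decaying exponential, which a first-order
# predictor models almost exactly.
frame = [0.9 ** n for n in range(200)]
lpc, residual_energy = levinson_durbin(autocorrelation(frame, 1), 1)
lpc2, _ = levinson_durbin(autocorrelation(frame, 2), 2)
```

Because the quantity being minimized is the total error energy, the spectral regions that carry most of the energy dominate the fit, which is the energy-driven behavior described above.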
Specifically, one concern for conventional speech coding systems is that when there is a large disparity between the energy levels across the frequency spectrum of the speech signal, the conventional methods of speech coding that generate a single set of linear prediction coefficients (LPCs) for the speech signal fail to provide a high perceptual quality upon subsequent reproduction of the speech signal.
Further limitations and disadvantages of conventional and traditional systems will become apparent to one of skill in the art through comparison of such systems with the present invention as set forth in the remainder of the present application with reference to the drawings.
Various aspects of the present invention can be found in a speech codec that performs linear prediction speech coding on a speech signal. The speech codec includes, among other things, an encoder circuitry and a decoder circuitry that are communicatively coupled via a communication link. The encoder circuitry receives the speech signal that is provided to the speech codec. In addition, the speech codec contains a linear prediction coefficient parameter extraction circuitry that extracts two sets of linear prediction coefficients during the coding of the speech signal and a linear prediction coefficient combination circuitry that combines the two sets of linear prediction coefficients to generate a hybrid set of linear prediction coefficients.
The linear prediction coefficient parameter extraction circuitry itself contains a high frequency speech signal processing circuitry and a low frequency speech signal processing circuitry. The high frequency speech signal processing circuitry extracts a set of linear prediction coefficients that better represents a high frequency component of the speech signal, and the low frequency speech signal processing circuitry extracts a set of linear prediction coefficients that better represents a low frequency component of the speech signal.
The linear prediction coefficient combination circuitry takes as input the two sets of linear prediction coefficients and performs an appropriate hybrid combination in order to generate a new set of linear prediction coefficients (LPCs) to be used by the speech codec. In certain embodiments of the invention, the two sets of linear prediction coefficients are first converted to the line spectral frequency (LSF) domain; a hybrid combination then takes place in the line spectral frequency (LSF) domain to obtain a combined set of line spectral frequencies (LSFs), which is converted back to the linear prediction coefficient (LPC) domain to obtain the hybrid combined set of linear prediction coefficients (LPCs). In other embodiments of the invention, the hybrid combination might take place in other parameter domains, such as reflection coefficients, auto-correlation coefficients, or even in the original speech signal domain. It is understood that proper parameter conversions back and forth and an appropriate weighting function for the combination are necessary.
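The conversion to the line spectral frequency (LSF) domain, the combination, and the conversion back can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuitry itself: the predictor order is assumed even, each input A(z) = 1 + a1 z^-1 + ... + ap z^-p is assumed minimum phase, the roots are located by a grid search with bisection (one common choice among several), and the per-coefficient weights are supplied by the caller. All function names are hypothetical.

```python
import math


def _deflated_sum_diff(a):
    """Sum and difference polynomials of A(z), trivial roots at z = -1 and z = +1 removed."""
    p = len(a) - 1
    ext = list(a) + [0.0]                               # a_0 .. a_p, a_{p+1} = 0
    P = [ext[i] + ext[p + 1 - i] for i in range(p + 2)]
    Q = [ext[i] - ext[p + 1 - i] for i in range(p + 2)]
    Pd, Qd = [P[0]], [Q[0]]
    for i in range(1, p + 1):
        Pd.append(P[i] - Pd[i - 1])                     # divide P(z) by (1 + z^-1)
        Qd.append(Q[i] + Qd[i - 1])                     # divide Q(z) by (1 - z^-1)
    return Pd, Qd


def _on_circle(c, w):
    """Symmetric even-degree polynomial at z = e^{jw}, linear-phase factor removed."""
    h = (len(c) - 1) // 2
    return c[h] + 2.0 * sum(c[i] * math.cos((h - i) * w) for i in range(h))


def _roots_in_0_pi(c, grid=512):
    """Locate zero crossings on (0, pi) by grid search followed by bisection."""
    roots = []
    w_lo, f_lo = 0.0, _on_circle(c, 0.0)
    for k in range(1, grid + 1):
        w_hi = math.pi * k / grid
        f_hi = _on_circle(c, w_hi)
        if f_hi == 0.0:
            roots.append(w_hi)
        elif f_lo * f_hi < 0.0:
            lo, hi, flo = w_lo, w_hi, f_lo
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                fm = _on_circle(c, mid)
                if flo * fm <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fm
            roots.append(0.5 * (lo + hi))
        w_lo, f_lo = w_hi, f_hi
    return roots


def lpc_to_lsf(a):
    Pd, Qd = _deflated_sum_diff(a)
    lsf = sorted(_roots_in_0_pi(Pd) + _roots_in_0_pi(Qd))
    if len(lsf) != len(a) - 1:
        raise ValueError("could not locate all LSFs; is A(z) minimum phase?")
    return lsf


def _poly_from_lsf(freqs):
    """Multiply out the factors (1 - 2 cos(w) z^-1 + z^-2)."""
    c = [1.0]
    for w in freqs:
        out = [0.0] * (len(c) + 2)
        for i, ci in enumerate(c):
            for j, fj in enumerate((1.0, -2.0 * math.cos(w), 1.0)):
                out[i + j] += ci * fj
        c = out
    return c


def lsf_to_lpc(lsf):
    p = len(lsf)
    Pd = _poly_from_lsf(lsf[0::2])                      # 1st, 3rd, ... LSFs belong to P
    Qd = _poly_from_lsf(lsf[1::2])                      # 2nd, 4th, ... LSFs belong to Q
    P = [x + y for x, y in zip(Pd + [0.0], [0.0] + Pd)]  # restore root at z = -1
    Q = [x - y for x, y in zip(Qd + [0.0], [0.0] + Qd)]  # restore root at z = +1
    return [0.5 * (P[i] + Q[i]) for i in range(p + 1)]


def combine_lpc_sets(a_low, a_high, weights):
    """Hybrid LPC set: weights[i] applies to a_low's i-th LSF, 1 - weights[i] to a_high's."""
    lsf_low, lsf_high = lpc_to_lsf(a_low), lpc_to_lsf(a_high)
    mixed = [w * lo + (1.0 - w) * hi
             for w, lo, hi in zip(weights, lsf_low, lsf_high)]
    if any(b <= a for a, b in zip(mixed, mixed[1:])):
        raise ValueError("combined LSFs are not strictly increasing")
    return lsf_to_lpc(mixed)
```

A weight vector that favors one set at the low-order LSFs and the other set at the high-order LSFs gives a frequency-dependent combination. The ordering check matters because per-coefficient weights, unlike a single scalar weight, do not by themselves guarantee an increasing LSF sequence.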
In certain embodiments of the invention, the speech codec further calculates a set of line spectral frequencies (LSFs) from the calculated linear prediction coefficients (LPCs). The line spectral frequencies are used by the linear prediction coefficient combination circuitry to perform the hybrid combination of the two sets of linear prediction coefficients. The final set of linear prediction coefficients corresponds to a hybrid combination of the sets of linear prediction coefficients. In other embodiments of the invention, the speech codec further determines speech signal spectral information from the speech signal, and this speech signal spectral information is used by the linear prediction coefficient parameter extraction circuitry to perform the combination of the two sets of linear prediction coefficients.
The linear prediction coefficient combination circuitry combines the two sets of linear prediction coefficients to generate a hybrid set of linear prediction coefficients by employing a weighted averaging to combine the two sets of linear prediction coefficients. The linear prediction coefficient parameter extraction circuitry extracts at least one additional set of linear prediction coefficients during the coding of the speech signal in certain embodiments of the invention. The linear prediction coefficient combination circuitry that combines the two sets of linear prediction coefficients to generate a hybrid set of linear prediction coefficients employs a weighted averaging to combine the two sets of linear prediction coefficients and to produce the at least one additional set of linear prediction coefficients. If desired, the entirety of the speech codec is contained within a speech signal processor.
Other aspects of the present invention can be found in a speech coding system that performs hybrid extraction of linear prediction coefficients (LPCs) during coding of a speech signal. The speech coding system itself contains, among other things, a linear prediction coefficient parameter extraction circuitry and a linear prediction coefficient combination circuitry. The linear prediction coefficient parameter extraction circuitry extracts at least two sets of linear prediction coefficients during the coding of the speech signal, and the linear prediction coefficient combination circuitry combines the at least two sets of linear prediction coefficients to generate a hybrid set of linear prediction coefficients.
In certain embodiments of the invention, the speech coding system further determines the spectral content of the speech signal after first having generated the linear prediction coefficients (LPCs), and the spectral content of the speech signal is used by the linear prediction coefficient parameter extraction circuitry to perform the combination of the sets of linear prediction coefficients (LPCs). The speech codec calculates a set of line spectral frequencies using the linear prediction coefficients (LPCs), and the line spectral frequencies are used by the linear prediction coefficient combination circuitry to perform the hybrid combination of the sets of linear prediction coefficients (LPCs). One of the at least two sets of linear prediction coefficients corresponds to a pre-emphasized component of the speech signal. If desired, the entirety of the speech coding system is contained within a speech signal processor.
In other embodiments of the invention within the speech coding system, one of the at least two sets of linear prediction coefficients corresponds to a high frequency component of the speech signal extracted using a high pass tilted filter, and the other of the at least two sets of linear prediction coefficients corresponds to a low frequency component of the speech signal extracted using a low pass tilted filter. When the speech coding system is contained within a speech codec having an encoder circuitry and a decoder circuitry, the linear prediction coefficient parameter extraction circuitry and the linear prediction coefficient combination circuitry are contained in the encoder circuitry of the speech codec.
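The form of the tilted filters is not specified here. As one hedged illustration, a first-order filter y[n] = x[n] - mu * x[n-1] tilts the spectrum toward high frequencies when mu is positive (the familiar pre-emphasis filter) and toward low frequencies when mu is negative. The value mu = 0.9, the test signal, and the function names below are assumptions made for the sketch only.

```python
import math


def tilt_filter(x, mu):
    """First-order tilt: y[n] = x[n] - mu * x[n-1], with x[-1] taken as zero.

    mu > 0 emphasizes high frequencies (high pass tilt);
    mu < 0 emphasizes low frequencies (low pass tilt).
    """
    return [x[n] - mu * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]


def spectral_tilt(x):
    """Normalized first auto-correlation r1/r0: near +1 for low-frequency-dominated
    signals, near -1 for high-frequency-dominated signals."""
    r0 = sum(v * v for v in x)
    r1 = sum(x[n] * x[n - 1] for n in range(1, len(x)))
    return r1 / r0


# Hypothetical signal: a strong low-frequency tone plus a weaker high-frequency tone.
x = [math.sin(0.2 * n) + 0.5 * math.sin(2.5 * n) for n in range(400)]
high_branch = tilt_filter(x, 0.9)    # input for the high frequency LPC set
low_branch = tilt_filter(x, -0.9)    # input for the low frequency LPC set
```

Each branch would then be passed to the linear prediction analysis, so that one set of coefficients is fitted to a signal whose high frequency content has been raised and the other to a signal whose low frequency content has been raised.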
Other aspects of the present invention can be found in a method that performs hybrid extraction of linear prediction coefficients from a speech signal. The method involves calculating a first and a second set of linear prediction coefficients from the speech signal, and combining the first set of linear prediction coefficients and the second set of linear prediction coefficients to generate a hybrid set of linear prediction coefficients.
In certain embodiments of the invention, the method further includes calculating at least one additional set of linear prediction coefficients from the speech signal, and combining the first set of linear prediction coefficients and the second set of linear prediction coefficients with the at least one additional set of linear prediction coefficients to generate a hybrid set of linear prediction coefficients. In addition, the method includes calculating a first set and a second set of line spectral frequencies using the linear prediction coefficients (LPCs) that are generated from the speech signal. For example, the first set of line spectral frequencies is calculated using the first set of linear prediction coefficients (LPCs), and the second set of line spectral frequencies is calculated using the second set of linear prediction coefficients (LPCs). Also, when combining the first set of linear prediction coefficients (LPCs) and the second set of linear prediction coefficients (LPCs) to generate a hybrid set of linear prediction coefficients (LPCs), a weighted filter is applied to the first set of linear prediction coefficients (LPCs) and the second set of linear prediction coefficients (LPCs).
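Where more than two sets are combined, the weighting can be illustrated with a per-coefficient weighted average over any number of parameter sets. The sketch below assumes that the sets have already been converted to a common domain such as line spectral frequencies, that all sets have the same length, and that the weights are non-negative; the function name and the example values are hypothetical.

```python
def weighted_average_sets(sets, weights):
    """Combine K parameter sets of equal length p with per-coefficient weights.

    sets:    K lists of p parameters (for example, K sets of LSFs)
    weights: K lists of p non-negative weights; they are normalized per
             coefficient, so they need not sum to one beforehand
    """
    if not sets or len(sets) != len(weights):
        raise ValueError("need one weight list per parameter set")
    p = len(sets[0])
    if any(len(s) != p for s in sets) or any(len(w) != p for w in weights):
        raise ValueError("all sets and weight lists must have the same length")
    combined = []
    for i in range(p):
        total = sum(w[i] for w in weights)
        if total <= 0.0:
            raise ValueError("weights for a coefficient must not all be zero")
        combined.append(sum(w[i] * s[i] for w, s in zip(weights, sets)) / total)
    return combined


# Three hypothetical two-element sets: the first coefficient is drawn from
# sets 0 and 1, the second coefficient from sets 1 and 2.
hybrid = weighted_average_sets(
    [[0.2, 1.0], [0.4, 1.2], [0.6, 1.4]],
    [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]],
)
```

Normalizing per coefficient keeps each combined value within the range spanned by the contributing sets.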