1. Technical Field
The present invention relates generally to speech coding; and, more particularly, it relates to target signal reference shifting within speech coding.
2. Related Art
Conventional speech coding systems tend to require relatively significant amounts of bandwidth to encode speech signals. In conventional code-excited linear prediction techniques, waveform matching among a reference signal, an input speech signal, and a re-synthesized speech signal is used as the error criterion for coding the speech signal. To provide a high perceptual quality of the re-synthesized speech signal, conventional speech coding systems require relatively significant amounts of bandwidth. Specifically, to perform good matching and thereby provide a high perceptual quality of the re-synthesized speech signal, a high bit-rate is used to encode the fractional pitch lag delay during the calculation of pitch prediction. This consumption of the available bandwidth, as necessitated to provide the high perceptual quality, is inherently costly and wasteful, and therefore very undesirable, for low bit-rate applications. The present art does not provide an adequate solution for encoding the fractional pitch lag delay during the calculation of pitch prediction within conventional speech coding systems.
As speech coding systems continue to move toward lower bit-rate applications, the traditional solution of dedicating a high amount of bandwidth to the coding of the fractional pitch lag delay will prove to be one of the limiting factors, especially of those speech coding systems employing code-excited linear prediction speech coding. The inherent speech coding performed within the code-excited linear prediction speech coding method does not afford a good opportunity to reduce the bandwidth dedicated to coding the fractional pitch lag delay while still maintaining a high perceptual quality of reproduced speech, i.e., high perceptual quality of the re-synthesized speech signal.
Traditional methods of speech coding that use a target signal (Tg) to find an adaptive codebook gain (gp) within code-excited linear prediction speech coding commonly calculate the target signal (Tg) by matching an old frame of the speech signal to a new or current frame of the speech signal. This matching gives an adaptive codebook contribution (Cp), which is subsequently combined with the contribution provided by a speech synthesis filter (H), as shown by the following relation:
Cp → CpH
Subsequently, using the calculated target signal (Tg) and the combined contribution of the adaptive codebook contribution (Cp) and the speech synthesis filter (H), namely CpH, the adaptive codebook gain (gp) is uniquely solved by the following relation:
gp ← Min (Tg − gp·CpH)²
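For illustration only, the minimization above can be sketched in a few lines of Python. The sketch assumes the squared error is summed over the samples of one frame or sub-frame, in which case the minimizing gain has the familiar least-squares closed form, namely the inner product of Tg with CpH divided by the energy of CpH. The function name, the example vectors, and the truncated-convolution model of the filter H are hypothetical and are not taken from the present application.

```python
import numpy as np

def adaptive_codebook_gain(target, filtered_contribution):
    """Return the gain g that minimizes sum((target - g * filtered_contribution)**2).

    Assumption: the error is a plain sum of squares over one (sub-)frame, so the
    minimizer is <target, y> / <y, y>, where y is the filtered contribution CpH.
    """
    tg = np.asarray(target, dtype=float)
    y = np.asarray(filtered_contribution, dtype=float)
    energy = float(np.dot(y, y))
    if energy == 0.0:
        # No energy in the filtered contribution: the gain is undefined, use 0.
        return 0.0
    return float(np.dot(tg, y) / energy)

# Hypothetical example data (not from the application).
cp = np.array([1.0, -0.5, 0.25, 0.0])          # adaptive codebook contribution Cp
h = np.array([1.0, 0.6, 0.36])                 # impulse response standing in for H
cph = np.convolve(cp, h)[:len(cp)]             # filtered contribution CpH
tg = 0.8 * cph                                 # target that is an exact scaled copy
gp = adaptive_codebook_gain(tg, cph)           # recovers the scale factor
```

Because the example target is an exact scaled copy of CpH, the recovered gain equals the scale factor up to floating-point rounding. With a real target signal the gain would instead be the best least-squares fit.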
Further limitations and disadvantages of conventional and traditional systems will become apparent to one of skill in the art through comparison of such systems with the present invention as set forth in the remainder of the present application with reference to the drawings.
Various aspects of the present invention can be found in a code-excited linear prediction speech coding system that performs target signal reference shifting during encoding of a speech signal. The code-excited linear prediction speech coding system itself contains, among other things, a speech synthesis filter, which in turn contains a linear prediction coding synthesis filter and a perceptual weighting filter. The speech synthesis filter generates a target signal during encoding of the speech signal using the linear prediction coding synthesis filter and the perceptual weighting filter. In addition, the code-excited linear prediction speech coding system generates a modified target signal using the target signal that is generated during the encoding of the speech signal, and the code-excited linear prediction speech coding system generates an encoded speech signal during the encoding of the speech signal. Also, the code-excited linear prediction speech coding system is operable to decode the encoded speech signal to generate a reproduced speech signal that is substantially perceptually indistinguishable from the speech signal prior to the encoding of the speech signal.
In certain embodiments of the invention, the code-excited linear prediction speech coding system is found within a speech codec. In some instances, the speech codec contains, among other things, encoder circuitry and decoder circuitry, and the modified target signal is generated within the encoder circuitry. If desired, the encoding of the speech signal is performed on a frame basis. Alternatively, the encoding of the speech signal is performed on a sub-frame basis. Within speech coder applications, the reproduced speech signal is generated using the modified target signal. In addition, the code-excited linear prediction speech coding system is operable within a speech signal processor. The code-excited linear prediction speech coding system is also operable within a substantially low bit-rate speech coding system.
Other aspects of the present invention can be found in a speech coding system that performs target signal reference shifting of a speech signal. The speech coding system contains, among other things, target signal calculation circuitry that generates a target signal and adaptive codebook gain calculation circuitry that generates an adaptive codebook gain. The target signal corresponds to at least one portion of the speech signal, and the adaptive codebook gain is generated using a modified version of the target signal.
Similar to the aspects of the invention found in the code-excited linear prediction speech coding system described above, the speech coding system of this particular embodiment is, in certain embodiments of the invention, found within a speech codec. When the speech codec contains encoder circuitry, the speech coding system is contained within the encoder circuitry. Also, the speech coding system is operable within a speech signal processor.
In other embodiments of the invention, the speech coding system contains a speech synthesis filter. The speech synthesis filter contains a linear prediction coding synthesis filter and a perceptual weighting filter. If desired, the at least one portion of the speech signal that is used to encode the speech signal is extracted from the speech signal on a frame basis. Alternatively, the at least one portion of the speech signal that is used to encode the speech signal is extracted from the speech signal on a sub-frame basis. The speech coding system is operable within a substantially low bit-rate speech coding system.
Other aspects of the present invention can be found in a method that is used to perform target signal reference shifting on a speech signal. The method includes, among other things, calculating a target signal, modifying the target signal to generate a modified target signal, and calculating an adaptive codebook gain using the modified target signal. The target signal corresponds to at least one portion of the speech signal.
In certain embodiments of the invention, the method is performed on the speech signal on a frame basis; alternatively, the method is performed on a sub-frame basis. The generation of the modified target signal includes maximizing a correlation between the target signal and a product of an adaptive codebook contribution and a speech synthesis filter contribution. If further desired, the correlation is normalized during its calculation. The method is operable within speech coding systems that operate using code-excited linear prediction.
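As a rough illustration of the correlation maximization described above, the following Python sketch searches a small set of candidate target signals and keeps the one whose normalized correlation with the filtered adaptive codebook contribution is largest. How the candidates are formed is an assumption made purely for this sketch: each candidate is taken from the weighted speech at a small integer shift of the analysis window. The function names, the shift range, and the test signal are hypothetical and are not specified in the present summary.

```python
import numpy as np

def normalized_correlation(tg, y):
    """Normalized correlation <tg, y> / sqrt(<tg, tg> * <y, y>); 0 if either is silent."""
    denom = np.sqrt(np.dot(tg, tg) * np.dot(y, y))
    return 0.0 if denom == 0.0 else float(np.dot(tg, y) / denom)

def best_target_shift(weighted_speech, start, length, filtered_contribution, max_shift=2):
    """Pick the shifted target window that best matches the filtered contribution.

    Assumption (illustrative only): candidate targets are windows of the weighted
    speech starting at start + shift, for integer shifts in [-max_shift, max_shift].
    Returns (shift, modified_target, correlation), or None if no window fits.
    """
    s = np.asarray(weighted_speech, dtype=float)
    y = np.asarray(filtered_contribution, dtype=float)
    best = None
    for shift in range(-max_shift, max_shift + 1):
        lo = start + shift
        if lo < 0 or lo + length > len(s):
            continue  # candidate window would fall outside the signal
        candidate = s[lo:lo + length]
        r = normalized_correlation(candidate, y)
        if best is None or r > best[2]:
            best = (shift, candidate.copy(), r)
    return best

# Hypothetical example: the filtered contribution is the speech one sample later.
signal = np.sin(0.3 * np.arange(40))
start, length = 10, 16
cph = signal[start + 1:start + 1 + length]
shift, modified_target, corr = best_target_shift(signal, start, length, cph)
```

In this example the search selects a shift of one sample, since that window coincides with the filtered contribution. The modified target found this way could then be used in place of the original target when the adaptive codebook gain is computed.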
Other aspects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.