This invention relates to speech coders.
The invention finds particular, though not exclusive, application in telecommunications systems.
According to one aspect of the invention there is provided a speech coder including an encoder for encoding an input speech signal divided into frames each consisting of a predetermined number of digital samples, the encoder including: linear predictive coding (LPC) means for analysing samples and generating at least one set of linear prediction coefficients for each frame; pitch determination means for determining at least one value of pitch for each frame, the pitch determination means including first estimation means for analysing samples using a frequency domain technique (frequency domain analysis), second estimation means for analysing samples using a time domain technique (time domain analysis) and pitch evaluation means for using the results of said frequency domain and time domain analyses to derive a said value of pitch; voicing means for defining a measure of voiced and unvoiced signals in each frame; amplitude determination means for generating amplitude information for each frame, and quantisation means for quantising said set of linear prediction coefficients, said value of pitch, said measure of voiced and unvoiced signals and said amplitude information to generate a set of quantisation indices for each frame, wherein said first estimation means generates a first measure of pitch for each of a number of candidate pitch values, the second estimation means generates a respective second measure of pitch for each of said candidate pitch values and said evaluation means combines each of at least some of the first measures with the corresponding said second measure and selects one of the candidate pitch values by reference to the resultant combinations.
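The combination of per-candidate measures described above can be illustrated with a minimal sketch. It is not the coder's actual procedure: the function names, the weighted-sum combination rule, the `weight` parameter and the frequency-domain scores are all assumptions introduced for illustration, and a normalised autocorrelation stands in for the time-domain analysis.

```python
import math

def normalised_autocorrelation(samples, lag):
    """Hypothetical time-domain measure: correlation of the frame with itself delayed by `lag`."""
    a = samples[:len(samples) - lag]
    b = samples[lag:]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0

def select_pitch(candidates, freq_measures, time_measures, weight=0.5):
    """Combine the two measures per candidate (assumed weighted sum) and pick the best candidate."""
    combined = [weight * f + (1.0 - weight) * t
                for f, t in zip(freq_measures, time_measures)]
    best = max(range(len(candidates)), key=lambda i: combined[i])
    return candidates[best]

# Synthetic frame: a sinusoid with a period of 50 samples.
frame = [math.sin(2.0 * math.pi * n / 50.0) for n in range(400)]
candidates = [25, 50, 100]
# Made-up frequency-domain scores, one per candidate (illustrative values only).
freq_measures = [0.2, 0.9, 0.4]
time_measures = [normalised_autocorrelation(frame, p) for p in candidates]
chosen = select_pitch(candidates, freq_measures, time_measures)
```

In this toy case the time-domain measure alone cannot separate a lag of 50 from its multiple 100, so the combination with the frequency-domain scores decides the selection.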
According to another aspect of the invention there is provided a speech coder including an encoder for encoding an input speech signal, the encoder comprising means for sampling the input speech signal to produce digital samples and for dividing the samples into frames each consisting of a predetermined number of samples, linear predictive coding (LPC) means for analysing samples and generating at least one set of linear prediction coefficients for each frame, pitch determination means for determining at least one value of pitch for each frame, voicing means for defining a measure of voiced and unvoiced signals in each frame, amplitude determination means for generating amplitude information for each frame, and quantisation means for quantising said set of linear prediction coefficients, said value of pitch, said measure of voiced and unvoiced signals and said amplitude information to generate a set of quantisation indices for each frame, wherein said pitch determination means includes pitch estimation means for determining an estimate of the value of pitch and pitch refinement means for deriving the value of pitch from the estimate, the pitch refinement means defining a set of candidate pitch values including fractional values distributed about said estimate of the value of pitch determined by the pitch estimation means, identifying peaks in a frequency spectrum of the frame, for each said candidate pitch value correlating said peaks with amplitudes at different harmonic frequencies ($k\omega_o$) of a frequency spectrum of the frame, where $\omega_o = \frac{2\pi}{P}$, P is a said candidate pitch value and k is an integer, and selecting as a said value of pitch the candidate pitch value giving the maximum correlation.
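A minimal sketch of such a refinement search follows, under stated assumptions. The function name `refine_pitch`, the `span` and `step` parameters, the local-maximum peak picker and the use of a mean peak amplitude as the "correlation" score are all hypothetical simplifications. The harmonic frequency $k\omega_o$ is mapped to the nearest bin of an `n_fft`-point spectrum, i.e. bin $k \cdot n\_fft / P$.

```python
def refine_pitch(magnitude, n_fft, estimate, span=1.0, step=0.25):
    """Search fractional pitch candidates around `estimate` (in samples).

    `magnitude` holds the magnitude spectrum for bins 0 .. n_fft/2 - 1.
    """
    # Keep only local maxima of the spectrum; everything else scores zero.
    peaks = [0.0] * len(magnitude)
    for b in range(1, len(magnitude) - 1):
        if magnitude[b] > magnitude[b - 1] and magnitude[b] >= magnitude[b + 1]:
            peaks[b] = magnitude[b]

    best_pitch, best_score = None, -1.0
    steps = int(round(span / step))
    for i in range(-steps, steps + 1):
        pitch = estimate + i * step
        total, count, k = 0.0, 0, 1
        while True:
            # Bin nearest to the k-th harmonic k * omega_o, omega_o = 2*pi / pitch.
            b = int(k * n_fft / pitch + 0.5)
            if b >= len(magnitude):
                break
            total += peaks[b]
            count += 1
            k += 1
        score = total / count if count else 0.0
        if score > best_score:
            best_pitch, best_score = pitch, score
    return best_pitch

# Synthetic spectrum with unit peaks at the harmonics of a 40.5-sample pitch.
spectrum = [0.0] * 256
k = 1
while int(k * 512 / 40.5 + 0.5) < 256:
    spectrum[int(k * 512 / 40.5 + 0.5)] = 1.0
    k += 1
refined = refine_pitch(spectrum, 512, 40.0)
```

Here an integer estimate of 40 is refined to the fractional value 40.5, because only that candidate places every harmonic on a spectral peak.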
According to a further aspect of the invention there is provided a speech coder including an encoder for encoding an input speech signal, the encoder comprising means for sampling the input speech signal to produce digital samples and for dividing the samples into frames, each consisting of a predetermined number of samples, linear predictive coding (LPC) means for analysing samples and generating at least one set of linear prediction coefficients for each frame, pitch determination means for determining at least one value of pitch for each frame, voicing means for determining for each frame a voicing cut-off frequency for separating a frequency spectrum from the frame into a voiced part and an unvoiced part without evaluating the voiced/unvoiced status of individual harmonic frequency bands, amplitude determination means for generating amplitude information for each frame, and quantisation means for quantising said set of coefficients, said value of pitch, said voicing cut-off frequency and said amplitude information to generate a set of quantisation indices for each frame.
According to a yet further aspect of the invention there is provided a speech coder including an encoder for encoding an input speech signal, the encoder comprising means for sampling the input speech signal to produce digital samples and for dividing the samples into frames each consisting of a predetermined number of samples, linear predictive coding (LPC) means for analysing samples and generating at least one set of linear prediction coefficients for each frame, pitch determination means for determining at least one value of pitch for each frame, voicing means for defining a measure of voiced and unvoiced signals in each frame, amplitude determination means for generating amplitude information for each frame, and quantisation means for quantising said set of prediction coefficients, said value of pitch, said measure of voiced and unvoiced signals and said amplitude information to generate a set of quantisation indices for each frame, wherein the amplitude determination means generates, for each frame, a set of spectral amplitudes for frequency bands centred on frequencies harmonically related to the value of pitch determined by the pitch determination means, and the quantisation means quantises the normalised spectral amplitudes to generate a first part of an amplitude quantisation index.
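One way to picture spectral amplitudes for bands centred on pitch harmonics is the sketch below. It is an assumption-laden illustration: the function name `harmonic_amplitudes`, the band edges at half a harmonic spacing either side of each harmonic, and the RMS magnitude as the band amplitude are not taken from the text, and the normalisation and quantisation steps are not shown.

```python
import math

def harmonic_amplitudes(magnitude, n_fft, pitch):
    """Return one amplitude per harmonic band of width n_fft / pitch bins.

    `magnitude` holds the magnitude spectrum for bins 0 .. n_fft/2 - 1 and
    `pitch` is the pitch period in samples (may be fractional).
    """
    spacing = n_fft / pitch  # distance between harmonics, in bins
    amplitudes = []
    k = 1
    while True:
        lo = int((k - 0.5) * spacing + 0.5)
        hi = int((k + 0.5) * spacing + 0.5)
        if hi > len(magnitude) or hi <= lo:
            break
        band = magnitude[lo:hi]
        # Assumed amplitude definition: RMS magnitude over the band.
        amplitudes.append(math.sqrt(sum(m * m for m in band) / len(band)))
        k += 1
    return amplitudes

# A flat spectrum should give the same amplitude in every harmonic band.
amps = harmonic_amplitudes([2.0] * 128, 256, 32.0)
```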
According to a yet further aspect of the invention there is provided a speech coder including an encoder for encoding an input speech signal, the encoder comprising means for sampling the input speech signal to produce digital samples and for dividing the samples into frames each consisting of a predetermined number of samples, linear predictive coding means for analysing samples to generate a respective set of Line Spectral Frequency (LSF) coefficients for a leading part and for a trailing part of each frame, pitch determination means for determining at least one value of pitch for each frame, voicing means for defining a measure of voiced and unvoiced signals in each frame, amplitude determination means for generating amplitude information for each frame, and quantisation means for quantising said sets of LSF coefficients, said value of pitch, said measure of voiced and unvoiced signals and said amplitude information to generate a set of quantisation indices, wherein said quantisation means defines a set of quantised LSF coefficients ($\mathrm{LSF}'_2$) for the leading part of the current frame by the expression
$\mathrm{LSF}'_2 = \alpha\,\mathrm{LSF}'_1 + (1 - \alpha)\,\mathrm{LSF}'_3,$
where $\mathrm{LSF}'_3$ and $\mathrm{LSF}'_1$ are respectively sets of quantised LSF coefficients for the trailing parts of the current frame and the frame immediately preceding the current frame, and $\alpha$ is a vector in a first vector quantisation codebook, defines each said set of quantised LSF coefficients $\mathrm{LSF}'_2$, $\mathrm{LSF}'_3$ for the leading and trailing parts respectively of the current frame as a combination of respective LSF quantisation vectors $Q_2$, $Q_3$ of a second vector quantisation codebook and respective prediction values $P_2$, $P_3$, where $P_2 = \lambda Q_1$ and $P_3 = \lambda Q_2$, $\lambda$ is a constant and $Q_1$ is a said LSF quantisation vector for the trailing part of said immediately preceding frame, and selects said vector $Q_3$ and said vector $\alpha$ from the first and second vector quantisation codebooks respectively to minimise a measure of distortion between the LSF coefficients generated by the linear predictive coding means ($\mathrm{LSF}_2$, $\mathrm{LSF}_3$) for the current frame and the corresponding quantised LSF coefficients ($\mathrm{LSF}'_2$, $\mathrm{LSF}'_3$).
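The joint selection can be sketched as an exhaustive search, with simplifying assumptions. The interpolation is applied element by element, the distortion measure is taken to be a plain squared error, and the candidate quantised trailing-part vectors are assumed to have been formed already from the second codebook and the prediction values, so the predictive step itself is not modelled. All function and variable names are hypothetical.

```python
def interpolate(alpha, lsf1_q, lsf3_q):
    """Leading-part LSFs: alpha * LSF'1 + (1 - alpha) * LSF'3, element by element."""
    return [a * x + (1.0 - a) * y for a, x, y in zip(alpha, lsf1_q, lsf3_q)]

def squared_error(u, v):
    """Assumed distortion measure: unweighted squared error."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def search(lsf2, lsf3, lsf1_q, lsf3_candidates, alpha_codebook):
    """Return (index of trailing-part candidate, index of alpha) with least total distortion."""
    best = None
    for i, lsf3_q in enumerate(lsf3_candidates):
        for j, alpha in enumerate(alpha_codebook):
            lsf2_q = interpolate(alpha, lsf1_q, lsf3_q)
            d = squared_error(lsf2, lsf2_q) + squared_error(lsf3, lsf3_q)
            if best is None or d < best[0]:
                best = (d, i, j)
    return best[1], best[2]

# Toy two-coefficient example with made-up values.
lsf1_q = [0.1, 0.2]                    # quantised trailing part, previous frame
lsf2, lsf3 = [0.15, 0.3], [0.2, 0.4]   # unquantised leading and trailing parts
lsf3_candidates = [[0.2, 0.4], [0.25, 0.35]]
alpha_codebook = [[0.5, 0.5], [0.25, 0.75]]
indices = search(lsf2, lsf3, lsf1_q, lsf3_candidates, alpha_codebook)
```

The point of the structure is that only the trailing-part vector and the interpolation vector need to be transmitted for the frame; the leading-part set follows from the interpolation.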
According to yet a further aspect of the invention there is provided a speech coder for decoding a set of quantisation indices representing LSF coefficients, pitch value, a measure of voiced and unvoiced signals and amplitude information, including processor means for deriving an excitation signal from said indices representing pitch value, measure of voiced and unvoiced signals and amplitude information, an LPC synthesis filter for filtering the excitation signal in response to said LSF coefficients, means for comparing pitch cycle energy at the LPC synthesis filter output with corresponding pitch cycle energy in the excitation signal, means for modifying the excitation signal to reduce a difference between the compared pitch cycle energies and a further LPC synthesis filter for filtering the modified excitation signal.
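The two-pass decoder structure can be illustrated roughly as follows. This is only a sketch under assumptions: the filter is driven directly by prediction coefficients rather than LSFs, the pitch period is a fixed integer, and the modification is taken to be a per-cycle gain that moves the energy at the filter output towards the energy of the original excitation. The names `lpc_synthesis` and `match_cycle_energy` are hypothetical.

```python
import math

def lpc_synthesis(excitation, coeffs):
    """All-pole filter: y[n] = x[n] - sum_k coeffs[k-1] * y[n-k], zero initial state."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out

def energy(signal):
    return sum(v * v for v in signal)

def match_cycle_energy(excitation, coeffs, pitch):
    """Scale each pitch cycle of the excitation, then filter the modified excitation again."""
    first_pass = lpc_synthesis(excitation, coeffs)
    modified = []
    for start in range(0, len(excitation), pitch):
        cycle = excitation[start:start + pitch]
        e_exc = energy(cycle)
        e_out = energy(first_pass[start:start + pitch])
        # Assumed rule: gain chosen so output energy approaches excitation energy.
        gain = math.sqrt(e_exc / e_out) if e_out > 0.0 else 1.0
        modified.extend(gain * v for v in cycle)
    return modified, lpc_synthesis(modified, coeffs)

# One pitch cycle containing a single pulse, through a one-pole filter.
excitation = [1.0] + [0.0] * 39
modified, output = match_cycle_energy(excitation, [-0.9], 40)
```

With a single cycle and zero filter memory the energies match almost exactly after the second pass; across several cycles the filter memory carries over, so the gain only reduces the difference rather than removing it.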