A voice signal is a sound signal emitted by a human vocal tract.
A codec is a hardware and/or software device for coding and decoding a digital stream. Its coding function transcodes a digital stream of quantized samples of a source signal (a voice signal) in the time domain into a compressed digital stream. Its decoding function performs an approximately inverse operation, with the objective of restoring attributes representative of the source signal, for example attributes perceptible to a receiver such as the human ear.
A voice data stream is a data stream generated by a voice codec when coding a voice signal. A transparent data stream is a binary digital sequence of unspecified content type (computer data or voice data). The data is referred to as “transparent” in the sense that, from an external point of view, all the bits are of equal importance, for example with regard to the correction of transmission errors, so that error-correcting coding must be uniform for all the bits. Conversely, if the stream is known to be a stream of voice bits, some bits are more important to protect than others.
A voice (or speech) codec, also referred to as a vocoder, is a dedicated codec adapted to code a quantized voice signal and to decode a stream of voice frames. In particular, its coding function has a sensitivity that depends on the characteristics of the voice of the speaker and a low bit rate associated with a frequency band that is narrower than the general audio frequency band (20 Hz-20 kHz).
There are several families of voice coding techniques, including techniques for coding the waveform of the voice signal (for example ITU-T G.711 PCM A/μ law coding), source model coding techniques, of which code-excited linear prediction (CELP) coding is the best known, perceptual coding, and hybrid techniques based on combining techniques belonging to two or more of the above families.
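As an illustration of the waveform coding family mentioned above, the following minimal Python sketch applies the continuous μ-law companding curve (μ = 255) and then quantizes uniformly to 8 bits. It is an approximation for explanatory purposes only: G.711 itself specifies a segmented, bit-exact encoding, which this sketch does not reproduce, and the function names are hypothetical.

```python
import math

MU = 255  # companding constant of the mu-law variant


def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto the companded range [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def encode_8bit(x: float) -> int:
    """Compress, then quantize uniformly to a code in 0..255 (illustrative only)."""
    y = mu_law_compress(x)
    return int(round((y + 1) / 2 * 255))


def decode_8bit(code: int) -> float:
    """Map an 8-bit code back to an approximate sample value."""
    y = code / 255 * 2 - 1
    return mu_law_expand(y)
```

Because the curve is logarithmic, small amplitudes receive finer quantization steps than large ones, which is why this kind of coder preserves the waveform without modelling the voice source.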
The invention aims to apply source model coding techniques, which are also known as parametric coding techniques. These techniques are based on the representation of one or more of the following: excitation parameters of the voice source; parameters describing the spectral envelope of the signal emitted by the speaker (for example a linear prediction coding model exploiting the correlation between consecutive values of parameters associated with a synthesis filter, or a cepstral model); and sound parameters depending on the source, for example the amplitude and the perceived fundamental center frequency (“pitch”), the pitch period, the amplitude of the energy peaks of the first harmonics of a pitch frequency at different intervals, the voicing rate, the melodic qualities, and the stringing characteristics of the voice.
A parametric vocoder uses digital voice coding employing a parametric model of the voice source. In practice, a parametric vocoder associates a plurality of parameters with each frame of the voice stream: firstly, spectrum parameters such as linear prediction (LP) parameters, also known as LP coefficients or linear prediction coding (LPC) coefficients, which define a linear prediction filter of the vocoder (short-term filter); secondly, adaptive excitation parameters associated with one or more adaptive excitation vectors, which are also known as long-term prediction (LTP) parameters or adaptive prediction coefficients, and which define a long-term filter in the form of a first excitation vector and an associated gain to be applied at the input of the synthesis filter; and thirdly, fixed excitation parameters associated with one or more fixed excitation vectors, which are also known as algebraic parameters or stochastic parameters, and which define a second excitation vector and an associated gain to be applied at the input of the synthesis filter.
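The way these three groups of parameters combine at the decoder can be sketched as follows. This is a simplified illustration, not the synthesis procedure of any particular vocoder: the function name and argument names are hypothetical, the excitation vectors are passed in directly rather than looked up from codebooks, and the short-term filter is assumed to follow the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, so that the synthesis filter is 1/A(z).

```python
def synthesize_frame(lp_coeffs, adaptive_vec, adaptive_gain,
                     fixed_vec, fixed_gain, history=None):
    """Rebuild one frame from its parameters (illustrative sketch).

    lp_coeffs    -- a_1..a_p of the assumed short-term filter A(z)
    adaptive_vec -- first (adaptive, long-term) excitation vector
    fixed_vec    -- second (fixed) excitation vector
    history      -- last p output samples of the previous frame, oldest first
    """
    order = len(lp_coeffs)
    past = list(history) if history is not None else [0.0] * order
    out = []
    for va, vf in zip(adaptive_vec, fixed_vec):
        # Input of the synthesis filter: gain-scaled sum of both vectors.
        excitation = adaptive_gain * va + fixed_gain * vf
        # All-pole synthesis: s[n] = e[n] - sum_k a_k * s[n-k].
        prediction = sum(a * past[-k] for k, a in enumerate(lp_coeffs, start=1))
        sample = excitation - prediction
        out.append(sample)
        past.append(sample)
    return out
```

Setting either gain to zero removes the contribution of the corresponding excitation vector from the output, whatever value its index takes.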
The document EP-A-1 020 848 discloses a method of transmitting auxiliary information in a main information stream corresponding to a voice signal, said auxiliary information being inserted, in a CELP vocoder that codes the voice signal, in place of the index of the adaptive excitation vector and/or the index of the fixed excitation vector. To be more precise, the auxiliary information bits are inserted in the vocoder of the sender in place of the bits normally coding the corresponding index, and the value of the associated gain is set to zero in order to advise the vocoder of the receiver of this substitution.
One drawback of that method is that the insertion of the auxiliary information stream into the main information stream is not discreet: it is sufficient to note the zero value of the gain to know that the bits normally allocated to coding the associated index in fact contain auxiliary information. This is a drawback when the method is used in a system in which transmission confidentiality is important.
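The prior-art substitution and the reason it is easy to detect can be sketched in a few lines of Python. The frame layout, field names, and the 7-bit index width below are hypothetical and are not taken from EP-A-1 020 848; the sketch also assumes that a gain code of zero is used only as the substitution flag.

```python
from dataclasses import dataclass, replace
from typing import Optional

INDEX_BITS = 7  # assumed width of the fixed excitation index


@dataclass(frozen=True)
class Frame:
    """Hypothetical CELP frame; all fields are quantized codes."""
    lp_index: int
    adaptive_index: int
    adaptive_gain: int
    fixed_index: int
    fixed_gain: int


def embed(frame: Frame, aux_bits: int) -> Frame:
    """Sender: overwrite the fixed index with auxiliary bits, zero its gain."""
    if not 0 <= aux_bits < (1 << INDEX_BITS):
        raise ValueError("auxiliary payload does not fit in the index field")
    return replace(frame, fixed_index=aux_bits, fixed_gain=0)


def carries_aux(frame: Frame) -> bool:
    """Zero gain flags a substitution; any observer can apply the same test."""
    return frame.fixed_gain == 0


def extract(frame: Frame) -> Optional[int]:
    """Receiver: read the auxiliary bits if the frame is flagged, else None."""
    return frame.fixed_index if carries_aux(frame) else None
```

The check in `carries_aux` needs no secret: the very signal that informs the legitimate receiver also reveals the presence of the auxiliary information to anyone observing the stream.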