The present invention concerns the coding of audio-frequency signals, and more particularly coding, decoding and transcoding methods based on a spectral decomposition of the signal. At least one part of such a codec operates on the spectral components of the signal, which may result from a decomposition into sub-bands or from a frequency transform (Fourier transform, cosine transform, etc.). The description below concentrates on, but is not limited to, the context of a codec employing a decomposition into sub-bands.
The invention considers the problem of scalability of the digital data stream transmitted between a coder and a decoder. This property consists in the ability of the coder to construct variable rate output data streams on the basis of the same coding scheme applied to the coded digital signal, and in the corresponding ability of the decoder to reconstruct a faithful version of the signal.
The difficulty here is to deliver the highest possible coding quality at each data rate without unduly increasing the complexity of the circuits used.
Data stream scalability is of particular importance where the data stream is likely to be carried on packet switching networks, such as networks operating according to the IP (Internet Protocol). Historically, the majority of coders have been developed for broadcasting or communications applications in circuit mode, leading to fixed rate coders or coders with the rate selected from several possible values when the connection is set up. In the packet mode context, it is better that the rate should vary more dynamically, so that the data stream can be matched to the congestion encountered when the packets are conveyed while ensuring that the communication is maintained.
Patent application WO 97/21211 describes a multichannel audio signal coding system in which the signal to be coded corresponds to a group of five signals associated, respectively, with five channels, and in which the signal associated with each channel is decomposed into thirty-two sub-bands. A technique called joint frequency coding, which relies on the correlations that exist between the channels at high frequencies, makes it possible to reduce the total number of bits assigned to the coding of the higher-frequency sub-bands, by considering that these sub-bands carry identical information for each channel so that it suffices to transmit this information only once. However, this technique does not allow the data transmission rate to be reduced for a single signal.
U.S. Pat. No. 4,790,016 describes a voice signal coding process in which the signal to be coded is sampled to produce n samples which are normalised and processed by a Fast Fourier Transform (FFT) to produce (n/2)+1 complex coefficients. Only certain coefficients are quantized and transmitted, depending on the scale factor of the corresponding sub-bands; the coefficients of the sub-bands which are not transmitted are approximated at the receiver end by assigning to them the values of other sub-band coefficients which have been transmitted.
One of the main objectives of the present invention is to achieve a high degree of precision in the sampling capability of the digital signal, which will allow the best compromise in rate versus quality to be sought depending on the communication conditions.
A first embodiment of the invention thus relates to a method for decoding a digital data stream representing an audio signal, in which at least one group of spectral components is calculated using vector quantization indices contained in the data stream, and the spectral components of said group are combined during the reconstruction of a decoded version of the audio signal, each component of the group being associated with a set of vector quantization indices used to calculate that component. The digital data stream includes an identification code for at least one pair of spectral components, each identified pair consisting of a first and a second component. The second component of at least one identified pair is associated with a set of vector quantization indices of which at least some are copied from the set of indices read from the data stream and associated with the first spectral component of said identified pair.
This identification code is included in the digital data stream either by the coder generating the data stream, or by a transcoder which processes the stream between the coder and the decoder.
A second embodiment of the invention relates to a method for decoding a digital data stream representing an audio signal, in which at least one group of spectral components is calculated using vector quantization indices contained in the data stream, and the spectral components of said group are combined during the reconstruction of a decoded version of the audio signal, each component of the group being associated with a set of vector quantization indices used to calculate that component. The digital data stream has a variable bit rate. In a first phase, the data stream carries respective sets of vector quantization indices for the calculation of a first group of spectral components, and the correlations between the spectral components of the first group are analysed. In a second phase, at a lower bit rate than the first, the data stream carries respective sets of vector quantization indices for calculating only a part of the spectral components of the first group, and at least one spectral component of the first group not belonging to said part is calculated from indices copied, at least partially, from a set of vector quantization indices read from the data stream and associated with a component belonging to said part with which a maximum correlation was determined in the first phase.
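The index-copy step of this first embodiment can be sketched as follows. This is only an illustration under stated assumptions: the frame layout (a dictionary mapping a component number to its list of indices), the representation of the identification code as `(first, second)` tuples, the positional rule used for a partial copy, and the name `resolve_indices` are all hypothetical and are not taken from the described stream format.

```python
def resolve_indices(transmitted, pairs, num_components):
    """Return one list of vector quantization indices per spectral component.

    transmitted    : dict {component: [indices]} read from the stream
                     (hypothetical layout)
    pairs          : list of (first, second) tuples standing in for the
                     identification code (hypothetical representation)
    num_components : number of spectral components in the group
    """
    resolved = {k: list(v) for k, v in transmitted.items()}
    for first, second in pairs:
        own = resolved.get(second, [])
        source = transmitted[first]
        # Assumption: indices the second component did receive are kept, and
        # the missing trailing ones are copied from the first component.
        resolved[second] = own + list(source[len(own):])
    return [resolved.get(k, []) for k in range(num_components)]


# Component 2 has no indices of its own and is paired with component 1.
frame = {0: [12, 7, 3], 1: [5, 9, 1]}
print(resolve_indices(frame, pairs=[(1, 2)], num_components=3))
```

In this sketch the second component of a pair may carry none of its own indices (full copy) or only some of them (partial copy), which mirrors the "at least some are copied" wording above.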
Thus, some of the spectral components can be calculated without the necessity of all the corresponding vector quantization indices appearing explicitly in the transmitted digital data stream.
If a reduction in digital bit rate is required, it is possible to omit from the transmission at least a part of the vector quantization indices relative to one or more of the bands, while still conserving the relevant components, albeit with a reduced precision, in the reconstruction of the signal by the decoder, such that the loss of quality is limited.
These advantages make the process particularly suited to variable bit rate codecs. In addition, it allows a finer granularity to be achieved in the permitted bit rate variations. In one typical application of this embodiment, the spectral components of the group are processed by sequential segments, each successive segment of a spectral component being determined from the product of a library waveform and a gain, said waveform and said gain being identified by their respective vector quantization indices belonging to the associated set of indices. In this case, the vector quantization indices of the waveforms relevant to a spectral component can be copied, while the vector quantization indices of the corresponding gains can be read independently from the data stream.
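The two phases of the second embodiment can be sketched as follows. The normalised inner product used as the correlation measure, the data layout, and the function names are assumptions made for illustration; the text does not specify how the correlation is computed.

```python
import math


def normalized_correlation(x, y):
    """Assumed correlation measure: normalised inner product of two components."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0


def correlation_table(components):
    """First phase: analyse correlations between all decoded components."""
    n = len(components)
    return [[normalized_correlation(components[i], components[j]) if i != j else 0.0
             for j in range(n)] for i in range(n)]


def fill_missing(received, table):
    """Second phase: give each missing component the indices of the received
    component with which it was most correlated during the first phase.

    received : dict {component: [indices]} for the transmitted part only
    """
    filled = dict(received)
    for k in range(len(table)):
        if k not in received:
            donor = max(received, key=lambda j: table[k][j])
            filled[k] = list(received[donor])
    return filled


# First phase (higher bit rate): all three components were decoded.
decoded = [[1, 2, 3, 4], [1, -1, 1, -1], [2, 4, 6, 9]]
table = correlation_table(decoded)

# Second phase (lower bit rate): component 2 is no longer transmitted.
print(fill_missing({0: [4, 8], 1: [3, 3]}, table))
```

Here component 2 follows component 0 closely, so its indices are copied from component 0 once the stream stops carrying them.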
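The segment-by-segment reconstruction described above, with copied waveform indices and independently read gain indices, can be sketched as follows. The tiny codebooks, their contents, and the function name `decode_component` are hypothetical and chosen only to keep the example readable.

```python
def decode_component(waveform_indices, gain_indices, waveform_book, gain_book):
    """Rebuild a spectral component segment by segment.

    Each segment is the product of a library waveform and a gain, both
    selected by their vector quantization indices.
    """
    samples = []
    for w, g in zip(waveform_indices, gain_indices):
        gain = gain_book[g]
        samples.extend(gain * s for s in waveform_book[w])
    return samples


# Hypothetical codebooks: three two-sample waveforms and three gains.
waveform_book = [[1.0, 0.0], [0.5, -0.5], [0.0, 1.0]]
gain_book = [0.5, 1.0, 2.0]

first_waveforms = [1, 2]   # waveform indices read for the first component
second_gains = [2, 0]      # gain indices read for the second component

# The second component reuses the first component's waveform indices but
# applies its own gains.
second = decode_component(first_waveforms, second_gains, waveform_book, gain_book)
print(second)
```

Copying only the waveform indices lets the second component keep its own energy envelope, since the gains still come from the stream.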
A third embodiment of the present invention relates to a method for coding an audio signal, in which at least one group of spectral components is obtained from the audio signal, and a digital data stream, which includes the vector quantization indices of at least some of the spectral components, is generated at the output. According to the invention, at least one pair of components exhibiting maximum correlation is selected from the group of spectral components, and an identification code for each selected pair is included in the digital output stream, at least some of the vector quantization indices being included in the output data stream for only one component of each selected pair.
In a corresponding manner, the invention proposes an audio coder comprising means for generating at least one group of spectral components from an audio signal, means for calculating the vector quantization indices of at least some of the spectral components of the group, a multiplexer producing an output digital data stream including at least some of the calculated vector quantization indices, and analysis means for selecting, from the group of spectral components, at least one pair of components exhibiting a maximum correlation, the multiplexer being instructed to include in the data stream an identification code for each selected pair, in such a manner that at least some of the vector quantization indices are included in the output data stream for only one of the components of each selected pair.
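The coder-side selection can be sketched as follows. The correlation measure, the rule that the first (lower-numbered) component of the pair keeps its indices, and the dictionary used as a stand-in for the multiplexed frame are assumptions made for illustration only.

```python
import math
from itertools import combinations


def correlation(x, y):
    """Assumed correlation measure: normalised inner product."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0


def select_pair(components):
    """Return the pair (i, j), i < j, of components with maximum correlation."""
    return max(combinations(range(len(components)), 2),
               key=lambda p: correlation(components[p[0]], components[p[1]]))


def build_frame(components, indices):
    """Build a hypothetical frame: the pair identification plus the indices of
    every component except the second one of the selected pair."""
    first, second = select_pair(components)
    payload = {k: idx for k, idx in enumerate(indices) if k != second}
    return {"pair": (first, second), "indices": payload}


components = [[1, 2, 3, 4], [1, -1, 1, -1], [2, 4, 6, 9]]
indices = [[12, 7], [5, 9], [11, 7]]
print(build_frame(components, indices))
```

In this sketch all indices of the second component are omitted; the text also allows omitting only some of them.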
In one embodiment of the coder, a means of producing the group of spectral components from the audio signal comprises a bank of filters for the decomposition of the audio signal into frequency sub-bands.
In a variant of this, the means could consist of a first stage for coding the audio signal, and a bank of filters for the decomposition of a residual error signal produced by the first coder stage into sub-bands (see Patent application WO 99/03210).
It could also include a circuit to perform a frequency transform of at least a part of the signal.
The invention relates equally to transcoding processes which can be adapted to the coding and/or decoding processes described here. The invention globally proposes a method for the transcoding of an input digital data stream, which represents an audio signal coded by successive time windows, in which a lower bit rate output data stream is produced where, for a signal window represented by a number, A, of ordered bits, the output data stream is formed by a copy of a number (A−B) of input data bits with the suppression of a number, B, of input data bits in order to reduce the data rate, and where the bits to be suppressed from the input data stream are determined by information received from a decoder to which the digital output data stream is routed.
This transcoding process, which serves to reduce the digital data rate, is not simply based on suppressing the last few bits of each time window, which is often not optimal, but is designed to limit the loss of quality, which is inherent in the reduction of the data rate, by making use of information received from the downstream decoder.
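The bit-suppression step can be sketched as follows. Representing the decoder's feedback as a list of bit positions within the window is an assumption made for illustration; the text does not define the format of that information.

```python
def transcode_window(bits, suppress_positions):
    """Copy a window of A ordered bits, dropping the B positions named by the
    downstream decoder (hypothetical feedback format: a list of positions)."""
    drop = set(suppress_positions)
    return [b for i, b in enumerate(bits) if i not in drop]


window = [1, 0, 1, 1, 0, 0, 1, 0]     # A = 8 bits
feedback = [2, 3, 4]                  # B = 3 positions chosen by the decoder
output = transcode_window(window, feedback)
print(output, len(output))            # A - B = 5 bits remain

# For contrast, plain truncation would always drop the last B bits.
truncated = window[:len(window) - len(feedback)]
print(truncated)
```

The difference between `output` and `truncated` is the point made above: the suppressed bits are chosen from decoder information rather than taken from the end of the window.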
In a similar manner to the coding process described here above, where the input digital signal contains sets of vector quantization indices associated with groups of spectral components, respectively, the transcoding process could consist of the following stages: select at least one pair of components exhibiting maximum correlation out of the whole group of spectral components, keep the set of vector quantization indices associated with one of the components of each pair selected, and suppress at least some of the indices from the set of vector quantization indices associated with the other component of each pair selected.
Where the input digital signal carries sets of vector quantization indices associated, respectively, with groups of spectral components, another option is that the suppressed bits include at least one set of vector quantization indices associated with a spectral component.
In the latter case, the transcoding process operates simply by completely suppressing certain sub-bands.
The sub-bands to be suppressed can be determined from the nature of the coded audio signal or from the properties of the decoder used to reconstitute it (for example, the audio signal bandwidth or the bandwidth capability of the decoder).
They are in any case determined on the basis of information received from the decoder. In particular, in the above-mentioned case where a variable bit rate decoder analyses a data stream at a relatively high bit rate looking for correlations between bands, the information thus obtained can be sent to the transcoder. In this way, the bands to be suppressed to achieve a reduction in bit rate can be chosen judiciously, that is, in such a way that the audio signal reconstituted by the decoder using the index copy process will not be subject to an appreciable loss of quality.
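One way the transcoder might use correlation information sent back by the decoder is sketched below. The greedy rule (repeatedly suppress the band that is best matched by another retained band, preferring the higher-numbered band on ties) and the symmetric table format are assumptions made for illustration, not a rule stated in the text.

```python
def choose_bands_to_suppress(table, count):
    """Pick `count` bands to suppress from a symmetric correlation table
    reported by the decoder (hypothetical format: table[i][j] in [-1, 1])."""
    kept = set(range(len(table)))
    suppressed = []
    for _ in range(count):
        if len(kept) < 2:
            break  # at least one band must remain as a source for copies
        best_band, best_score = None, float("-inf")
        for k in sorted(kept):
            # How well can band k be replaced by some other retained band?
            score = max(table[k][j] for j in kept if j != k)
            if score >= best_score:   # ties go to the higher-numbered band
                best_band, best_score = k, score
        kept.remove(best_band)
        suppressed.append(best_band)
    return suppressed


table = [[0.0, 0.9, 0.2],
         [0.9, 0.0, 0.3],
         [0.2, 0.3, 0.0]]
print(choose_bands_to_suppress(table, 1))
```

Bands 0 and 1 are strongly correlated in this example, so one of them is the first candidate for suppression; the decoder can then rebuild it by copying indices from the other.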
In another embodiment of the transcoding process, the input digital data stream includes, for each time window of the signal, at least one index which allows a coding parameter vector to be selected from a library containing 2^Q vectors which can be used to reconstitute a version of the decoded signal, said index included in the input data stream consisting of (Q−q) bits which, with the addition of q arbitrary bits in predetermined positions, define a set of 2^q vector addresses in the reference library, q being an integer such that q > 0. In the output data stream, the (Q−q)-bit index is replaced by a translated index composed of (Q−p) bits which, with the addition of p arbitrary bits in predetermined positions, define a set of 2^p addresses containing said set of 2^q addresses, p being an integer such that q < p < Q. Thus, the transcoding process operates by the preferential suppression of the least significant bits of the vector quantization index, so that the impact of the reduction in the bit rate on the quality of the signal that the decoder can reconstitute is minimised.
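The index translation can be sketched as follows, assuming that the "predetermined positions" of the arbitrary bits are the least significant positions of the Q-bit address; other position conventions would need a different bit manipulation. The function names are hypothetical.

```python
def translate_index(index, Q, q, p):
    """Replace a (Q - q)-bit index by a (Q - p)-bit index, with 0 < q < p < Q.

    Assumption: the arbitrary bits occupy the least significant positions,
    so translation amounts to dropping the (p - q) lowest bits of the index.
    """
    if not 0 < q < p < Q:
        raise ValueError("expected 0 < q < p < Q")
    if not 0 <= index < (1 << (Q - q)):
        raise ValueError("index does not fit in Q - q bits")
    return index >> (p - q)


def address_set(index, arbitrary_bits):
    """All library addresses reachable by filling in the arbitrary low bits."""
    base = index << arbitrary_bits
    return set(range(base, base + (1 << arbitrary_bits)))


Q, q, p = 6, 1, 3
incoming = 0b10110                         # a 5-bit index, i.e. Q - q bits
outgoing = translate_index(incoming, Q, q, p)

print(bin(outgoing))                       # a 3-bit index, i.e. Q - p bits
print(sorted(address_set(incoming, q)))    # 2**q candidate addresses
print(sorted(address_set(outgoing, p)))    # 2**p addresses containing them
```

Under this assumption, the set of addresses designated by the translated index always contains the set designated by the incoming index, as the text requires.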