There are various techniques for converting an audio signal into digital form and compressing it. The most common techniques are:
- the waveform coding methods, such as PCM (Pulse Code Modulation) and ADPCM (Adaptive Differential Pulse Code Modulation) coding,
- the analysis- and synthesis-based parametric coding methods, such as CELP (Code Excited Linear Prediction) coding, and
- the methods of perceptual coding in subbands or by transform.
These techniques process the input signal sequentially, sample by sample (PCM or ADPCM), or in blocks of samples called “frames” (CELP and transform coding). For all these coders, the coded values are then multiplexed in a bit stream which is transmitted over a transmission channel.
Depending on the reliability and the type of transmission channel, disturbances may affect the transmitted signal and produce errors on the bit stream received by the decoder. These errors may occur in isolation in the bit stream or in bursts. This type of problem is encountered, for example, for transmissions over the mobile networks or over a wireless link of DECT (Digital Enhanced Cordless Telephone) type.
The consequence of an errored bit in the bit stream varies according to the coder used and also according to the type of parameter affected by the error. The position (or weight) of the errored bit among the bits coding a parameter is also important. For example, for a scalar-type quantization, an error on the most significant bit (MSB) is more serious than an error on the least significant bit (LSB).
In ITU-T standard G.711, each sample is represented on 8 bits by a PCM coding that uses an amplitude compression law in the form of a piecewise-linear function followed by a uniform scalar quantization. For each sample, G.711 generates 8 code bits which comprise 1 sign bit, 3 bits to identify the segment of the compression law and 4 bits to specify the location of a level on a given segment. An isolated error on the sign bit of a high amplitude sample causes a very clear discontinuity in the decoded signal, and listening quality is then highly degraded. Conversely, an isolated error on the least significant bit specifying the location of a level on a segment is practically inaudible. Since the quantization is of scalar type, the sensitivity with respect to bit errors increases with the position of the corrupted bits, the least sensitive bit being the least significant bit specifying the location of a level on a segment and the most sensitive bit being the most significant bit indicating a segment identifier. The sensitivity of the sign bit depends on the absolute value of the coded current sample.
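The bit sensitivity described above can be illustrated with a minimal sketch. It assumes a simplified 8-bit code laid out as 1 sign bit, 3 segment bits and 4 level bits, expanded with a mu-law-style formula; it is not a bit-exact G.711 decoder (the bit inversion applied by the standard is omitted), and the function names `decode` and `bit_error_impact` are hypothetical.

```python
def decode(code: int) -> int:
    """Expand a simplified 8-bit code (sign | segment | level) to a linear value.

    Assumption: mu-law-style expansion without the bit inversion of real G.711.
    """
    sign = (code >> 7) & 0x01
    segment = (code >> 4) & 0x07
    level = code & 0x0F
    magnitude = (((level << 3) + 0x84) << segment) - 0x84
    return -magnitude if sign else magnitude


def bit_error_impact(code: int) -> list:
    """Absolute decoding error caused by flipping each bit, from LSB (0) to sign (7)."""
    reference = decode(code)
    return [abs(decode(code ^ (1 << bit)) - reference) for bit in range(8)]


if __name__ == "__main__":
    loud = 0b0_111_1000   # top segment, mid level: high amplitude
    quiet = 0b0_000_0001  # lowest segment: low amplitude
    print("high amplitude:", bit_error_impact(loud))
    print("low amplitude: ", bit_error_impact(quiet))
```

For the high-amplitude sample, the error grows from the level LSB up to the segment MSB, and the sign-bit error equals twice the sample magnitude; for the low-amplitude sample, the sign-bit error is small, which matches the remark that the sensitivity of the sign bit depends on the absolute value of the sample.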
Another example is given by CELP-type coding. In this case, the pitch parameter (or fundamental period) is very sensitive to bit errors. Generally, this parameter is coded by a scalar-type quantizer. Such is the case of ITU-T standard G.729, in which the signal is coded in frames of 10 ms divided into two subframes. The pitch T1 in the first 5 ms subframe is coded in absolute mode on 8 bits; the pitch T2 in the second 5 ms subframe is coded in relative mode in relation to T1.
A bit error reversing the first bit of the index associated with T1 can change the value of the pitch from T1=143 to T1=61⅔. Furthermore, T2 will also be badly decoded because its value is necessarily in the vicinity of T1 because of the relative coding. It can be seen, with this example, that a single errored bit can completely corrupt the decoding of a 10 ms frame. An error on a bit representing the index of the fixed CELP coding dictionary generally has far less impact and does not generate any audible degradation.
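The pitch example above can be reproduced with a short sketch. The index-to-delay mapping below is written from recollection of the G.729 first-subframe pitch decoding (1/3-sample resolution below 85, integer resolution from 85 to 143) and should be treated as an assumption to be checked against the standard; the function name `decode_t1` is hypothetical.

```python
from fractions import Fraction


def decode_t1(index: int) -> Fraction:
    """Map an 8-bit first-subframe pitch index to a pitch delay.

    Assumption: mapping recalled from G.729; verify against the specification.
    """
    if index < 197:
        # Fractional range: 1/3-sample resolution.
        t_int = (index + 2) // 3 + 19
        frac = index - 3 * t_int + 58  # -1, 0 or +1 (in thirds of a sample)
    else:
        # Integer-only range.
        t_int = index - 112
        frac = 0
    return t_int + Fraction(frac, 3)


if __name__ == "__main__":
    sent = 255                 # index of T1 = 143
    received = sent ^ 0x80     # first (most significant) bit reversed
    print(decode_t1(sent), decode_t1(received))
```

Under this mapping, index 255 decodes to 143 and index 127 (the same index with its first bit reversed) decodes to 185/3, that is 61⅔.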
In a transform-based coder, such as the ITU-T G.722.1 standardized coder or the proprietary TDAC (Time Domain Aliasing Cancellation) coder from France Telecom, the parameters are generally associated with two different information items: the spectral envelope and the “fine structure” of the spectrum (that is to say, the spectrum normalized by the spectral envelope). The short-term spectrum of the signal is typically divided into a certain number of subbands and the spectral envelope is defined as the RMS value of each of the subbands. This envelope is often coded by scalar quantization followed by a differential Huffman coding. Thus, the first quantization index is coded in absolute mode, and the other indices are coded in differential mode relative to the preceding subband. Because of the recursive nature of this envelope coding, a bit error in a given subband is propagated to the subsequent subbands until the end of the spectrum, and the decoded envelope therefore becomes “random” from the subband in which the error occurred. Furthermore, in certain variants in which the dynamic allocation of bits representing the fine structure of the spectrum depends on the decoded spectral envelope (which is the case with the G.722.1 and TDAC coders), the impact of the bit errors on the spectral envelope is also propagated to the decoding of the fine structure, which then becomes aberrant.
The preceding examples show that the various bits of the bit stream generally require different protection levels and different strategies for concealing bit errors.
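The propagation of an error through a differentially coded spectral envelope can be sketched as follows. This is a simplified illustration, not the G.722.1 or TDAC decoder: the Huffman stage is ignored (in a real decoder a bit error may also shift the codeword boundaries), the error is modeled directly as one corrupted differential index, and the function name and index values are hypothetical.

```python
def decode_envelope(first_index: int, differences: list) -> list:
    """Rebuild per-subband quantization indices from absolute + differential coding."""
    indices = [first_index]
    for diff in differences:
        # Each subband is coded relative to the preceding one.
        indices.append(indices[-1] + diff)
    return indices


if __name__ == "__main__":
    first = 10
    diffs = [1, -2, 0, 3, -1]

    clean = decode_envelope(first, diffs)

    corrupted_diffs = list(diffs)
    corrupted_diffs[1] = 2  # assumed effect of a bit error in the third subband
    corrupted = decode_envelope(first, corrupted_diffs)

    errors = [c - r for c, r in zip(corrupted, clean)]
    print(clean, corrupted, errors)
```

The first two subbands are decoded correctly, and every subband from the corrupted one to the end of the spectrum carries the same offset, which is the recursive propagation described above.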
In the so-called “hierarchical” coding systems, also called “scalable”, the bit data obtained from the coding operation is divided up into successive layers. A bottom layer, also called “core”, is formed by the bit elements that are absolutely necessary to the decoding of the bit stream and that determine a minimum decoding quality. The subsequent layers are used to gradually enhance the quality of the signal obtained from the decoding operation, each new layer bringing new information which, when used by the decoder, supplies a signal of increasing quality as output.
One of the particular features of the hierarchical coders-decoders is the possibility of intervening at any level of the transmission or storage chain to eliminate a portion of the bit stream without having to supply any particular indication to the coder or to the decoder. The decoder uses the bit information that it receives and produces a signal of corresponding quality.
The different layers of bit elements are generally hierarchically arranged (hence the name “hierarchical”). That is to say, if the core is called level 0 and the subsequent layers are called levels 1, 2, 3, etc., the decoding of the level 3 layer presupposes that the bit elements of the layers 0, 1 and 2 are also available.
The hierarchical coders are of particular interest in the contexts of transmission over networks with heterogeneous access: whether these are IP-type networks mixing fixed and mobile access, high bit rates (ADSL), low bit rates (56 k modems, GPRS) or involving terminals of variable capacities (cellphones, PC, etc.). The hierarchical coding makes it possible in practice to adapt the bit rate of a transmission without requiring transcoding. One example worth mentioning is access to audio content bases in which the audio samples are recorded with the highest bit rate of a hierarchical coder, the transmitted bit rate then being adapted to the capacity (or to the negotiated service quality) of the client being served. Another application is audio-video conferencing on heterogeneous access, in which the terminals that have high bit rate access can communicate at high bit rate even if one of the terminals in the conference does not have this capability, and do so without requiring transcoding.
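Bit rate adaptation without transcoding can be sketched as a simple truncation of the layered frame by an intermediate node. The function name `truncate_frame`, the payloads and the cumulative bit rates used in the example are hypothetical and chosen only for illustration.

```python
def truncate_frame(layers: list, cumulative_kbps: list, budget_kbps: float) -> list:
    """Keep the core and as many enhancement layers as fit the bit rate budget.

    layers[i] is the payload of level i (level 0 is the core);
    cumulative_kbps[i] is the bit rate needed to carry levels 0..i.
    Returns an empty list if even the core does not fit.
    """
    kept = []
    for payload, rate in zip(layers, cumulative_kbps):
        if rate > budget_kbps:
            break  # higher layers depend on this one, so stop here
        kept.append(payload)
    return kept


if __name__ == "__main__":
    frame = [b"core", b"enh1", b"enh2", b"enh3", b"enh4"]
    rates = [8, 12, 16, 24, 32]  # illustrative cumulative bit rates in kbit/s
    print(truncate_frame(frame, rates, budget_kbps=16))
```

The node only drops trailing layers; neither the coder nor the decoder needs to be informed, and the decoder simply produces the quality that corresponds to the layers it receives.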
The EV-VBR (Embedded Variable Bit Rate) coder currently being studied in the ITU-T question Q.9/16 (study period 2004-2008) is one example of a scalable speech coder. This coder has a bit stream structured in 5 layers associated with different bit rates (8, 12, 16, 24 and 32 kbit/s). A method of processing bit errors for EV-VBR is described in the RFC (Request For Comments) draft published on 12 Nov. 2007 by the IETF at the following web address: http://www.ietf.org/internet-drafts/draft-lakaniemi-avt-rpt-evbr-00.txt.
The payload format proposed in the above document consists of one or more transport blocks containing one or more layers, and the header of each transport block contains protection bits in the form of a CRC (Cyclic Redundancy Check) code for all the bits included in the corresponding transport block.
If an error is detected in a transport block that contains the layer Ln, only the lower layers (L1, . . . L(n−1)) will be decoded for the given frame. The detection of an errored bit in a transport block therefore always leads to the same processing, whichever bit is affected: rejection of the errored block and of the blocks above.
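The rejection rule above can be sketched as follows, assuming one layer per transport block. The CRC polynomial (0x07, zero initial value) and the function names are assumptions made for illustration; the polynomial actually used by the payload format is not stated here.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 (assumed polynomial x^8 + x^2 + x + 1, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def decodable_layers(blocks: list) -> int:
    """Count the layers L1..Ln that can be decoded.

    blocks is an ordered list of (payload, received_crc) pairs, one per layer.
    Decoding stops at the first block whose CRC check fails.
    """
    count = 0
    for payload, received_crc in blocks:
        if crc8(payload) != received_crc:
            break  # reject this block and every block above it
        count += 1
    return count


if __name__ == "__main__":
    payloads = [bytes([n] * 20) for n in range(1, 6)]
    blocks = [(p, crc8(p)) for p in payloads]
    damaged = bytearray(payloads[2])
    damaged[5] ^= 0x01  # a single errored bit in the block carrying L3
    blocks[2] = (bytes(damaged), blocks[2][1])
    print(decodable_layers(blocks))  # only L1 and L2 remain decodable
```

A single errored bit in the block carrying L3 discards L3, L4 and L5 even though L4 and L5 were received intact.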
The probability of obtaining a decoding of all the layers of the frame is therefore very low with such a method, even with a low bit error rate.
Thus, in the case where each layer is sent in a different transport block, five 8-bit CRC codes are transmitted in the payload, which gives an additional bit rate of 2 kbit/s for a 20 ms frame. Even with a low bit error rate, the probability of being able to decode the maximum bit rate is low. For example, with a bit error rate of 0.1%, Table 1 below gives the probability of being able to decode different configurations. This table also takes into account the bit errors on the CRC codes.
TABLE 1
Layers decoded    Bit rate decoded    Probability
Frame erased      0 kbit/s            15.5%
L1                8 kbit/s            7.1%
L1 to L2          12 kbit/s           6.5%
L1 to L3          16 kbit/s           11.0%
L1 to L4          24 kbit/s           9.3%
L1 to L5          32 kbit/s           50.6%
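The figures of Table 1 can be recomputed with the following sketch. It assumes independent bit errors, 20 ms frames, one 8-bit CRC per layer, layer sizes derived from the cumulative bit rates 8, 12, 16, 24 and 32 kbit/s, and a CRC that detects every error; the function and constant names are hypothetical.

```python
FRAME_MS = 20
CRC_BITS = 8
CUMULATIVE_KBPS = [8, 12, 16, 24, 32]


def layer_sizes_bits() -> list:
    """Payload size of each layer in bits (kbit/s times ms gives bits)."""
    sizes, previous = [], 0
    for rate in CUMULATIVE_KBPS:
        sizes.append((rate - previous) * FRAME_MS)
        previous = rate
    return sizes


def decode_probabilities(ber: float) -> list:
    """Return [P(frame erased), P(L1 only), P(L1 to L2), ..., P(L1 to L5)]."""
    # Probability that a block (payload + CRC) is received without any bit error.
    block_ok = [(1.0 - ber) ** (bits + CRC_BITS) for bits in layer_sizes_bits()]
    probabilities, reached = [], 1.0
    for p_ok in block_ok:
        probabilities.append(reached * (1.0 - p_ok))  # decoding stops at this block
        reached *= p_ok
    probabilities.append(reached)  # every block is error-free
    return probabilities


if __name__ == "__main__":
    labels = ["Frame erased", "L1", "L1 to L2", "L1 to L3", "L1 to L4", "L1 to L5"]
    for label, p in zip(labels, decode_probabilities(0.001)):
        print(f"{label:<14}{100 * p:5.1f}%")
    overhead_kbps = len(CUMULATIVE_KBPS) * CRC_BITS / FRAME_MS
    print("CRC overhead:", overhead_kbps, "kbit/s")
```

Under these assumptions the layer blocks carry 168, 88, 88, 168 and 168 bits including their CRC, and the computed probabilities round to the values listed in Table 1.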
It can be seen that, even with a bit error rate (BER) as low as 0.1%, 15.5% of the frames will be considered invalid and a lost frame concealment algorithm will then be applied. With 15.5% of frames lost, the decoded quality will be poor.
In the case where all the layers are in one and the same transport block, the additional CRC code bit rate is only 0.4 kbit/s. However, with 0.1% bit error rate, close to half of the frames will be considered lost (49.4%). With 49.4% of frames lost, communication becomes impossible.
There is therefore a need to improve the quality of the decoded signal when there are bit errors. The present invention improves the situation.