It relates in particular to digital coding/decoding of the logarithmic type, as implemented for example by the ITU-T G.711.1 coder.
The compression of signals in order to reduce the bit rate while maintaining a good quality of perception can make use of numerous techniques, including:
    the PCM (Pulse Code Modulation) technique and variants thereof such as ADPCM (Adaptive Differential PCM),
    the CELP (Code Excited Linear Prediction) techniques, and
    the techniques known as “by transformation” (for example of the MDCT (Modified Discrete Cosine Transform) type).
The PCM technique compresses the signal, sample by sample, with a given number of bits while the other types of techniques compress blocks of samples (or frames).
Coding/decoding according to ITU-T Recommendation G.711 is one of the most widely used for voice signals, both in traditional telephony (over the switched network) and over the Internet (voice over IP or VoIP). Such coding uses the technique known as “logarithmic PCM”.
The coding/decoding principle according to ITU-T Recommendation G.711 is summarized below.
The G.711 coder is based on 8-bit logarithmic compression at the sampling frequency of 8 kHz to give a bit rate of 64 kbit/s.
The principle of G.711 PCM coding is to compress signals filtered in the 300-3400 Hz band by means of a logarithmic curve, which makes it possible to obtain an almost constant signal-to-noise ratio over a broad dynamic range of signals. This involves coding by quantization with a quantization step that varies with the amplitude of the sample to be coded:
    when the incoming signal level is weak, the quantization step is small,
    when the incoming signal level is strong, the quantization step is large.
Two logarithmic PCM compression laws are used:
    μ law (used in North America and Japan) and
    A law (used in Europe and the rest of the world).
G.711 coding according to the A law and G.711 coding according to the μ law use 8-bit encoding of incoming samples.
In practice, in order to facilitate the implementation of the G.711 coder, the logarithmic PCM compression has been approximated by a segmented curve.
In the A law, the 8 bits are distributed as follows:
    1 sign bit,
    3 bits to indicate the segment,
    4 bits to indicate the position on the segment.
The PCM coding/decoding principle is summarized with reference to FIG. 1. The PCM coder 11 comprises a quantization module Q_MIC 10 which receives the incoming signal S(z) at its input. The quantization index I_MIC at the output of the quantization module 10 is transmitted over the transmission channel 12 to the decoder 14. The PCM decoder 14 receives at its input the indices I′_MIC delivered by the transmission channel (a version of I_MIC possibly degraded by bit errors) and performs inverse quantization by means of the inverse quantization module 13 (Q_MIC^−1) to obtain the decoded signal. Standardized ITU-T G.711 PCM coding (hereinafter called G.711) performs a compression of the amplitude of signals by means of a logarithmic curve prior to uniform scalar quantization, which makes it possible to obtain an almost constant signal-to-noise ratio for a broad dynamic range of signals. The quantization step in the original signal domain is therefore proportional to the amplitude of the signals. Successive samples of the compressed signal are quantized to 8 bits, i.e. 256 levels.
A quantization index I_PCM (reference 15 in the example shown in FIG. 1) can therefore be considered to be the representation of a floating point number with 4 mantissa bits “Pos”, 3 exponent bits “Seg” and a sign bit “S”.
In the 16-bit binary representation (sign and absolute value) of the sample to be coded, denoting the least significant bit (LSB) of a sample as b0, the exponent indicates the position pos of the first “1” (the most significant non-zero bit) among positions 14 to 8, the mantissa bits then being the next 4 bits and the sign bit being bit b15.
Thus, if pos=14, exp=7; if pos=13, exp=6; . . . ; if pos=8, exp=1.
If the first “1” lies below position 8 (which corresponds to an absolute value of the sample to be coded less than or equal to 255), the exponent is 0.
In an example given in the table below where the first bit set at “1” is bit b10 (pos=10), the exponent exp is 3, and the 4 bits of the mantissa are the 4 bits in positions 9 to 6: m3m2m1m0(=b9b8b7b6).
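Bit:    b15  b14  b13  b12  b11  b10  b9   b8   b7   b6   b5   b4   b3   b2   b1   b0
Value:  S    0    0    0    0    1    m3   m2   m1   m0   x    x    x    x    x    x
(x denotes a bit that is not retained in the quantization index.)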
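The exponent and mantissa extraction described above can be sketched as follows. This is only an illustrative sketch: the function name `alaw_fields` is hypothetical, masking the mantissa to 4 bits is an assumption made for the illustration, and the bit inversions applied by the actual G.711 A-law bitstream format are deliberately left out.

```python
def alaw_fields(x: int) -> tuple[int, int, int]:
    """Return (sign, exp, mant) for a 16-bit sample, as described in the text.

    Hypothetical helper: it omits the bit inversions of the real G.711
    A-law format and only illustrates the segment/position split.
    """
    sign = 1 if x < 0 else 0
    mag = min(abs(x), 32767)          # absolute value, clamped to 15 bits
    if mag <= 255:                    # first segment: leading "1" below position 8
        return sign, 0, mag >> 4
    pos = mag.bit_length() - 1        # position of the leading "1" (8..14)
    exp = pos - 7
    mant = (mag >> (3 + exp)) & 0x0F  # the 4 bits following the leading "1"
    return sign, exp, mant


# Example from the table: leading "1" at b10, mantissa bits b9..b6 = 1011
sample = (1 << 10) | (0b1011 << 6)
print(alaw_fields(sample))            # (0, 3, 11)
```

The three fields correspond to the sign bit “S”, the 3 segment bits “Seg” and the 4 position bits “Pos” of the quantization index.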
G.711 type encoding of a digital signal can be performed by comparison with the quantizer decision thresholds, a binary search (search by dichotomy) making it possible to speed up the calculations. This search by comparison with thresholds requires the storage of the decision thresholds and of the quantization indices corresponding to the thresholds. Another encoding solution, less costly in calculations, consists of eliminating the 4 least significant bits by a right shift of 4 bits, then adding 2048 to the shifted value. The quantization index is finally obtained by simply reading a table with 4096 entries, which, however, requires a larger read-only memory (ROM) than the method presented above.
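The table-based variant can be sketched as below. This is a sketch under stated assumptions rather than the standardized procedure: the names `fields_from_mag4`, `build_table` and `encode_by_table` are hypothetical, negative samples are assumed to be handled through their one's complement so that the table index depends only on the shifted value, the index layout (sign, segment, position) follows the A-law bit distribution given earlier, and the bit inversions of the real G.711 format are omitted.

```python
def fields_from_mag4(m4: int) -> tuple[int, int]:
    """(exp, mant) from a magnitude already right-shifted by 4 bits (0..2047)."""
    if m4 <= 15:                          # first segment
        return 0, m4
    exp = m4.bit_length() - 4             # leading "1" sits at position exp + 3
    return exp, (m4 >> (exp - 1)) & 0x0F


def build_table() -> list[int]:
    """Precompute the 4096 quantization indices (assumed layout: S | Seg | Pos)."""
    table = []
    for k in range(4096):
        v = k - 2048                      # v = x >> 4, in -2048..2047
        sign = 1 if v < 0 else 0
        m4 = ~v if v < 0 else v           # assumption: one's-complement magnitude
        exp, mant = fields_from_mag4(m4)
        table.append((sign << 7) | (exp << 4) | mant)
    return table


TABLE = build_table()


def encode_by_table(x: int) -> int:
    """Right shift by 4 bits, add 2048, then read the table."""
    return TABLE[(x >> 4) + 2048]
```

The encoder itself is reduced to one shift, one addition and one read, at the price of the 4096-entry table mentioned in the text.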
In order to avoid storing such tables, the quantization index can be determined with simple operations of low complexity. Such is the case in the G.711.1 encoder. A right shift of at least 4 bits is still applied. For samples to be coded over 16 bits, the smallest quantization step remains 16. The 4 least significant bits are still lost.
More generally, only the 4 bits following the first bit set to “1” are transmitted; the other bits are lost. Thus:
    in the first segment (|x| ≤ 255, exp=0), the 4 least significant bits are lost (mant = x>>4); and
    in the other segments (2^(exp+7) ≤ |x| < 2^(exp+8), 0 < exp < 8), the (3+exp) least significant bits are lost (mant = x>>(3+exp)).
The number of bits lost thus increases with the segment number up to 10 bits for the last segment (exp=7).
At the decoder, the decoded signal is obtained at the output of the inverse PCM quantizer (FIG. 1). If inverse quantization is implemented by a table, it consists simply of using the index to point into a table of 256 decoded values. Decoding can also be performed by simple operations of the same type as those used for encoding.
The version according to the μ law is quite similar. The main difference is the addition of 128 to the values to ensure that, in the first segment, bit 7 is always equal to 1. Such an arrangement makes it possible to:
    make the transmission of bit 7 unnecessary,
    nevertheless increase the coding precision in the first segment (the quantization step being equal to 8 in the first segment as against 16 in the coding according to the A law),
    process all the segments in an identical manner.
Furthermore, 4 is added for rounding (thus 128+4=132 in total), which produces the level “0” among the quantized values (the A law, by contrast, has no level 0, its smallest values being ±8). The price of this better resolution in the first segment is the shifting of all the segments by 132. As for the A law, decoding is performed either by reading a table or by a set of algorithmically simple operations.
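The effect of the 132 offset can be illustrated with the following sketch. The names `BIAS`, `CLIP` and `mulaw_fields` are hypothetical, the clipping value is an assumption chosen so that the biased magnitude still fits in 15 bits, and the bit inversion of the real G.711 μ-law format is omitted.

```python
BIAS = 132                 # 128 + 4, as described in the text
CLIP = 32767 - BIAS        # assumed clipping so that mag + BIAS fits in 15 bits


def mulaw_fields(x: int) -> tuple[int, int, int]:
    """Return (sign, exp, mant) for a 16-bit sample (illustrative only)."""
    sign = 1 if x < 0 else 0
    mag = min(abs(x), CLIP) + BIAS     # bit 7 or a higher bit is now always set
    pos = mag.bit_length() - 1         # position of the leading "1" (7..14)
    exp = pos - 7
    mant = (mag >> (exp + 3)) & 0x0F   # same rule for every segment
    return sign, exp, mant


print(mulaw_fields(0))     # (0, 0, 0): the level "0" exists
print(mulaw_fields(4))     # (0, 0, 1): step of 8 in the first segment
```

Because the leading “1” is always at position 7 or above, all the segments, including the first, are handled by the same shift-and-mask rule.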
The signal-to-noise ratio (SNR) obtained by PCM coding is almost constant (approximately 38 dB) for a broad dynamic range of signals. The quantization step in the original signal domain is proportional to the amplitude of the signals. This signal-to-noise ratio is not sufficient to render the quantization noise inaudible. Furthermore, for weak signal levels (first segment), the SNR is very poor and may even be negative.
The quality of the G.711 coder is deemed to be good for narrow-band voice signals (sampling frequency 8 kHz). However, this quality is not excellent and the difference between the original signal to be coded and the decoded signal is perceptible with audible coding noise. In some applications, it is necessary to be able to increase the quality of the PCM coding in the 0-4000 Hz band by adding an optional layer, for example 16 kbit/s (thus 2 bits per sample). When the decoder receives this enhancement layer, it can enhance the quality of the decoded signal.
A G.711 coding/decoding principle known as “hierarchical” is presented below.
In the case of G.711 coding, a coder which is not very complex and not very costly in terms of memory, it is worthwhile considering a technique of hierarchical extension also with low complexity and reasonable memory requirements. Such a technique (as described for example in document US-2010/191538) consists of recovering the bits not transmitted in the mantissa of the PCM coding and transmitting them in the enhancement layer. In the event of reception of this layer, the decoder can decode the mantissa with greater precision. This technique, which makes it possible to obtain an increase in the SNR of 6 dB for each bit added per sample, consists of saving and transmitting in an enhancement bitstream the most significant bits among the bits lost during the initial PCM coding. For example, in the case of an enhancement layer at 16 kbit/s (2 bits per sample), the bits to be sent in this layer can be obtained by performing the right shift in two steps to save the 2 bits following the 4 bits of the mantissa.
The encoder sends in the extension layer the bits corresponding to the first (significant) bits of the bits which would otherwise be lost owing to the limited precision of logarithmic PCM coding. These extension bits make it possible to add supplementary positions to the segments “Seg”, thus enhancing the information on the samples of the greatest amplitudes. The decoder concatenates the extension bits received behind the base layer bits to obtain greater precision in the positioning of the decoded sample in the segment. At the decoder, the rounded value is adapted depending on the number of extension bits received.
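The two-step right shift used to save 2 extension bits per sample can be sketched as follows. This is a minimal sketch assuming the A-law segment rule described earlier and operating on the absolute value only; the function name `encode_with_extension` is hypothetical.

```python
def encode_with_extension(mag: int) -> tuple[int, int, int]:
    """Return (exp, mant, ext) for a magnitude in 0..32767.

    ext holds the 2 most significant bits among those the core layer drops.
    """
    exp = 0 if mag <= 255 else mag.bit_length() - 8
    lost = 4 if exp == 0 else 3 + exp      # bits dropped by the core layer
    stage1 = mag >> (lost - 2)             # first shift: keep 2 more bits
    ext = stage1 & 0x03                    # bits sent in the enhancement layer
    mant = (stage1 >> 2) & 0x0F            # second shift: core-layer mantissa
    return exp, mant, ext


# Leading "1" at b10, mantissa 1011, next two bits 10
print(encode_with_extension(0b11011100000))   # (3, 11, 2)
```

At the decoder, concatenating the received bits as `(mant << 2) | ext` gives a 6-bit position within the segment instead of a 4-bit one.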
This technique of recovering bits not transmitted in the mantissa of the PCM coding to transmit them in an enhancement layer is used to enhance the coding of the low band in the ITU-T G.711.1 coder.
The ITU-T G.711.1 coder (2008 version) is an extension of G.711 PCM coding. It is a hierarchical 64 to 96 kbit/s coder which is fully interoperable with the G.711 coder (A law or μ law). This standard meets the requirements for enhanced quality of VoIP applications. The functional diagram for G.711.1 coding/decoding is given in FIG. 2. The G.711.1 coder operates on audio signals sampled at 16 kHz on 5 ms blocks or frames (i.e. 80 samples at 16 kHz). The incoming signal S(z) is divided into two sub-bands SB1: [0-4 kHz] and SB2: [4 kHz-8 kHz] by filters 20 of the QMF (Quadrature Mirror Filters) type. The bit rate of 64 kbit/s (“Layer 0” compatible with G.711 (L0)) corresponds to quantization of the 0-4 kHz sub-band by the PCM technique (module 21) equivalent to the G.711 coding presented above, with shaping of the quantization noise. The next two layers (“Layers 1-2”) respectively code:
    the 0-4 kHz low band using an enhancement technique (ENH) for the PCM coding (enhancement module 23 for “Layer 1” (L1)),
    and the 4-8 kHz high band by MDCT transform coding (module 22 for “Layer 2” (L2)),
each with a bit rate of 16 kbit/s (80 bits per frame).
The enhancement layer (“Layer 1”) of the low band makes it possible to reduce the quantization error of the core layer (“Layer 0”) by adding supplementary bits to each sample coded according to Recommendation G.711. As indicated above, it adds additional mantissa bits to each sample. The number of additional bits for the mantissa depends on the sample amplitude. Rather than allocating the same number of bits to every sample, the 80 bits available in Layer 1 (L1) to enhance the precision of the mantissa coding of the 40 low-band samples of a frame are allocated dynamically, more bits being attributed to the samples with a large exponent. Thus, while the bit budget of the enhancement layer is 2 bits per sample on average (16 kbit/s), the enhancement signal has a resolution of 3 bits per sample and, with this adaptive allocation, the number of bits allocated to a sample varies from 0 to 3 bits depending on its exponent value.
A description is given below of how the coding/decoding of enhancement layer “1” (L1) of the low band of the G.711.1 coder operates.
Encoding with adaptive bit allocation takes place in two phases:
    a first phase of generation of the bit allocation table,
    followed by a phase of dynamic multiplexing of the enhancement signal.
The procedure is common to both the A and μ laws.
The bit allocation table is generated using the exponents of the 40 samples, exp(n), with n=0 to n=39. The procedure for generating the bit allocation table itself comprises two steps:
    in a first step, an exponent map and a table of exponent index counters are calculated from the exponents and
    in a second step, the bit allocation table is calculated.
An exponent map, map(j,n), j=0, . . . , 9, n=0, . . . , 39, and a table of exponent index counters, cnt(j), j=0, . . . , 9, are calculated according to the following operations, with reference to FIG. 3, in which step S11 consists of resetting to zero the table of exponent index counters (cnt(j)=0, j=0, . . . , 9) and the loop index n over the 40 samples.
Next, for each of the 40 samples (loop on n=0, . . . , 39) and, for each sample, in an inner loop on i=0, . . . , 2:
    step S12 corresponds to the calculation of one of the three exponent indices: iexp_n = exp(n)+i, i=0, 1, 2;
    step S13 corresponds to an update of the exponent map: map(iexp_n, cnt(iexp_n)) = n;
    step S14 corresponds to an incrementation of the table of exponent index counters: cnt(iexp_n) = cnt(iexp_n)+1.
At the end of this procedure, the table of exponent index counters cnt(j) indicates the number of samples with the same exponent index and the exponent map contains the indices of the samples using a given exponent index.
Thus, cnt(6) is the number of samples which could have 6 as the exponent index, i.e. the number of samples with exponents equal to 6 (i=0), 5 (i=1), or 4 (i=2), map(6,j), j=0, . . . , cnt(6)−1, then containing the indices of these cnt(6) samples.
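Steps S11 to S14 can be sketched as follows. This is a minimal sketch, not the reference implementation; the function name `build_exponent_map` and the use of nested Python lists for map(j,n) are assumptions made for the illustration.

```python
def build_exponent_map(exps: list[int]) -> tuple[list[int], list[list[int]]]:
    """Build cnt(j) and map(j, n) from the exponents exp(n), each in 0..7."""
    n_idx = 10                                    # exponent indices 0..9 (= 7 + 2)
    cnt = [0] * n_idx                             # S11: reset the counters
    emap = [[0] * len(exps) for _ in range(n_idx)]
    for n, e in enumerate(exps):                  # loop on the samples
        for i in range(3):                        # loop on i = 0, 1, 2
            iexp = e + i                          # S12: exponent index
            emap[iexp][cnt[iexp]] = n             # S13: record sample n
            cnt[iexp] += 1                        # S14: one more sample for iexp
    return cnt, emap


exps = [6, 5, 4, 7] + [0] * 36
cnt, emap = build_exponent_map(exps)
print(cnt[6], emap[6][:cnt[6]])                   # 3 [0, 1, 2]
```

In this example, cnt(6) counts the samples with exponents 6, 5 and 4, as in the case described above.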
The bit allocation table, b(n), n=0, . . . , 39, is then calculated as follows, with reference to FIG. 4, in which step S21 consists of resetting:
    the bit allocation table, b(n), to zero: b(n)=0, n=0, . . . , 39,
    the number of bits to be allocated, Nb, to 80: Nb=80,
    the exponent index, iexp, to its maximum value (here, 9=7+2): iexp=9.
After this first resetting step, subsequent steps S22 to S25 are iterated until all the bits are allocated:
    S22: Comparison of the number of bits to be allocated Nb with the exponent index counter, cnt(iexp), to determine the smaller of the two: min(cnt(iexp), Nb).
    S23: Incrementation by 1 of the bit allocation of the first min(cnt(iexp), Nb) samples whose indices correspond to the exponent index iexp in the exponent map: b(n) = b(n)+1, for all n = map(iexp, j), j=0, . . . , min(cnt(iexp), Nb)−1.
    S24: Update of the number of bits to be allocated: Nb = Nb − min(cnt(iexp), Nb).
    S25: Test to verify whether all the bits have been allocated (Nb=0).
        If Nb>0, decrementation by 1 of the exponent index (iexp = iexp−1) and passing to the next iteration by going to step S22.
        Otherwise (Nb=0), all the bits have been allocated and the calculation of the bit allocation table is finished.
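The iterative allocation of steps S21 to S25 can be sketched as follows. This is a sketch only: the function name `allocate_bits` is hypothetical, the exponent map is rebuilt inside the function so that the example is self-contained, and the guard `iexp >= 0` is an added safety check that the procedure described above does not need (the map holds 120 entries for 80 bits).

```python
def allocate_bits(exps: list[int], total_bits: int = 80) -> list[int]:
    """Return b(n), the number of enhancement bits (0..3) of each sample."""
    # First step: exponent map and counters, rebuilt here for self-containment.
    cnt = [0] * 10
    emap: list[list[int]] = [[] for _ in range(10)]
    for n, e in enumerate(exps):
        for i in range(3):
            emap[e + i].append(n)
            cnt[e + i] += 1

    b = [0] * len(exps)                # S21: reset b(n), Nb and iexp
    nb = total_bits
    iexp = 9
    while nb > 0 and iexp >= 0:        # S25: stop when every bit is allocated
        k = min(cnt[iexp], nb)         # S22
        for j in range(k):             # S23: one more bit for the first k samples
            b[emap[iexp][j]] += 1
        nb -= k                        # S24
        iexp -= 1                      # next (lower) exponent index
    return b


print(allocate_bits([7] * 10 + [0] * 30))
```

With ten samples of exponent 7 followed by thirty samples of exponent 0, the ten large samples receive 3 bits each, and the remaining 50 bits are spread over the small samples in index order.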
These procedures are described in particular in documents EP-2187387 and EP-2202728.
The bit allocation table, b(n), n=0, . . . , 39, gives for each sample the number of most significant bits of the enhancement signal to be sent in the extension layer. The enhancement codes are thus extracted, then sequentially multiplexed in the bitstream of the enhancement layer. Here, the 3 bits following the 4 bits of the mantissa are saved, rather than just 2 bits as in the G.711 hierarchical coder with fixed bit allocation. Then, after calculating the adaptive bit allocation table, the b-bit extension signal (b=0, 1, 2 or 3) is extracted by retaining only the b most significant bits. For this, depending on the number b of bits allocated, a right shift of 3−b bits is performed.
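The extraction of the b-bit code and its sequential multiplexing can be sketched as below. The names `enhancement_code` and `multiplex` are hypothetical, and the bit order inside the bitstream (most significant bit first, zero padding of the last byte) is an assumption of this sketch, not something stated above.

```python
def enhancement_code(mag: int, exp: int, b: int) -> int:
    """b most significant bits of the 3 bits that follow the mantissa."""
    lost = 4 if exp == 0 else 3 + exp     # bits dropped by the core layer
    ext3 = (mag >> (lost - 3)) & 0x07     # the 3 saved bits
    return ext3 >> (3 - b)                # right shift of 3 - b bits


def multiplex(codes: list[int], alloc: list[int]) -> bytes:
    """Append each b-bit code in sample order (assumed MSB-first bit order)."""
    acc, nbits = 0, 0
    for code, b in zip(codes, alloc):
        acc = (acc << b) | code           # b may be 0: nothing is appended
        nbits += b
    acc <<= (-nbits) % 8                  # pad the last byte with zeros
    return acc.to_bytes((nbits + 7) // 8, "big")


print(enhancement_code(0b11011100000, 3, 2))            # 2
print(multiplex([0b101, 0b1, 0, 0b11], [3, 1, 0, 2]))   # b'\xbc'
```

Since the codes have variable lengths, a byte of the bitstream no longer maps to a fixed group of samples.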
In comparison with multiplexing of the enhancement signal with fixed bit allocation, multiplexing with adaptive bit allocation is more complex. Whereas, in the case of fixed bit allocation with 2 enhancement bits per sample, the composition of the bitstream of this enhancement layer in bytes of 8 bits is simple, this is not the case for dynamic allocation.
With fixed allocation of 2 bits per sample, each of the 10 bytes of the enhancement layer is constituted by the 2 enhancement bits of 4 consecutive samples. Thus, the 8 bits (b7b6b5b4b3b2b1b0) of the first byte are:
    the 2 enhancement bits of sample 0 (b7b6),
    followed by the 2 enhancement bits of sample 1 (b5b4),
    then by the 2 enhancement bits of sample 2 (b3b2),
    and finally by the 2 enhancement bits of sample 3 (b1b0).
More generally, the 8 bits (b7b6b5b4b3b2b1b0) of the ith byte (i=0, . . . , 9) are:
    the 2 enhancement bits of sample 4i (b7b6),
    followed by the 2 enhancement bits of sample 4i+1 (b5b4),
    then by the 2 enhancement bits of sample 4i+2 (b3b2),
    and finally by the 2 enhancement bits of sample 4i+3 (b1b0).
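The fixed byte layout above can be sketched as follows; the function names `pack_fixed` and `unpack_fixed` are hypothetical.

```python
def pack_fixed(ext: list[int]) -> bytes:
    """Pack 2-bit enhancement codes, 4 consecutive samples per byte."""
    assert len(ext) % 4 == 0
    out = bytearray()
    for i in range(0, len(ext), 4):
        s0, s1, s2, s3 = ext[i:i + 4]
        # sample 4i in b7b6, 4i+1 in b5b4, 4i+2 in b3b2, 4i+3 in b1b0
        out.append((s0 << 6) | (s1 << 4) | (s2 << 2) | s3)
    return bytes(out)


def unpack_fixed(data: bytes) -> list[int]:
    """Inverse operation: recover the 2-bit codes in sample order."""
    return [(byte >> shift) & 0x03 for byte in data for shift in (6, 4, 2, 0)]


print(pack_fixed([3, 0, 1, 2]).hex())    # c6
```

For a frame of 40 samples, `pack_fixed` returns the 10 bytes of the enhancement layer, and every byte boundary coincides with a sample boundary.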
In decoding, the bit allocation table is reconstituted according to the same principle described above, the exponent values being available to the encoder and to the decoder. Then, the enhancement signal is reconstituted from the bitstream of the enhancement layer which uses the bit allocation table.
However, there are disadvantages inherent in the current coding/decoding of the low-band enhancement layer of the G.711.1 coder.
In comparison with fixed allocation (typically 2 bits per sample), dynamic bit allocation makes it possible to allocate a number of enhancement bits dependent on the amplitudes of the samples to be coded. However, this adaptive allocation is markedly more complex than a fixed allocation. It requires more random access memory and also more calculations, without taking account of the number of instructions to be stored in read only memory.
For example, in the case of a G.711.1 codec where 80 bits are allocated to 40 samples and the number of bits allocated varies from 0 to 3 bits, dynamic bit allocation, in comparison with fixed bit allocation, requires the following tables to be stored:
    Bit allocation table: 40 words of 16 bits
    Exponent map table: 400 (=10×40) words of 16 bits
    Exponent counter table: 10 words of 16 bits
Thus, in comparison with fixed allocation in the case of G.711.1, dynamic allocation requires of the order of 450 additional words of memory (40+400+10).
As regards the complexity of calculation, calculation of the exponent map and the associated exponent counters table, as well as the iterative procedure for dynamic bit allocation, require complex addressing, numerous memory accesses and tests.
The initial resetting of the exponent counter table requires, for example, 10 memory accesses. A loop is then performed on the 40 samples, as follows:
    a) memory access to obtain iexp_n, the exponent of sample n;
    b) addressing in the exponent map to point to the address adr_map of map(iexp_n, 0);
    c) addressing in the counter table to point to the address adr_cnt of cnt(iexp_n): adr_cnt = cnt + iexp_n;
    d) then a loop is performed on 3 indices (i=0, 1, 2):
        i. reading (memory access) the value stored in memory at address adr_cnt: *adr_cnt;
        ii. addressing in the exponent map to point to the address adr_mapi of map(iexp_n, *adr_cnt): adr_mapi = adr_map + *adr_cnt;
        iii. writing (storage) n at this address: *adr_mapi = n;
        iv. incrementing by 1 the value of the exponent counter contained at address adr_cnt: *adr_cnt = *adr_cnt + 1;
        v. incrementing by 1 the address adr_cnt: adr_cnt++;
        vi. incrementing by 40 the address adr_map: adr_map = adr_map + 40.
Operations a), b), c) and d) are performed 40 times; the operations in the internal loop d) (operations i to vi) are performed 120 (=40×3) times.
Although the complexity of regular addressing (with a constant increment as in operations v and vi) is considered negligible, such is not the case for certain less regular addressing operations. Although the addressing in table cnt is not very costly (addition of step c)), the addressing in the exponent map table is relatively complex.
In particular, pointing to the address of an element map(j,n) in a two-dimensional table (of dimensions 10 and 40) requires calculation of 40j+n. If the addition of n is assigned a cost of “1”, the multiplication by 40 has a cost of “3”. In order to reduce the cost of this addressing, it is possible to store partial addresses, but at the cost of an increase in the size of the random-access memory.
Calculation of the bit allocation table is also complex, in that resetting the bit allocation table to zero and the number of bits to be allocated to 80 requires 41 memory accesses. Addressing in the exponent counter tables and in the exponent map is also reset to point respectively to addresses adr_cnt of cnt(9) and adr_map of map(9,0) (adr_cnt=cnt+9, adr_map=map+40*9).
Then, a loop on 10 exponent index values (iexp decrementing from 9 to 0) executes the following operations:
    e) Test to determine whether the number of bits remaining to be allocated Nb is smaller than the current value of the exponent counter, *adr_cnt (=cnt(iexp)):
        if so, the variable Nb is stored in the variable bit_cnt;
        if not, it is the value *adr_cnt which is stored.
    This storage is counted as a memory access.
    f) Resetting addressing in the exponent map to point to address adr_mapi of map(iexp, 0): adr_mapi = adr_map.
    g) Then a loop on bit_cnt indices (i=0, 1, . . . , bit_cnt−1) is performed to increment by 1 bit the bit allocation of each of the first bit_cnt samples, the indices of which are stored from map(iexp, 0) to map(iexp, bit_cnt−1). In this internal loop, the following operations are performed:
        reading (memory access) index n of the sample stored in memory at address adr_mapi (i.e. the address of the element map(iexp, i)): n = *adr_mapi;
        addressing in the bit allocation table to point to address adr_b of the element b(n): adr_b = b + n;
        increment by 1 (addition) of the bit allocation of sample n and storage of the incremented value (memory access): *adr_b = *adr_b + 1 (b(n) = b(n)+1);
        increment by 1 of the address in the exponent map: adr_mapi++ (adr_mapi now points to the address of the element map(iexp, i+1)).
    h) Updating the number of bits remaining to be allocated (subtraction): Nb = Nb − bit_cnt.
    i) Test to determine whether all the bits have been allocated:
        If so (Nb=0), leave the loop on iexp: bit allocation is finished.
        If not (Nb>0), decrement by 40 the address adr_map (adr_map = adr_map − 40) to point to the address of the element map(iexp−1, 0) and perform steps e) to i) for the next exponent index (iexp = iexp−1).
The number of times the external loop is performed (operations e to i) depends on the exponent map (and thus on the distribution of the exponents of samples to be coded). The variability of this number makes the loop more complex. If the number of repetitions of a loop is not known before entering the loop, it is necessary at the end of each iteration to perform a test (step i). The number of times the internal loop operations are conducted is equal to the total number of bits to be allocated (80 in the case of G.711.1). In particular, it is still the case that these operations consist of allocating 1 bit at a time.
As regards multiplexing, as mentioned above, with adaptive bit allocation, multiplexing of the bits in the enhancement layer in the bitstream is far more complex than with fixed bit allocation.
With dynamic allocation, the composition of the bitstream is not regular: the number of samples whose bits make up a byte varies, and the enhancement bits of one and the same sample can be spread over different bytes. In comparison with fixed multiplexing, adaptive multiplexing is far more complex and requires, in the above example, 40 loop tests (“IF” type) and 80 supplementary additions/subtractions. The final composition of the 10 bytes also requires 20 subtractions and 20 bit shifts.
Thus, although the G.711 coding technique has been enhanced in quality by the introduction of a hierarchical extension with dynamic bit allocation, this dynamic allocation requires a large number of operations and far more memory. An objective of ITU-T standardization of the hierarchical extension to G.711 is to achieve low complexity.