As is conventional, the terms “telephone band” and “narrowband” refer to the frequency band from 300 hertz (Hz) to 3400 Hz, and the term “wideband” is reserved for the band from 50 Hz to 7000 Hz.
Today there are many techniques for converting an audio-frequency (speech and/or audio) signal into a digital signal and for processing signals digitized in this way.
The most widely used techniques are “waveform coding” methods such as PCM or ADPCM coding, “parametric coding by analysis by synthesis” methods such as CELP (code excited linear prediction) coding, and “perceptual coding in sub-bands or by transforms” methods. Narrowband CELP coding generally employs post-processing to enhance quality. This post-processing typically comprises adaptive post-filtering and high-pass filtering. The standard techniques for coding audio-frequency signals are described, for example, in “Speech Coding and Synthesis”, W. B. Kleijn and K. K. Paliwal editors, Elsevier, 1995. Only the techniques used in bidirectional transmission of audio-frequency signals are relevant here.
In conventional speech coding, the coder generates a fixed bitrate bit stream. This fixed bitrate constraint simplifies implementation and use of the coder and the decoder. Examples of such systems are G.711 coding at 64 kilobits per second (kbps) and G.729 coding at 8 kbps.
In certain applications, such as mobile telephony, voice over IP, or communication over ad hoc networks, it is preferable to generate a variable bitrate bit stream, the bitrate values being taken from a predefined set. There are various multirate coding techniques:
- multimode coding controlled by the source and/or the channel, as used in the AMR-NB, AMR-WB, SMV, or VMR-WB systems;
- hierarchical coding, also known as “scalable” coding, which generates a bit stream that is referred to as hierarchical because it comprises a core bitrate and one or more enhancement layers. The G.722 system at 48 kbps, 56 kbps, and 64 kbps is a simple example of bitrate-scalable coding. The MPEG-4 CELP codec is bitrate-scalable and bandwidth-scalable (see T. Nomura et al., A bitrate and bandwidth scalable CELP coder, ICASSP 1998);
- multiple description coding (see A. Gersho, J. D. Gibson, V. Cuperman, H. Dong, A multiple description speech coder based on AMR-WB for mobile ad hoc networks, ICASSP 2004).
In multirate coding, it is necessary to be sure that switching from one coding bitrate to another does not generate errors or artifacts.
Bitrate switching is simple if coding at all bitrates is based on the same coding model representing an audio signal in the same bandwidth. For example, in the AMR-NB system, the signal is defined in the telephone band (300 Hz-3400 Hz) and coding relies on the ACELP (algebraic code excited linear prediction) model, except for the generation of comfort noise, which is nevertheless handled by an LPC (linear predictive coding) type model compatible with the ACELP model. Note that AMR-NB coding conventionally uses post-processing in the form of adaptive post-filtering and high-pass filtering, the adaptive post-filtering coefficients depending on the decoding bitrate. Nevertheless, no precautions are taken to manage any problems linked to the use of post-processing parameters that vary with the bitrate. In contrast, wideband CELP coding of AMR-WB type uses no post-processing, essentially for reasons of complexity.
Bitrate switching is even more problematic in bitrate-scalable and bandwidth-scalable audio coding. Coding is then based on models and bandwidths that differ according to the bitrate.
The basic concept of hierarchical audio coding is illustrated, for example, in the paper by Y. Hiwasaki, T. Mori, H. Ohmuro, J. Ikedo, D. Tokumoto, and A. Kataoka, Scalable Speech Coding Technology for High-Quality Ubiquitous Communications, NTT Technical Review, March 2004. In that type of coding, the bit stream comprises a base layer and one or more enhancement layers. The base layer is generated by a fixed low-bitrate codec called the “core codec”, guaranteeing the minimum coding quality. That layer must be received by the decoder to maintain an acceptable quality level. The enhancement layers are used to enhance quality. Although they are all sent by the coder, they may not all be received by the decoder. The main benefit of hierarchical coding is that it allows adaptation of the bitrate simply by truncating the bit stream. The number of layers, i.e. the number of possible truncations of the bit stream, defines the granularity of the coding. Coding is said to be of strong granularity if the bit stream comprises few layers, of the order of two to four, whereas fine granularity coding allows bitrate increments of the order of 1 kbps.
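The truncation principle described above can be sketched as follows. This is only an illustrative sketch, not the method of any cited codec: the 20 ms frame duration, the cumulative bitrates of 8, 16, 24, and 32 kbps, the byte-aligned layer packing, and the function names `layer_sizes_bytes` and `truncate_frame` are all assumptions made for the example.

```python
from typing import List, Sequence

FRAME_MS = 20  # hypothetical frame duration in milliseconds


def layer_sizes_bytes(cumulative_kbps: Sequence[float],
                      frame_ms: int = FRAME_MS) -> List[int]:
    """Per-frame size in bytes of each layer, given cumulative bitrates.

    cumulative_kbps[0] is the core bitrate; each further entry is the total
    bitrate once one more enhancement layer is included.
    Assumes every layer is a whole number of bytes (hypothetical packing).
    """
    sizes = []
    previous_bits = 0
    for rate in cumulative_kbps:
        total_bits = round(rate * frame_ms)  # kbps * ms = bits per frame
        sizes.append((total_bits - previous_bits) // 8)
        previous_bits = total_bits
    return sizes


def truncate_frame(frame: bytes, sizes: Sequence[int], max_kbps: float,
                   frame_ms: int = FRAME_MS) -> bytes:
    """Keep the core layer plus as many whole enhancement layers as fit."""
    budget = int(max_kbps * frame_ms) // 8  # bytes available per frame
    kept = sizes[0]
    if budget < kept:
        # The core layer must be received to maintain acceptable quality.
        raise ValueError("available bitrate is below the core bitrate")
    for size in sizes[1:]:
        if kept + size > budget:
            break  # layers are ordered, so stop at the first that does not fit
        kept += size
    return frame[:kept]


# Hypothetical four-layer stream: core at 8 kbps, then 16, 24, and 32 kbps.
sizes = layer_sizes_bytes([8, 16, 24, 32])
full_frame = bytes(sum(sizes))          # placeholder payload for one frame
reduced = truncate_frame(full_frame, sizes, max_kbps=20)
print(sizes, len(full_frame), len(reduced))
```

With these assumed values, a 20 kbps channel keeps the core and one enhancement layer. The adaptation needs no re-encoding, only dropping the tail of each frame, which is why the number of layers determines the granularity.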
Of greater interest here are hierarchical coding techniques that are bitrate-scalable and bandwidth-scalable with a telephone band CELP type core coder and one or more wideband enhancement layers. Examples of such systems are given in H. Taddéi et al., A Scalable Three Bitrate (8, 14.2 and 24 kbps) Audio Coder, 107th Convention AES, 1999, with a strong granularity of 8, 14.2, and 24 kbps, and in B. Kovesi, D. Massaloux, A. Sollaud, A scalable speech and audio coding scheme with continuous bitrate flexibility, ICASSP 2004, with fine granularity from 6.4 to 32 kbps. MPEG-4 CELP coding is a further example.
Of the most pertinent references linked to the problem of bitrate switching in the context of bitrate-scalable and bandwidth-scalable audio coding, mention can be made of the international applications WO 01/48931 and WO 02/060075.
However, the techniques described in the above two documents deal only with problems of interworking between communications networks using telephone band and wideband coding.
In particular, international application WO 02/060075 describes an optimized decimation system for conversion from the wideband to the telephone band.
The method proposed in international application WO 01/48931 is a band extension technique that generates a pseudo-wideband signal from the telephone band signal, in particular by extracting a “spectral profile”. Similar known techniques of the prior art mainly address problems linked to switching from wideband to the telephone band: they seek to avoid band reduction by using a band extension technique that generates a wideband signal from the received telephone band signal without transmitting any information for that purpose. Note that those methods do not really seek to control the transition between bandwidths. They also have the drawback of relying on band extension techniques of highly variable quality, and therefore cannot guarantee stable output quality.