In the field of compression, coders exploit properties of the signal such as its harmonic structure, utilized by long-term prediction filters, and its local stationarity, utilized by short-term prediction filters. Typically, a speech signal can be considered stationary over time intervals of, for example, 10 to 20 ms. It is therefore possible to analyze this signal in blocks of samples called frames, after appropriate windowing. The short-term correlations can be modeled by time-varying linear filters whose coefficients are obtained by linear predictive analysis on frames of short duration (10 to 20 ms in the aforementioned example).
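By way of illustration only, the frame-based analysis described above can be sketched as follows. The function name, the choice of a Hamming window and the absence of overlap between frames are assumptions made for this sketch; actual coders generally use overlapping and often asymmetric analysis windows.

```python
import math


def split_into_windowed_frames(signal, sample_rate_hz, frame_ms=20.0):
    """Cut a signal into consecutive frames and apply a Hamming window.

    Assumptions of this sketch: frames do not overlap, trailing samples
    that do not fill a whole frame are dropped, and the window is a
    symmetric Hamming window.
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frames.append([signal[start + n] * window[n]
                       for n in range(frame_len)])
    return frames
```

At a sampling frequency of 16 kHz, a 20 ms frame contains 320 samples, so a 1000-sample signal yields three complete frames.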
Linear predictive coding (LPC) is one of the most widely used digital coding techniques, in particular in the mobile telephony sector, notably in the 3GPP AMR-WB coder described in the document “3GPP TS 26.190 V10.0.0 (2011-03) 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Transcoding functions (Release 10)”. LPC coding consists in performing an LPC analysis of the signal to be coded so as to determine an LPC filter, then in quantizing this filter on the one hand, and in modeling and coding the excitation signal on the other hand. This LPC analysis is performed by minimizing the prediction error on the signal to be modeled or on a modified version of this signal. The autoregressive model of linear prediction of order P consists in determining a signal sample at an instant n through a linear combination of the P past samples (principle of prediction). The short-term prediction filter, denoted A(z), models the spectral envelope of the signal:
$$A(z) = 1 + \sum_{i=1}^{P} a_i z^{-i}$$
The difference between the signal $S(n)$ at the instant n and its predicted value $\tilde{S}(n)$ is the prediction error:
$$e(n) = S(n) - \tilde{S}(n) = S(n) + \sum_{i=1}^{P} a_i S(n-i)$$
The calculation of the prediction coefficients is performed by minimizing the energy E of the prediction error given by:
$$E = \sum_{n} e(n)^2 = \sum_{n} \left( S(n) + \sum_{i=1}^{P} a_i S(n-i) \right)^2$$
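Setting the partial derivatives of E with respect to each coefficient $a_i$ to zero yields a system of linear equations (the normal equations). The way to solve this system is well known, in particular with the Levinson-Durbin algorithm or the Schur algorithm.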
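Purely as an illustrative sketch, the Levinson-Durbin recursion can be written as follows for the sign convention used above, in which the prediction error is S(n) plus the weighted sum of past samples. The function names are hypothetical, the frame is assumed to have been windowed beforehand, and the sign of the reflection coefficients follows one common convention among several.

```python
def autocorrelation(frame, order):
    """Autocorrelation r[0..order] of one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for e(n) = S(n) + sum_i a_i S(n-i).

    Returns (a, k, err): a[0] = 1 and a[1..order] are the prediction
    coefficients, k holds the reflection (PARCOR) coefficients, and err
    is the residual prediction-error energy. r[0] must be non-zero.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for m in range(1, order + 1):
        # Correlation between the current residual and the sample m steps back.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k_m = -acc / err
        prev = a[:]
        # Order update of the coefficients found at order m - 1.
        for i in range(1, m):
            a[i] = prev[i] + k_m * prev[m - i]
        a[m] = k_m
        k.append(k_m)
        err *= 1.0 - k_m * k_m
    return a, k, err
```

For the autocorrelation sequence r = [1, 0.5, 0.25], which corresponds to a first-order autoregressive signal, the recursion at order 2 returns a = [1, -0.5, 0] and a residual energy of 0.75.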
The coefficients $a_i$ of the filter must be transmitted to the receiver. However, as these coefficients do not have good quantization properties, transformations are preferably used. Among the most common may be cited:
- the PARCOR coefficients (the abbreviation standing for “PARtial CORrelation”), consisting of reflection coefficients or coefficients of partial correlation,
- the Logarithmic Area Ratios (LAR) of the PARCOR coefficients,
- the Line Spectral Pairs (LSP).
The LSP coefficients are now the most widely used for the representation of the LPC filter since they lend themselves well to vector quantization.
Other equivalent representations of the LSP coefficients exist:
- the LSF coefficients (the abbreviation standing for “Line Spectral Frequencies”),
- the ISP coefficients (the abbreviation standing for “Immittance Spectral Pairs”),
- or else the ISF coefficients (the abbreviation standing for “Immittance Spectral Frequencies”).
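As a hedged illustration of how such representations relate to one another, the sketch below converts reflection coefficients to log area ratios and cosine-domain LSP values to line spectral frequencies in hertz. The function names are hypothetical, the LAR formula follows one common convention (the sign and the base of the logarithm vary between coders), and the LSP values are assumed to be expressed as cosines of the normalized angular frequencies.

```python
import math


def parcor_to_lar(reflection_coeffs):
    """Log area ratios, using the convention LAR = ln((1 + k) / (1 - k)).

    Each reflection coefficient k is assumed to satisfy |k| < 1.
    """
    return [math.log((1.0 + k) / (1.0 - k)) for k in reflection_coeffs]


def lsp_to_lsf_hz(lsp_cosines, sample_rate_hz):
    """Line spectral frequencies in Hz from cosine-domain LSP values.

    Assumes each LSP value q equals cos(omega), with omega the angular
    frequency in radians per sample, between 0 and pi.
    """
    return [math.acos(q) * sample_rate_hz / (2.0 * math.pi)
            for q in lsp_cosines]
```

Under these assumptions, an LSP value of 0 corresponds to one quarter of the sampling frequency, i.e. 4000 Hz for a signal sampled at 16 kHz.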
The LPC linear predictive coding technique allows a substantial reduction in bitrate while maintaining high audio playback quality. However, linear predictive coding lends itself poorly to certain applications for processing coded audio signals, such as the detection of a predetermined frequency band in such coded signals.
Such detection may turn out to be useful, or indeed necessary, given the currently growing multiplicity of audio compression formats.
Indeed, to offer mobility and continuity, modern and innovative multimedia communication services must be able to operate under a great variety of conditions. The dynamism of the multimedia communication sector and the heterogeneity of networks, access and terminals have brought about a proliferation of compression formats whose presence in the communication chains requires several codings either in cascade (transcoding), or in parallel (multi-format coding or multi-mode coding).
In addition to the linear predictive coding technique mentioned hereinabove, there exist other audio compression techniques for reducing bitrate while maintaining good quality, such as for example:
- the PCM (“Pulse Code Modulation”) techniques,
- and the frequency transform based techniques such as those of the MDCT type (the abbreviation standing for “Modified Discrete Cosine Transform”) or FFT type (the abbreviation standing for “Fast Fourier Transform”).
Certain coders combine various coding techniques. Thus in the document Combescure P., Schnitzler J., Fischer K., Kircherr R., Lamblin C., Le Guyader A., Massaloux D., Quinquis C., Stegmann J., Vary P., A 16, 24, 32 kbit/s wideband speech codec based on ATCELP, in IEEE International Conference on Acoustics, Speech, and Signal Processing, 1999 (ICASSP99), Page(s): 5-8 vol. 1, it is proposed to combine a frequency transform technique of MDCT type and a linear predictive coding technique of CELP type (the abbreviation standing for “Code Excited Linear Prediction”) to code wideband signals, the switch between the two technologies being controlled by classification of the signal.
Transcoding is necessary when in a transmission chain, a compressed signal frame emitted by a coder can no longer continue on its path, in this format. Transcoding makes it possible to convert this frame into another format compatible with the rest of the transmission chain. The most elementary solution (and the most common at the present time) is the end-to-end placement of a decoder and of a coder. The compressed frame arrives in a first format, and it is then decompressed. The decompressed signal is then compressed again into a second format accepted by the rest of the communication chain. This cascading of a decoder and of a coder is called a tandem.
In the particular case of a tandem, coders respectively coding different frequency bands can be placed in cascade. Thus, a coder operating in a wide frequency band [50 Hz-7 kHz], also called the WB band (the abbreviation standing for “WideBand”) may be required to code an audio content operating in a more restricted frequency band than the wideband. For example, the content to be coded by a 3GPP AMR-WB coder such as mentioned above, although sampled at 16 kHz, may in fact only be in telephone band if such a content has been coded previously by a coder operating in a narrow frequency band [300 Hz, 3400 Hz], also called the NB band (the abbreviation standing for “NarrowBand”). It may also happen that the limited quality of the acoustics of the emitter terminal does not make it possible to cover the whole of the wideband.
It is therefore apparent that the audio band of a stream coded by a coder operating on signals sampled at a given sampling frequency may be much more restricted than that actually supported by the coder.
Among the audio signal processing applications advantageously utilizing the knowledge of the audio frequency band of the content to be processed may be cited:
- audio signal classification,
- automatic speech recognition,
- Speech To Text (STT) conversion of radio or television transmissions containing narrowband passages,
- digital watermarking,
- non-intrusive analysis of streams by probes placed on the media plane in networks, thereby making it possible in particular to detect a change of band of the transported contents and optionally the duration of said contents in a given band, within the network subsequent to this change of band,
- the display on a mobile terminal of an “HD Voice” logo (the abbreviation standing for “High-Definition Voice”), such as approved by the GSMA in August 2011 for mobile terminals and networks and such as described in the document available at the Internet address: http://www.gsm.org/membership/industry_logos.htm,
- the indicator of the number of calls that have been left in wideband on mobile voice messaging.
Among the known schemes for detecting the frequency band of a digital audio signal, there are those operating in the (original or decoded) signal domain, and those operating in the coded domain.
The detection of the frequency band in the signal domain relies on a spectral analysis of the digital audio signal. By way of example, such detection is implemented in the 3GPP2 VMR-WB codec such as described in the document 3GPP2 C.S0052-0 (Jun. 11, 2004) “Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB) Service Option 62 for Spread Spectrum Systems”, in order to detect a narrowband audio content which has been oversampled at the sampling frequency of 16 kHz specific to this codec.
The aforementioned codec undertakes a spectral analysis of the temporal signal (after sub-sampling at 12.8 kHz, high-pass filtering and pre-emphasis) by performing two FFT frequency transforms on 256 samples per frame, to obtain two sets of spectral parameters per frame. The spectrum obtained by the FFT analysis is divided into 20 critical bands, the number of frequency bins in these 20 bands being $M_{CB}$ = {2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 6, 8, 9, 11, 14, 18, 21}. Next, the energy in each critical band is calculated, according to the formula:
$$E_{CB}(i) = \frac{1}{(L_{FFT}/2)^2 \, M_{CB}(i)} \sum_{k=0}^{M_{CB}(i)-1} \left( X_R^2(k + j_i) + X_I^2(k + j_i) \right), \quad i = 0, \ldots, 19$$
where the index $j_i$ is the index of the first bin of the band i,
$$j_i = \sum_{k=0}^{i-1} M_{CB}(k) + 1,$$
and $X_R(k)$ and $X_I(k)$ are the real and imaginary parts of the FFT spectrum.
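The per-band energy calculation above can be sketched as follows, by way of illustration only. The sketch assumes that the FFT length is 256, in line with the 256-sample transforms mentioned above, and uses a naive discrete Fourier transform as a stand-in for the FFT; the sub-sampling, filtering, pre-emphasis and windowing steps of the codec are deliberately left out, and the function names are hypothetical.

```python
import cmath
import math

L_FFT = 256  # assumed FFT length (256 samples per transform)
M_CB = [2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 6, 8, 9, 11, 14, 18, 21]


def half_spectrum(frame):
    """Naive DFT of one frame, bins 0 .. len(frame)/2 - 1 (FFT stand-in)."""
    n_samples = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(n_samples))
            for k in range(n_samples // 2)]


def critical_band_energies(spectrum):
    """Average energy per critical band, following the formula above."""
    energies = []
    j_i = 1  # first bin of band 0; bin 0 (the DC bin) is skipped
    for m in M_CB:
        total = sum(spectrum[j_i + k].real ** 2 + spectrum[j_i + k].imag ** 2
                    for k in range(m))
        energies.append(total / ((L_FFT / 2) ** 2 * m))
        j_i += m  # j_(i+1) = j_i + M_CB(i)
    return energies
```

With this bin layout the last band starts at bin 107 and ends at bin 127, so the 20 bands cover bins 1 to 127 of the half spectrum. A unit-amplitude cosine centered on bin 10 falls in band 4 (bins 9 and 10) and gives that band an energy of 0.5.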
In order to correctly process oversampled narrowband signals, an algorithm is applied to detect such signals. It consists in testing the smoothed energy level in the last two critical bands.
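A minimal sketch of such a test is given below. The smoothing rule (first-order recursive averaging), the smoothing factor and the threshold are placeholders chosen for illustration and are not taken from the codec specification; the function name is likewise hypothetical.

```python
def detect_narrowband(band_energies_per_frame, smoothing=0.9, threshold=1e-3):
    """Flag frames whose smoothed energy in the last two bands is low.

    band_energies_per_frame: one list of critical-band energies per frame.
    smoothing and threshold are illustrative placeholders, not values
    from any standard.
    Returns one boolean per frame (True means narrowband content).
    """
    smoothed = None
    decisions = []
    for energies in band_energies_per_frame:
        high = 0.5 * (energies[-2] + energies[-1])
        if smoothed is None:
            smoothed = high  # initialize on the first frame
        else:
            smoothed = smoothing * smoothed + (1.0 - smoothing) * high
        decisions.append(smoothed < threshold)
    return decisions
```

The smoothing keeps an isolated low-energy frame in an otherwise wideband passage from flipping the decision, at the cost of a slower reaction to a real change of band.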
As a variant of the aforementioned FFT transform, other frequency transforms can be used, such as for example the MDCT transform (the abbreviation standing for “Modified Discrete Cosine Transform”).
The detection of the frequency band in the coded domain can rely for its part on prior decoding of the coded signal and then on the application of the techniques of spectral analysis hereinabove such as used in the signal domain to analyze the original audio contents (uncoded or before coding). However, the decoding increases the complexity and the delay of the processing. In many applications, it is therefore desirable, in order to avoid these problems of complexity and/or of delay, to extract the characteristics of the signal without performing a complete decoding of the signal.
Several analysis techniques in the coded domain have been proposed. They relate to transform or sub-band based coders such as the MPEG coders (e.g. MP3, AAC, etc.).
In such coders, the coded stream does indeed comprise coded spectral coefficients, such as for example the MDCT coefficients in the MP3 coder. Thus in the document Liaoyu Chang, Xiaoqing Yu, Haiying Tan, Wanggen Wan, Research and Application of Audio Feature in Compressed Domain, IET Conference on Wireless, Mobile and Sensor Networks, 2007. (CCWMSN07), Page(s): 390-393, 2007, it is proposed, rather than to decode the entirety of the coded audio signal, to decode solely the MDCT coefficients which by themselves make it possible to determine the spectral characteristics of the coded signal. The bandwidth BW of the coded audio content is thus determined on the basis of these MDCT coefficients with the aid of the following expression:
$$BW = \max\{\, i \mid SMRS_i \geq T_{SRMS} \,\} - \min\{\, i \mid SMRS_i \leq T_{SRMS} \,\}$$
where $SMRS_i$ is the square root of the energy of the i-th band,
$$SMRS_i = \sqrt{\frac{1}{N_i} \sum_{j} S_{i,j}^2},$$
$S_{i,j}$ being the j-th coefficient of the i-th band and $N_i$ the number of coefficients in the i-th band, and $T_{SRMS}$ is a threshold.
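For illustration, the expression above can be evaluated as sketched below on already decoded coefficients grouped by band. The sketch transcribes the expression literally as quoted, including the direction of both inequalities; the function names and the handling of empty index sets are assumptions of this sketch.

```python
import math


def band_rms(coeffs_per_band):
    """Root mean square of the coefficients of each band."""
    return [math.sqrt(sum(c * c for c in band) / len(band))
            for band in coeffs_per_band]


def bandwidth_in_bands(rms_per_band, threshold):
    """Literal evaluation of the quoted bandwidth expression.

    Returns the highest band index whose RMS is at or above the threshold
    minus the lowest band index whose RMS is at or below it, or None when
    either index set is empty (a choice made for this sketch).
    """
    above = [i for i, v in enumerate(rms_per_band) if v >= threshold]
    below = [i for i, v in enumerate(rms_per_band) if v <= threshold]
    if not above or not below:
        return None
    return max(above) - min(below)
```

The result is expressed as a number of bands; converting it to hertz requires the band layout of the coder in question, which is not reproduced here.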
The schemes for detecting the frequency band of a digital audio signal which have just been described rely mainly on a frequency analysis of the spectrum of the signal. In the case where the audio content has been coded by a frequency transform, the detection of the audio frequency band in the coded content advantageously utilizes the spectral information contained in the coded binary stream while not completely decoding the signal. This noticeably reduces the complexity of the detection by eliminating the expensive operations required by the complete decoding and the spectral analysis (based on FFT or on MDCT) of the coded audio signal.
Now, though transform based compression technologies are very widespread in audio coding (high bitrates, high sampling frequency), such is not the case in speech coding, where the coding methods predominantly use linear predictive compression technologies such as described previously. These technologies nevertheless rely on a modeling of the spectral envelope of the signal by the linear-prediction coefficients of the short-term LPC filter and on the diverse transformations (e.g. LSP) used for the quantization.
A solution for determining the audio frequency band of a signal coded by a linear predictive coder consists in decoding the signal and then in applying to it a frequency band detection scheme in the signal domain, such as the one described hereinabove. However, such a solution turns out to be very expensive in terms of computational complexity, and therefore gives rise to undesired consumption of the resources of the central processing unit (CPU). This complexity stems from the application of the FFT or MDCT frequency transforms, which remain complex operations.
Moreover, although the decoded signal is available in some of the aforementioned audio signal processing applications that benefit from knowledge of the audio frequency band, for example the display of an “HD Voice” logo on a mobile terminal, such is not the case for all applications. Thus, for example, for the indicator of the number of calls that have been left in wideband on mobile voice messaging, the complexity of the decoding must be added to the complexity of the time-frequency transform and of the detection of the audio band on the basis of the energies per band. Now, in a coder such as the aforementioned AMR-WB coder, the decoding represents 20% of the coder's total complexity, itself estimated at around 40 WMOPS (the abbreviation standing for “Weighted Millions of Operations Per Second”), that is to say around 8 WMOPS for the decoding alone.
As indicated previously, certain coders combine linear predictive coding techniques with other compression techniques such as, for example, frequency transform based coding techniques of MDCT type. It would then be possible to perform the detection only on the audio signal blocks coded by a frequency transform technique, using a prior art scheme for these blocks. However, this solution would be detrimental to the responsiveness of the detection since, depending on the type of content and/or the bitrate, linear predictive coding may be used predominantly.