Voice coding and audio coding techniques, which compress a voice signal or an audio signal at a low bit rate, are important for making effective use of transmission path capacity, such as that of radio waves in mobile communication, and of recording media.
Voice coding schemes for coding a voice signal include G.726 and G.729, standardized by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector). These schemes target narrowband signals (300 Hz to 3.4 kHz) and can perform high-quality coding at 8 kbits/s to 32 kbits/s. However, because the frequency band of such a narrowband signal extends only up to 3.4 kHz, the sound is muffled and lacks a sense of realism.
On the other hand, the field of voice coding also includes schemes that target wideband signals (50 Hz to 7 kHz). Typical examples include G.722 and G.722.1 of the ITU-T and AMR-WB of the 3GPP (The 3rd Generation Partnership Project). These schemes can code a wideband voice signal at a bit rate of 6.6 kbits/s to 64 kbits/s. When the signal to be coded is voice, a wideband signal provides relatively high quality, but this quality is not sufficient when the target is an audio signal or when a voice signal with a high sense of realism is required.
Generally, when the maximum frequency of a signal is approximately 10 to 15 kHz, a sense of realism equivalent to that of FM radio is obtained, and when the maximum frequency is on the order of 20 kHz, quality comparable to that of a CD is obtained. Audio coding, represented by the Layer 3 scheme and the AAC scheme standardized by MPEG (Moving Picture Experts Group), is suitable for such signals. However, with these audio coding schemes, the bit rate increases because the frequency band to be coded is wider.
As a method of coding a wideband signal at a low bit rate and with high quality, the National Publication of International Patent Application No. 2001-521648 describes a technique that reduces the overall bit rate by dividing an input signal into a low-frequency band and a high-frequency band and substituting a low-frequency band spectrum for the high-frequency band. The state of processing under this conventional technique will be explained using FIGS. 1A to 1D. To simplify the explanation, the technique is applied here directly to an original signal. In FIGS. 1A to 1D, the horizontal axis shows frequency and the vertical axis shows the logarithmic power spectrum. FIG. 1A shows the logarithmic power spectrum of the original signal when its frequency band is limited to 0≦k<FH. FIG. 1B shows the logarithmic power spectrum when the band of the same signal is limited to 0≦k<FL (FL<FH). FIG. 1C shows the case where the spectrum in the high-frequency band is substituted by the spectrum in the low-frequency band using the conventional technique. FIG. 1D shows the case where the substituted spectrum is reshaped according to spectral outline information. According to the conventional technique, the spectrum of the original signal (FIG. 1A) is expressed based on a signal having a spectrum of 0≦k<FL (FIG. 1B), and therefore the spectrum of the high-frequency band (FL≦k<FH in this figure) is substituted by the spectrum of the low-frequency band (0≦k<FL) (FIG. 1C).
For simplicity, the explanation assumes the relationship FL=FH/2. Next, the amplitude of the substituted spectrum in the high-frequency band is adjusted according to the spectral envelope information of the original signal, and an estimate of the spectrum of the original signal is thereby obtained (FIG. 1D).
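The two steps of the conventional technique (substitution as in FIG. 1C, then amplitude adjustment as in FIG. 1D) can be illustrated with a minimal Python sketch. It is not taken from the cited publication: the function name `estimate_spectrum`, the use of linear magnitudes rather than logarithmic power, and the representation of the spectral envelope information as one target mean power per subband (`band_powers`) are all assumptions made for illustration only.

```python
import math


def estimate_spectrum(low_spectrum, band_powers, fh):
    """Estimate a spectrum for 0 <= k < FH from a band-limited one.

    low_spectrum: linear magnitudes for 0 <= k < FL (FL = len(low_spectrum)).
    band_powers:  assumed form of the envelope information, one target
                  mean power per subband of the range FL <= k < FH.
    fh:           upper band limit FH, with FL < FH.
    """
    fl = len(low_spectrum)
    if not 0 < fl < fh:
        raise ValueError("require 0 < FL < FH")

    # Step 1 (FIG. 1C): substitute the high band FL <= k < FH with the
    # low-band spectrum, repeating it if the high band is longer.
    high = [low_spectrum[(k - fl) % fl] for k in range(fl, fh)]

    n_bands = len(band_powers)
    if not 0 < n_bands <= len(high):
        raise ValueError("need between 1 and FH - FL subbands")

    # Step 2 (FIG. 1D): scale each subband so that its mean power
    # matches the target given by the envelope information.
    shaped = []
    for b, target in enumerate(band_powers):
        start = b * len(high) // n_bands
        end = (b + 1) * len(high) // n_bands
        band = high[start:end]
        power = sum(x * x for x in band) / len(band)
        gain = math.sqrt(target / power) if power > 0 else 0.0
        shaped.extend(gain * x for x in band)

    return list(low_spectrum) + shaped


if __name__ == "__main__":
    # FL = 4 and FH = 8, matching the simplifying assumption FL = FH/2.
    low = [4.0, 2.0, 1.0, 1.0]
    print(estimate_spectrum(low, band_powers=[1.0, 0.25], fh=8))
```

In this sketch the low band is passed through unchanged, and only one gain per subband has to be conveyed for the high band, which is why the approach keeps the overall bit rate low.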