1. Technical Field
The present invention relates to decoding and encoding audio signals to reduce musical noise in audio signals and music signals (hereinafter referred to as audio signals and so forth).
2. Description of the Related Art
Audio encoding technology that compresses audio signals at a low bitrate is important for the efficient use of radio waves and the like in mobile communication. Further, demand for higher quality in phone call audio has grown in recent years, and there is a desire for a call service that has a real-life sensation. This can be realized by encoding audio signals and so forth of a wide frequency band at a high bitrate. However, this approach conflicts with the efficient use of radio waves and frequency bands.
As a method of encoding signals of a wide frequency band with high quality at a low bitrate, there is a technology where the spectrum of input signals is divided into the two spectrums of a low-band portion and a high-band portion, with the high-band portion being substituted by a duplicate of the low-band portion. That is to say, the overall bitrate is reduced by substituting a duplicate of the low-band portion for the high-band portion (Japanese Unexamined Patent Application Publication (Translation of PCT Application) No. 2001-521648).
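The substitution described above can be illustrated with a minimal sketch. It is not taken from the cited publication: the function name `replicate_low_band`, the cyclic tiling of the low band, and the single scalar `gain` are assumptions made only for illustration.

```python
def replicate_low_band(low_band, high_band_len, gain=1.0):
    """Fill a high band with scaled copies of the low-band coefficients.

    Hypothetical helper: the tiling rule and scalar gain are assumptions.
    """
    if not low_band:
        raise ValueError("low_band must not be empty")
    # Repeat the low band cyclically until the high band is filled.
    return [gain * low_band[i % len(low_band)] for i in range(high_band_len)]


# Only the low band (plus a gain) would need to be transmitted; the
# high band is regenerated from it.
low = [0.9, -0.4, 0.2, 0.1]
high = replicate_low_band(low, high_band_len=6, gain=0.5)
full_spectrum = low + high
```

In this sketch the decoder needs no high-band coefficients at all, which is where the bitrate saving comes from.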
Building on this technology, there is a technology in which, in light of the fact that the high-band spectrum has less deviation than the low-band spectrum, the low-band spectrum is normalized (smoothed) for each sub-band, after which its correlation with the high-band spectrum is obtained. Accordingly, the sound quality deterioration that results from copying a low-band spectrum having strong peak features can be prevented. However, this technology has a shortcoming: when the low-band spectrum is expressed as a discrete pulse stream, the envelope estimated from the discrete pulse stream is entirely different from the envelope of the original input signals. Accordingly, an alternative normalization method has been proposed, in which each sub-band is normalized by the maximum amplitude value of its discrete pulses (International Publication No. 2013/035257).
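Normalization by the maximum amplitude value at each sub-band can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of International Publication No. 2013/035257: the function name, the fixed `subband_size`, and the handling of all-zero sub-bands are hypothetical.

```python
def normalize_by_max_amplitude(spectrum, subband_size):
    """Divide each sub-band by its largest absolute sample value."""
    if subband_size <= 0:
        raise ValueError("subband_size must be positive")
    normalized = []
    for start in range(0, len(spectrum), subband_size):
        band = spectrum[start:start + subband_size]
        peak = max(abs(x) for x in band)
        if peak == 0:
            # Assumption: an all-zero sub-band is left as zeros.
            normalized.extend(0.0 for _ in band)
        else:
            normalized.extend(x / peak for x in band)
    return normalized


# A sparse spectrum: one large pulse and one small pulse.
sparse = [0.0, 8.0, 0.0, 0.0, 0.0, 0.0, -0.5, 0.0]
flat = normalize_by_max_amplitude(sparse, subband_size=4)
```

After normalization both pulses have magnitude 1, so the very large pulse no longer dominates the spectrum.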
FIG. 11 illustrates the encoding device according to International Publication No. 2013/035257. In this encoding device, input signals are converted into frequency-domain signals by a time-frequency converter 1010 and output as an input signal spectrum, and the low-band portion of the input signal spectrum is encoded at a core encoding unit 1020 and output as core encoded data. The core encoded data is then decoded to generate a core encoded low-band spectrum, which is normalized by the maximum value of the amplitude at a sub-band amplitude normalization unit 1030 to generate a normalized low-band spectrum. The band of the high-band portion where the correlation value as to the normalized low-band spectrum is greatest, and the gain between the normalized low-band spectrum at this band and the high-band portion of the input spectrum, are obtained, and these are encoded at an extended band encoding unit 1060 and output as extended band encoded data.
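The search for the band of greatest correlation and the accompanying gain can be sketched as follows. The source does not state the correlation measure, the search direction, or the gain formula, so this sketch assumes a normalized cross-correlation, a search that slides a window over the normalized low-band spectrum, and an energy-ratio gain. The function name `search_best_band` is hypothetical.

```python
import math


def search_best_band(norm_low, high_target):
    """Return (lag, gain) of the low-band segment best matching high_target.

    Assumptions: normalized cross-correlation and an energy-ratio gain.
    """
    n = len(high_target)
    target_energy = sum(x * x for x in high_target)
    best_lag, best_corr, best_gain = None, -math.inf, 0.0
    for lag in range(len(norm_low) - n + 1):
        segment = norm_low[lag:lag + n]
        seg_energy = sum(x * x for x in segment)
        if seg_energy == 0 or target_energy == 0:
            continue  # correlation is undefined for a silent segment
        cross = sum(a * b for a, b in zip(segment, high_target))
        corr = cross / math.sqrt(seg_energy * target_energy)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
            best_gain = math.sqrt(target_energy / seg_energy)
    return best_lag, best_gain


norm_low = [0.0, 1.0, 0.0, 0.5, 1.0, 0.0]
high_target = [0.1, 0.2, 0.0]
lag, gain = search_best_band(norm_low, high_target)
```

Here the segment starting at index 3 has the same shape as the target, so it is selected, and only the pair (lag, gain) would need to be encoded as extended band data.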
FIG. 12 illustrates a decoding device corresponding to this. The encoded data is divided into core encoded data and extended band encoded data at a separating unit 2010, the core encoded data is decoded at a core decoding unit 2020, and a core encoded low-band spectrum is generated. The core encoded low-band spectrum is subjected to the same processing as at the encoding device side, which is normalization by the largest value of the sample amplitude, thereby generating normalized low-band spectrum data. The normalized low-band spectrum data is then used to decode the extended band encoded data by an extended band decoding unit 2040, thereby generating the extended band spectrum.
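The decoder-side use of the normalized low-band spectrum can be sketched as below. The source does not specify what the extended band encoded data contains. This sketch assumes it carries a band position (`lag`) and a `gain`, and the function name `decode_extended_band` is hypothetical.

```python
def decode_extended_band(norm_low, lag, gain, band_len):
    """Rebuild an extended band from a segment of the normalized low band.

    Assumption: the extended band encoded data carries `lag` and `gain`.
    """
    if lag < 0 or lag + band_len > len(norm_low):
        raise ValueError("segment lies outside the normalized low band")
    # Copy the selected segment and rescale it to the high-band level.
    return [gain * x for x in norm_low[lag:lag + band_len]]


norm_low = [0.0, 1.0, 0.0, 0.5, 1.0, 0.0]
extended = decode_extended_band(norm_low, lag=3, gain=0.2, band_len=3)
```

Because the decoder repeats the same normalization as the encoder, both sides index into an identical normalized low-band spectrum, and the transmitted position refers to the same samples on each side.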
Also disclosed is technology where switching is performed between the sub-band amplitude normalization unit 1030 that performs normalization at the largest value of the sample, and a spectrum envelope normalization unit 7020 that normalizes the envelope of the spectral power of the sample, in accordance with the intensity of the peak features, as illustrated in FIG. 13.
The technology of normalization at the largest value of the sample, described in International Publication No. 2013/035257, is effective in a case where the low-band spectrum is sparse, i.e., in a case where the amplitude value of just part of the samples is large and the amplitude value of the other samples is almost zero. That is to say, the technology according to International Publication No. 2013/035257 suppresses spectrums with extremely large amplitude from being generated even for sparse spectrums (homogenizing), and can yield normalized low-band spectrums with flat features (smoothing).