In recent years, music distribution services that distribute music data via the Internet or similar networks have become widely available. These services distribute, as music data, encoded data obtained by encoding a music signal. The mainstream techniques for encoding a music signal reduce the bit rate, and thereby limit the file size of the encoded data, so that downloading does not take much time.
Such music signal encoding techniques fall roughly into two types: encoding techniques such as MP3 (MPEG (Moving Picture Experts Group) Audio Layer 3) (International Standard ISO/IEC 11172-3), and encoding techniques such as HE-AAC (High Efficiency MPEG-4 AAC) (International Standard ISO/IEC 14496-3).
In the encoding technique typified by MP3, the signal components of a music signal in the high frequency band (hereinafter referred to as the highband) of about 15 kHz or above, which are barely perceptible to the human ear, are cut, and the remaining signal components in the low frequency band (hereinafter referred to as the lowband) are encoded. Such an encoding technique is hereinafter referred to as the highband-cutting encoding technique. This highband-cutting encoding technique makes it possible to limit the file size of the encoded data. However, humans can perceive sounds in the highband, albeit slightly. Consequently, when sound is generated and output from a decoded music signal obtained by decoding the encoded data, the sound quality is often degraded: the sense of realism of the original sound is lost, or the sound becomes muffled.
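As a rough illustration of the highband cutting described above, the following sketch zeroes every component of a one-sided spectrum at or above a cutoff frequency. The function name cut_highband, the spectrum layout (bins evenly spaced from 0 Hz to the Nyquist frequency), and the exact 15 kHz default are assumptions made for illustration only; actual encoders such as MP3 do not necessarily operate in this way.

```python
def cut_highband(spectrum, sample_rate_hz, cutoff_hz=15000.0):
    """Zero the bins of a one-sided spectrum at or above cutoff_hz.

    spectrum is assumed (for illustration) to hold bins evenly spaced
    from 0 Hz (index 0) to the Nyquist frequency (last index).
    """
    if len(spectrum) < 2:
        raise ValueError("spectrum needs at least two bins")
    # Frequency spacing between neighboring bins.
    bin_width_hz = (sample_rate_hz / 2.0) / (len(spectrum) - 1)
    # Keep lowband bins unchanged; replace highband bins with zero.
    return [
        value if index * bin_width_hz < cutoff_hz else 0.0
        for index, value in enumerate(spectrum)
    ]
```

With a 48 kHz sampling rate and five bins, for example, the bins lie at 0, 6, 12, 18, and 24 kHz, so only the first three survive the cut.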
In contrast, in the encoding technique typified by HE-AAC, characteristic information is extracted from the signal components in the highband and is encoded together with the signal components in the lowband. Such an encoding technique is hereinafter referred to as the highband-characteristics encoding technique. Since this highband-characteristics encoding technique encodes only the characteristic information as the information related to the signal components in the highband, rather than those signal components themselves, the encoding efficiency can be improved while suppressing degradation of sound quality.
In decoding encoded data encoded by this highband-characteristics encoding technique, the signal components in the lowband and characteristic information are decoded, and signal components in the highband are generated from the signal components in the lowband and the characteristic information that have been decoded. Hereinafter, the technique of extending the frequency band of the signal components in the lowband by generating the signal components in the highband from the signal components in the lowband in this way is referred to as band extension technique.
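The band extension described above can be sketched as follows, under strong simplifying assumptions. Here the decoded characteristic information is assumed to be one target power per highband subband, and each highband subband signal is produced by reusing a lowband subband signal and scaling it to that power. The names subband_power and extend_band, the cyclic choice of source subband, and the form of the characteristic information are all hypothetical; this is not the procedure defined by HE-AAC.

```python
import math

def subband_power(signal):
    """Mean-square power of one subband signal (a list of samples)."""
    return sum(x * x for x in signal) / len(signal)

def extend_band(lowband_subbands, target_highband_powers):
    """Generate highband subband signals from lowband subband signals.

    The position in the returned list stands for the highband subband
    index; the source subband is chosen cyclically (an assumption).
    """
    highband_subbands = []
    for index, target_power in enumerate(target_highband_powers):
        source = lowband_subbands[index % len(lowband_subbands)]
        source_power = subband_power(source)
        # Gain that brings the copied signal to the target power.
        gain = math.sqrt(target_power / source_power) if source_power > 0.0 else 0.0
        highband_subbands.append([gain * x for x in source])
    return highband_subbands
```

The lowband signals supply the fine structure of the generated highband, while the characteristic information only fixes its power, which is why so little highband data needs to be encoded.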
An example of application of this band extension technique is post-processing performed after decoding of data encoded by the highband-cutting encoding technique mentioned above. In this post-processing, the signal components in the highband lost by encoding are generated from the decoded signal components in the lowband, thereby extending the frequency band of the signal components in the lowband (see, for example, Patent Literature 1). It should be noted that the frequency band extension technique in Patent Literature 1 is hereinafter referred to as band extension technique in Patent Literature 1.
In the band extension technique in Patent Literature 1, an apparatus takes the decoded signal components in the lowband as an input signal, estimates the power spectrum of the highband (hereinafter referred to as the frequency envelope of the highband) from the power spectrum of the input signal, and generates, from the signal components in the lowband, signal components in the highband that have this frequency envelope.
FIG. 1 shows an example of the power spectrum of the decoded signal components in the lowband serving as the input signal, and the estimated frequency envelope of the highband.
In FIG. 1, the vertical axis represents power on a logarithmic scale, and the horizontal axis represents frequency.
The apparatus determines the band at the low end of the signal components in the highband (hereinafter referred to as the extension start band) from information related to the input signal, such as the kind of encoding scheme, the sampling rate, and the bit rate (hereinafter referred to as side information). Next, the apparatus divides the input signal, which consists of the signal components in the lowband, into a plurality of subband signals. For each of the subband signals obtained by this division, that is, the subband signals on the side lower than the extension start band (hereinafter simply referred to as the lowband side), the apparatus finds the temporal average of the power for each group (hereinafter referred to as the group power). As shown in FIG. 1, the apparatus takes as a starting point the point whose power is the average of the group powers of the subband signals on the lowband side and whose frequency is the frequency at the low end of the extension start band. The apparatus estimates a straight line with a predetermined slope passing through the starting point as the frequency envelope on the side higher than the extension start band (hereinafter simply referred to as the highband side). It should be noted that the position of the starting point in the power direction can be adjusted by the user. The apparatus generates each of a plurality of subband signals on the highband side from the plurality of subband signals on the lowband side so that the estimated frequency envelope on the highband side is obtained. The apparatus adds the generated subband signals on the highband side together to obtain the signal components in the highband, further adds the signal components in the lowband, and outputs the result. As a result, the frequency-band-extended music signal becomes closer to the original music signal, which makes it possible to reproduce the music signal with higher sound quality.
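The envelope estimation steps above can be sketched as follows. This is a minimal illustration, not the implementation in Patent Literature 1: the function names, the use of decibels, the averaging of group powers in the decibel domain, the slope unit (dB per kHz), and the silence floor are all assumptions introduced here.

```python
import math

def group_power_db(frames):
    """Temporal average power, in dB, of one subband over one group.

    frames is a list of frames, each a list of samples of that subband.
    """
    total = sum(x * x for frame in frames for x in frame)
    count = sum(len(frame) for frame in frames)
    # The floor avoids log10(0) for silent input (assumed value).
    return 10.0 * math.log10(max(total / count, 1e-12))

def estimate_highband_envelope(lowband_group_powers_db, start_freq_hz,
                               highband_freqs_hz, slope_db_per_khz,
                               user_offset_db=0.0):
    """Estimate highband powers on a straight line through the starting point.

    The starting point lies at start_freq_hz (the low end of the extension
    start band); its power is the average of the lowband-side group powers,
    shifted by an optional user adjustment.
    """
    start_power_db = (sum(lowband_group_powers_db)
                      / len(lowband_group_powers_db)) + user_offset_db
    # Straight line with a predetermined slope through the starting point.
    return [
        start_power_db + slope_db_per_khz * (freq_hz - start_freq_hz) / 1000.0
        for freq_hz in highband_freqs_hz
    ]
```

For example, with lowband-side group powers of -20, -30, and -40 dB, a starting frequency of 15 kHz, and a slope of -6 dB per kHz, the line starts at -30 dB and falls to -36 dB at 16 kHz. Because the line depends only on the average lowband power and a fixed slope, the estimate does not follow the actual shape of the original highband spectrum.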
The band extension technique in Patent Literature 1 described above has an advantage in that, for data encoded by various highband-cutting encoding techniques or at various bit rates, the frequency band can be extended with respect to the music signal obtained after decoding the encoded data.