In recent years, music distribution services that distribute music data via the Internet or the like have come into wide use. In such music distribution services, encoded data obtained by encoding music signals is distributed as the music data. The mainstream methods for encoding music signals hold down the file size of the encoded data by lowering the bit rate, so as to reduce the time taken for a download.
Such music signal encoding methods fall broadly into two groups: encoding methods such as MP3 (MPEG (Moving Picture Experts Group) Audio Layer 3) (International standard ISO/IEC 11172-3) and so forth, and encoding methods such as HE-AAC (High Efficiency MPEG-4 AAC) (International standard ISO/IEC 14496-3) and so forth.
With the encoding method represented by MP3, signal components of the music signal in high frequency bands (hereafter called high frequencies) of approximately 15 kHz or higher, which are difficult for the human ear to detect, are deleted, and the signal components of the remaining low frequency bands (hereafter called low frequencies) are encoded. This sort of encoding method will hereafter be called a high frequency deleting encoding method. With a high frequency deleting encoding method, the file size of the encoded data can be held down. However, high frequency sounds can be perceived by humans, albeit only slightly, so when sound is generated and output from the decoded music signal obtained by decoding the encoded data, sound quality can deteriorate: the realistic feeling of the original sound may be lost, or the sound may become muffled.
In contrast, with the encoding method represented by HE-AAC, feature information is extracted from the high frequency signal components and is encoded together with the low frequency signal components. This sort of encoding method will hereafter be called a high frequency feature encoding method. With a high frequency feature encoding method, only the feature information is encoded as information relating to the high frequency signal components, whereby encoding efficiency can be improved while suppressing deterioration of sound quality.
In decoding encoded data that has been encoded with a high frequency feature encoding method, the low frequency signal components and the feature information are decoded, and the high frequency signal components are generated from the decoded low frequency signal components and feature information. A technique that extends the frequency band of the low frequency signal components in this way, by generating high frequency signal components from the low frequency signal components, will hereafter be called a band extending technique.
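The relationship between feature extraction at the encoder and band extension at the decoder can be illustrated with a minimal sketch. This is a toy model only and is not the procedure defined for HE-AAC: it assumes the signal is given as a list of spectral magnitudes, that the feature information is simply the mean power of each high frequency band, and that the high frequencies are regenerated by copying the topmost low frequency bins. The function names, `cutoff_bin`, and `band_size` are hypothetical, and the number of high frequency bins is assumed to be a multiple of `band_size`.

```python
import math


def band_powers(spectrum, band_size):
    """Mean power of each consecutive group of band_size magnitude bins."""
    return [
        sum(x * x for x in spectrum[i:i + band_size]) / band_size
        for i in range(0, len(spectrum), band_size)
    ]


def encode(spectrum, cutoff_bin, band_size):
    """Keep the low bins as they are; reduce the high bins to band powers.

    Assumption: the band powers stand in for the feature information.
    """
    low = spectrum[:cutoff_bin]
    features = band_powers(spectrum[cutoff_bin:], band_size)
    return low, features


def extend_band(low, features, band_size):
    """Regenerate high bins from the low bins and the feature information.

    Assumption: each high band reuses the topmost band_size low bins,
    rescaled so that its mean power matches the corresponding feature.
    """
    src = low[-band_size:]
    src_power = sum(x * x for x in src) / band_size
    high = []
    for target in features:
        gain = math.sqrt(target / src_power) if src_power > 0 else 0.0
        high.extend(gain * x for x in src)
    return low + high
```

In this sketch only one number per high frequency band is carried alongside the low frequency bins, which is the sense in which encoding only feature information can improve encoding efficiency. The regenerated high bins match the original band powers but not the original fine spectral detail.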
One application example of the band extending technique is post-processing performed after decoding data that was encoded with the above-described high frequency deleting encoding method. In this post-processing, the frequency band of the low frequency signal components is extended by generating the high frequency signal components lost in encoding from the decoded low frequency signal components (see PTL 1). Note that the frequency band extending method of PTL 1 will hereafter be called the PTL 1 band extending method.
With the PTL 1 band extending method, a device takes the decoded low frequency signal components as the input signal, estimates a high frequency power spectrum (hereafter called the high frequency envelope, as appropriate) from the power spectrum of the input signal, and generates, from the low frequency signal components, high frequency signal components having that high frequency envelope.
FIG. 1 shows an example of the decoded low frequency power spectrum serving as the input signal, together with the estimated high frequency envelope.
In FIG. 1, the vertical axis represents power on a logarithmic scale, and the horizontal axis represents frequency.
The device first determines the band at the low frequency end of the high frequency signal components (hereafter called the extension starting band) from the type of encoding format of the input signal and from information such as the sampling rate, bit rate, and so forth (hereafter called side information). Next, the device divides the input signal serving as the low frequency signal components into multiple sub-band signals. For each of the multiple sub-band signals on the lower frequency side of the extension starting band (hereafter simply called the low frequency side), the device finds the average of the power of that sub-band signal for each group in the temporal direction (hereafter called group power). As shown in FIG. 1, the device sets an origin point whose power is the average of the respective group powers of the multiple sub-band signals on the low frequency side and whose frequency is the frequency of the lower edge of the extension starting band. The device then estimates a straight line of predetermined slope passing through the origin point as the frequency envelope on the higher frequency side of the extension starting band (hereafter simply called the high frequency side). Note that the position of the origin point in the power direction can be adjusted by the user. The device generates each of multiple sub-band signals on the high frequency side from the multiple sub-band signals on the low frequency side so that they follow the estimated frequency envelope on the high frequency side. The device adds together the generated sub-band signals on the high frequency side to form the high frequency signal components, further adds the low frequency signal components, and outputs the result. The music signal after extension of the frequency band is thereby closer to the original music signal, so music signals with higher sound quality can be played.
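The envelope estimation described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of PTL 1: power is expressed in decibels to match the logarithmic axis of FIG. 1, the slope is given in dB per Hz, the sub-bands averaged for the origin point are taken to be the `num_ref_bands` sub-bands immediately below the extension starting band, and the line is evaluated at the lower edge of each high frequency side sub-band. All function and parameter names are hypothetical; `offset_db` models the user adjustment of the origin point in the power direction.

```python
import math


def group_powers(frames):
    """Temporal average of power for each low frequency side sub-band.

    frames: one group of time frames, each a list of sub-band powers
    (linear scale), ordered from low to high frequency.
    """
    n = len(frames)
    return [
        sum(frame[sb] for frame in frames) / n
        for sb in range(len(frames[0]))
    ]


def estimate_envelope_db(frames, num_ref_bands, start_hz, band_hz,
                         num_high_bands, slope_db_per_hz, offset_db=0.0):
    """Return (frequency in Hz, power in dB) for each high side sub-band."""
    gp = group_powers(frames)
    # Assumption: average the sub-bands just below the extension start.
    ref = gp[-num_ref_bands:]
    # Origin point: mean group power, placed at the lower edge of the
    # extension starting band, optionally shifted by the user offset.
    origin_db = 10.0 * math.log10(sum(ref) / len(ref)) + offset_db
    envelope = []
    for k in range(num_high_bands):
        lower_edge = start_hz + k * band_hz
        # Straight line of predetermined slope through the origin point.
        power_db = origin_db + slope_db_per_hz * (lower_edge - start_hz)
        envelope.append((lower_edge, power_db))
    return envelope
```

Each estimated power would then serve as the target when scaling a sub-band signal generated from the low frequency side. Because the slope is fixed in advance, the estimate depends on the input signal only through the origin point.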
The above-described PTL 1 band extending method has the advantage of being able to extend the frequency band of decoded music signals obtained from encoded data produced with various high frequency deleting encoding methods and at various bit rates.