Well-known audio signal encoding methods generally include transform coding methods such as MP3 (Moving Picture Experts Group Audio Layer-3), AAC (Advanced Audio Coding), and ATRAC (Adaptive Transform Acoustic Coding).
FIG. 1 is a block diagram showing a configuration example of an encoding device encoding audio signals.
An encoding device 10 shown in FIG. 1 is formed by an MDCT (Modified Discrete Cosine Transform) part 11, a normalization part 12, a bit distribution part 13, a quantization part 14, and a multiplexing part 15, for example.
A sound PCM (Pulse Code Modulation) signal is input as an audio signal into the MDCT part 11 of the encoding device 10. The MDCT part 11 performs MDCT on the audio signal, which is a time domain signal, to obtain a spectrum S0 as a frequency domain signal. The MDCT part 11 supplies the spectrum S0 to the normalization part 12.
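The transform performed by the MDCT part 11 can be illustrated with a minimal sketch of the standard MDCT definition, which maps a frame of 2N time domain samples to N spectral coefficients. The function name mdct is hypothetical, and the windowing and 50% frame overlap that practical encoders apply are omitted here for brevity.

```python
import math

def mdct(frame):
    """Direct O(N^2) MDCT: 2N time-domain samples in, N coefficients out.

    Hypothetical helper for illustration; no window is applied.
    """
    two_n = len(frame)
    if two_n == 0 or two_n % 2:
        raise ValueError("frame length must be a positive even number")
    n = two_n // 2
    spectrum = []
    for k in range(n):
        acc = 0.0
        for t in range(two_n):
            # Standard MDCT kernel: cos[(pi/N)(t + 1/2 + N/2)(k + 1/2)]
            acc += frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
        spectrum.append(acc)
    return spectrum
```

Practical encoders use a fast, FFT-based algorithm rather than this direct double loop, which is kept only because it mirrors the definition.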
The normalization part 12 extracts an envelope ENV from the spectrum S0 for each group of a plurality of spectra, called a quantization unit, and supplies the envelopes ENV to the bit distribution part 13 and the multiplexing part 15. In addition, the normalization part 12 normalizes the spectrum S0 for each quantization unit using the envelope ENV, and supplies the resultant normalized spectrum S1 to the quantization part 14.
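The per-unit normalization can be sketched as follows. This is only an illustration under stated assumptions: the quantization units are taken to be of equal size, and the envelope of each unit is taken to be its peak magnitude. The text does not specify how the envelope ENV is actually derived, and the name normalize is hypothetical.

```python
def normalize(spectrum, unit_size):
    """Split S0 into quantization units and normalize each by its envelope.

    Assumption: the envelope is the peak magnitude of the unit
    (1.0 is used for an all-zero unit to avoid division by zero).
    Returns (envelopes, normalized_spectrum).
    """
    envelopes = []
    normalized = []
    for start in range(0, len(spectrum), unit_size):
        unit = spectrum[start:start + unit_size]
        peak = max(abs(v) for v in unit)
        env = peak if peak > 0 else 1.0
        envelopes.append(env)
        # After division, every value in the unit lies in [-1, 1].
        normalized.extend(v / env for v in unit)
    return envelopes, normalized
```

With this choice of envelope, every normalized value lies in the range from -1 to 1, which gives the subsequent quantization a fixed input range.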
When the envelope ENV is supplied from the normalization part 12, the bit distribution part 13 determines, according to a preset bit distribution algorithm and based on the envelope ENV, quantization information WL for the normalized spectrum S1 such that the bit count of the bit stream BS generated by the multiplexing part 15 falls within a desired range. The quantization information WL indicates quantization accuracy, and here refers to a quantization bit count. The bit distribution part 13 supplies the quantization information WL to the quantization part 14.
When the quantization part 14 feeds back the bit count N of the quantized spectrum QS obtained by quantizing the normalized spectrum S1 based on the previous quantization information WL, the bit distribution part 13 determines, based on the bit count N, whether the bit count of the bit stream BS falls within the desired range. If the bit distribution part 13 determines that the bit count of the bit stream BS does not fall within the desired range, it newly determines quantization information WL such that the bit count of the bit stream BS falls within the desired range, and supplies the new quantization information WL to the quantization part 14.
In contrast, if the bit distribution part 13 determines that the bit count of the bit stream BS falls within the desired range, it instructs the quantization part 14 to produce an output, and supplies the current quantization information WL to the multiplexing part 15.
The quantization part 14 quantizes, for each quantization unit, the normalized spectrum S1 supplied from the normalization part 12, based on the quantization information WL supplied from the bit distribution part 13. The quantization part 14 supplies the bit count N of the resultant quantized spectrum QS to the bit distribution part 13. When the bit distribution part 13 issues an instruction to produce an output, the quantization part 14 supplies the quantized spectrum QS based on the current quantization information WL to the multiplexing part 15.
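The feedback loop between the bit distribution part 13 and the quantization part 14 can be sketched as below. The preset bit distribution algorithm is not given in the text, so this example assumes a hypothetical rule in which WL grows with the base-2 logarithm of the envelope plus a common offset that is lowered until the bit count N fits. It also assumes a symmetric uniform quantizer with fixed-length codes, and simplifies the "desired range" to an upper limit. All function names are hypothetical.

```python
import math

def decide_wl(envelopes, offset, max_wl=16):
    """Hypothetical bit distribution rule: louder units get more bits."""
    wl = []
    for env in envelopes:
        level = math.floor(math.log2(env)) if env > 0 else -offset
        wl.append(min(max_wl, max(0, level + offset)))
    return wl

def quantize(units, wl):
    """Quantize each unit of S1 (values in [-1, 1]) with wl[u] bits per value.

    Returns the quantized spectrum QS and its bit count N.
    A 0-bit or 1-bit word can only carry the value 0 in this symmetric scheme.
    """
    qs = []
    for unit, bits in zip(units, wl):
        max_code = (1 << (bits - 1)) - 1 if bits > 0 else 0
        qs.append([round(v * max_code) for v in unit])
    n = sum(bits * len(unit) for unit, bits in zip(units, wl))
    return qs, n

def encode_units(units, envelopes, budget_bits, max_wl=16):
    """Lower the common offset until the bit count N fits the budget."""
    if budget_bits < 0:
        raise ValueError("budget_bits must be non-negative")
    offset = max_wl
    while True:
        wl = decide_wl(envelopes, offset, max_wl)   # bit distribution part 13
        qs, n = quantize(units, wl)                 # quantization part 14
        if n <= budget_bits:                        # feedback check on N
            return wl, qs, n
        offset -= 1
```

The loop always terminates because lowering the offset eventually drives every WL to zero, at which point N is zero.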
The multiplexing part 15 multiplexes the envelope ENV supplied from the normalization part 12, the quantization information WL supplied from the bit distribution part 13, and the quantized spectrum QS supplied from the quantization part 14, thereby generating a bit stream BS. The multiplexing part 15 outputs the bit stream BS as a result of encoding.
As described above, the encoding device 10 generates the bit stream BS so as to include not only the envelope ENV and the quantized spectrum QS but also the quantization information WL. This makes it possible, when the bit stream BS is decoded, to restore the normalized spectrum S1 from the quantized spectrum QS.
FIG. 2 is a diagram showing a configuration example of the bit stream BS generated by the multiplexing part 15 shown in FIG. 1.
As shown in FIG. 2, the bit stream BS consists of a header (Header) that includes, for example, an upper limit value of the spectrum, followed by the envelope ENV, the quantization information WL, and the quantized spectrum QS.
As shown in FIG. 3, both the envelope ENV and the quantization information WL have a value for each quantization unit. Therefore, not only the quantized spectrum QS but also the envelope ENV and the quantization information WL must be transmitted for every quantization unit. Accordingly, where the quantization unit count is U, the bit count NWL required to transmit the quantization information WL is the product of the bit count of one item of quantization information WL and the quantization unit count U. As a result, the bit count NWL increases as the quantization unit count U increases.
In FIG. 3, k in [k] denotes the index of a quantization unit, and i denotes an arbitrary value. Here, the indexes are assigned in ascending order starting from 1, beginning with the lowest-frequency quantization unit.
In addition, the bit count of the envelope ENV for each quantization unit is often determined in advance. Therefore, the bit distribution part 13 modifies the quantization information WL to change the bit count N of the quantized spectrum QS, thereby controlling the bit count of the bit stream BS to a predetermined value.
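The bit budget described here can be illustrated with a small calculation. The field widths in the usage example below (a 16-bit header, 6 bits per envelope value, 4 bits per item of quantization information) are hypothetical values chosen only for the example, as is the function name.

```python
def spectrum_bit_budget(total_bits, header_bits, env_bits_per_unit,
                        wl_bits_per_unit, unit_count):
    """Return (NWL, bits left for the quantized spectrum QS).

    NWL is the per-unit bit count of WL multiplied by the unit count U.
    """
    n_env = env_bits_per_unit * unit_count
    n_wl = wl_bits_per_unit * unit_count
    remaining = total_bits - header_bits - n_env - n_wl
    if remaining < 0:
        raise ValueError("side information alone exceeds the bit stream size")
    return n_wl, remaining

# Doubling U from 16 to 32 doubles NWL and shrinks the room left for QS.
print(spectrum_bit_budget(1024, 16, 6, 4, 16))  # (64, 848)
print(spectrum_bit_budget(1024, 16, 6, 4, 32))  # (128, 688)
```

Since the header and envelope sizes are fixed in this model, the bit count N of the quantized spectrum QS is the only quantity left to adjust.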
FIG. 4 is a block diagram showing a configuration example of a decoding device decoding a result of encoding by the encoding device 10 shown in FIG. 1.
A decoding device 20 shown in FIG. 4 is formed by a separation part 21, an inverse quantization part 22, an inverse normalization part 23, and an inverse MDCT part 24.
The bit stream BS resulting from encoding by the encoding device 10 is input into the separation part 21 of the decoding device 20. The separation part 21 separates the envelope ENV and the quantization information WL from the bit stream BS. The separation part 21 also separates the quantized spectrum QS from the bit stream BS, based on the quantization information WL. The separation part 21 supplies the envelope ENV to the inverse normalization part 23 and supplies the quantization information WL and the quantized spectrum QS to the inverse quantization part 22.
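The reason the quantized spectrum QS can only be separated after the quantization information WL has been read is that WL gives the width of each QS field. The following sketch illustrates this with a simplified stream held as a string of '0' and '1' characters. The field widths ENV_BITS and WL_BITS, the fixed UNIT_SIZE, the integer-valued envelope indexes, and the omission of the header are all assumptions made for the example, and the function names are hypothetical.

```python
ENV_BITS = 6   # hypothetical width of one envelope index
WL_BITS = 4    # hypothetical width of one WL value
UNIT_SIZE = 2  # hypothetical number of spectra per quantization unit

def to_bits(value, width):
    """Two's-complement encode value into a string of `width` bits."""
    if width == 0:
        return ""
    return format(value & ((1 << width) - 1), "0{}b".format(width))

def from_bits(bits, signed):
    """Decode a bit string; an empty field decodes to 0."""
    value = int(bits, 2) if bits else 0
    if signed and bits and bits[0] == "1":
        value -= 1 << len(bits)
    return value

def multiplex(env, wl, qs):
    """Concatenate ENV, WL, then QS with wl[u] bits per quantized value."""
    stream = "".join(to_bits(e, ENV_BITS) for e in env)
    stream += "".join(to_bits(w, WL_BITS) for w in wl)
    for w, unit in zip(wl, qs):
        stream += "".join(to_bits(c, w) for c in unit)
    return stream

def separate(stream, unit_count):
    """Read ENV and WL first; WL then tells how wide each QS field is."""
    pos = 0

    def read(width, signed=False):
        nonlocal pos
        field = stream[pos:pos + width]
        pos += width
        return from_bits(field, signed)

    env = [read(ENV_BITS) for _ in range(unit_count)]
    wl = [read(WL_BITS) for _ in range(unit_count)]
    qs = [[read(w, signed=True) for _ in range(UNIT_SIZE)] for w in wl]
    return env, wl, qs
```

In this sketch the unit count is passed in directly; in an actual stream it would presumably be derived from information in the header.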
The inverse quantization part 22 inversely quantizes the quantized spectrum QS based on the quantization information WL supplied from the separation part 21, and supplies a resultant normalized spectrum S1 to the inverse normalization part 23.
The inverse normalization part 23 inversely normalizes the normalized spectrum S1 supplied from the inverse quantization part 22, using the envelope ENV supplied from the separation part 21, and then supplies a resultant spectrum S0 to the inverse MDCT part 24.
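The two inverse steps can be sketched as follows, assuming a symmetric uniform quantizer in which a WL of b bits maps the range from -1 to 1 onto integer codes from -(2^(b-1) - 1) to 2^(b-1) - 1, and assuming the envelope is a plain scale factor per quantization unit. Neither assumption is stated in the text, and the function names are hypothetical.

```python
def inverse_quantize(qs, wl):
    """Map integer codes back to normalized values in [-1, 1], unit by unit."""
    s1 = []
    for unit, bits in zip(qs, wl):
        max_code = (1 << (bits - 1)) - 1 if bits > 0 else 0
        # Units with no usable code range decode to silence.
        s1.append([c / max_code if max_code else 0.0 for c in unit])
    return s1

def inverse_normalize(s1, envelopes):
    """Scale each unit by its envelope and flatten into the spectrum S0."""
    return [v * env for unit, env in zip(s1, envelopes) for v in unit]
```

The restored spectrum differs from the original by the quantization error, which shrinks as WL grows.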
The inverse MDCT part 24 performs inverse MDCT on the spectrum S0 as a frequency domain signal supplied from the inverse normalization part 23, thereby obtaining a sound PCM signal as a time domain signal. The inverse MDCT part 24 outputs the sound PCM signal as an audio signal.
As described above, the encoding device 10 includes the quantization information WL in the bit stream BS, which keeps the decoded audio signal consistent with the audio signal that was encoded even if the quantization information WL is arbitrarily modified at the encoding device 10. Therefore, the encoding device 10 can control the bit count of the bit stream BS using the quantization information WL. In addition, the encoding device 10 alone can be improved to set an optimum value in the quantization information WL, thereby enhancing sound quality without any change to the decoding device.
However, when a large number of bits is needed to transmit the quantization information WL, correspondingly fewer bits remain for the quantized spectrum QS, which leads to degradation in sound quality.
Accordingly, an encoding method has been proposed that divides the quantization information WL into a fixed value uniquely determined at both the encoding device and the decoding device and a differential value obtained by subtracting the fixed value from the quantization information WL, and encodes the differential value with a small bit count (for example, see Patent Document 1).
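The idea of splitting WL into a fixed value and a differential value can be sketched as follows. How Patent Document 1 determines the fixed value and encodes the difference is not described here, so the fixed pattern in the usage example and the fixed-length two's-complement width calculation are purely illustrative assumptions, and the function names are hypothetical.

```python
def split_wl(wl, fixed):
    """Encoder side: differential value = WL - fixed value, per unit."""
    return [w - f for w, f in zip(wl, fixed)]

def restore_wl(diff, fixed):
    """Decoder side: WL = fixed value + differential value, per unit."""
    return [d + f for d, f in zip(diff, fixed)]

def signed_width(values):
    """Smallest two's-complement width that can hold every value."""
    widths = [
        (v.bit_length() if v >= 0 else (-v - 1).bit_length()) + 1
        for v in values
    ]
    return max(widths)

# Hypothetical example: the fixed pattern is known to both sides in advance.
wl = [6, 5, 5, 3]
fixed = [6, 5, 4, 3]
diff = split_wl(wl, fixed)        # [0, 0, 1, 0]
print(signed_width(diff))         # 2 bits per unit instead of 4 for raw WL
print(restore_wl(diff, fixed))    # [6, 5, 5, 3]
```

The saving depends entirely on how closely the fixed value tracks the WL actually chosen; when the two diverge, the differential values grow and the advantage shrinks.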