As audio signal encoding techniques, the following transform coding techniques have been generally well known: MP3 (Moving Picture Experts Group Audio Layer-3), AAC (Advanced Audio Coding), and ATRAC (Adaptive Transform Acoustic Coding).
In such an encoding technique, the results of encoding do not include the higher frequency spectrum itself, which contains a large amount of information, but include only the envelope of the higher frequency spectrum, so as to achieve a higher encoding efficiency. At the time of decoding in such a case, the lower frequency spectrum is duplicated by parallel translation, replication, or the like, to generate a higher frequency spectrum. The envelope of the generated higher frequency spectrum is then made closer to the envelope of the original higher frequency spectrum contained in the results of encoding, to improve auditory quality. Such a decoding technique is called a band extension technique, and is already publicly known.
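As a rough illustration of the band extension idea described above, the following Python sketch replicates lower frequency coefficients into the higher band and rescales each band so that its level matches a transmitted envelope. The function names, the `band_size` parameter, and the use of a per-band RMS value as the envelope are assumptions made for illustration; the text does not specify how the envelope is computed.

```python
import math

def band_envelope(spectrum, band_size):
    """Return the RMS level of each consecutive band of `band_size` coefficients."""
    return [
        math.sqrt(sum(c * c for c in spectrum[i:i + band_size]) / band_size)
        for i in range(0, len(spectrum), band_size)
    ]

def extend_band(low_spectrum, high_envelope, band_size):
    """Replicate the low band upward, then rescale each band to the target envelope.

    Assumes the low band holds at least len(high_envelope) * band_size coefficients.
    """
    replica = list(low_spectrum[:len(high_envelope) * band_size])
    high = []
    for b, target in enumerate(high_envelope):
        band = replica[b * band_size:(b + 1) * band_size]
        level = math.sqrt(sum(c * c for c in band) / band_size)
        gain = target / level if level > 0.0 else 0.0  # avoid dividing by zero
        high.extend(c * gain for c in band)
    return high
```

Only the envelope of the generated band is controlled here; the fine structure is inherited from the lower band, which is the trade-off the technique accepts in exchange for encoding efficiency.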
FIG. 1 is a block diagram showing an example structure of an encoding apparatus that has only the envelope of the higher frequency spectrum in the results of encoding.
The encoding apparatus 10 of FIG. 1 includes an MDCT (Modified Discrete Cosine Transform) unit 11, a quantizing unit 12, and a multiplexing unit 13. The encoding apparatus 10 is the same as a generally known transform coding apparatus, except that a higher frequency spectrum SP-H is not included in the results of encoding. For ease of explanation of the drawings, the quantizing unit 12 not only performs quantization but also extracts and normalizes objects to be quantized.
Specifically, the MDCT unit 11 of the encoding apparatus 10 performs an MDCT on a PCM (Pulse Code Modulation) signal, which is an audio time-domain signal input to the encoding apparatus 10. By doing so, the MDCT unit 11 generates a spectrum SP that is a frequency domain signal. The MDCT unit 11 supplies the generated spectrum SP to the quantizing unit 12.
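For reference, the transform performed by the MDCT unit 11 can be sketched with the textbook MDCT definition, which maps a block of 2N time samples to N spectral coefficients. The function name `mdct` is hypothetical, and windowing and block overlap, which a practical encoder would apply, are omitted; the text does not state which window or block size the apparatus uses.

```python
import math

def mdct(block):
    """Direct O(N^2) MDCT: 2N time samples in, N spectral coefficients out."""
    n_half = len(block) // 2  # N, the number of output coefficients
    return [
        sum(
            block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half)
        )
        for k in range(n_half)
    ]
```

A production encoder would normally use a fast algorithm based on the FFT instead of this direct summation, which is written out only for clarity.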
The quantizing unit 12 extracts envelopes from the higher frequency spectrum SP-H that is the higher frequency components of the spectrum SP supplied from the MDCT unit 11, and from a lower frequency spectrum SP-L that is the lower frequency components of the spectrum SP. The quantizing unit 12 quantizes a higher frequency envelope ENV-H that is the extracted envelope of the higher frequency spectrum SP-H, and a lower frequency envelope ENV-L that is the extracted envelope of the lower frequency spectrum SP-L. The quantizing unit 12 supplies the quantized higher frequency envelope ENV-H and lower frequency envelope ENV-L to the multiplexing unit 13. In this specification, the names (such as SP-L and SP-H) of signals are the same before and after quantization and encoding, for ease of explanation.
The quantizing unit 12 normalizes the lower frequency spectrum SP-L, using the lower frequency envelope ENV-L. The quantizing unit 12 quantizes the normalized lower frequency spectrum SP-L, and supplies the resultant lower frequency spectrum SP-L to the multiplexing unit 13.
As described above, the quantizing unit 12 includes both the envelope and the normalized spectrum in the results of encoding for the lower frequency components of the spectrum SP, but includes only the envelope for the higher frequency components. Accordingly, the encoding efficiency becomes higher.
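The operations of the quantizing unit 12 described above can be sketched as follows. This is a minimal illustration under several assumptions that do not come from the text: the envelope is taken as the RMS level of fixed-size bands, quantization is uniform with a single step size, and the names `band_rms`, `quantize`, `encode_spectrum`, `split`, `band_size`, and `step` are hypothetical.

```python
import math

def band_rms(coeffs, band_size):
    """Envelope: RMS level of each consecutive band of `band_size` coefficients."""
    return [
        math.sqrt(sum(c * c for c in coeffs[i:i + band_size]) / band_size)
        for i in range(0, len(coeffs), band_size)
    ]

def quantize(values, step):
    """Uniform scalar quantization to integer indices."""
    return [round(v / step) for v in values]

def encode_spectrum(spectrum, split, band_size, step):
    """Return quantized ENV-L, normalized SP-L, and ENV-H; SP-H itself is dropped."""
    sp_l, sp_h = spectrum[:split], spectrum[split:]
    env_l = band_rms(sp_l, band_size)
    env_h = band_rms(sp_h, band_size)
    # Normalize each low-band coefficient by the envelope value of its band.
    norm_l = [
        c / env_l[i // band_size] if env_l[i // band_size] > 0.0 else 0.0
        for i, c in enumerate(sp_l)
    ]
    return {
        "ENV-L": quantize(env_l, step),
        "SP-L": quantize(norm_l, step),
        "ENV-H": quantize(env_h, step),
    }
```

The returned dictionary has no entry for the higher frequency spectrum, which reflects where the bit saving comes from: the higher band contributes one value per band instead of one value per coefficient.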
The multiplexing unit 13 multiplexes the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H, which are supplied from the quantizing unit 12. The multiplexing unit 13 outputs the resultant bit stream. This bit stream is recorded on a recording medium (not shown), or is transferred to a decoding apparatus.
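The multiplexing step can be illustrated with a simple length-prefixed byte layout. The layout is purely an assumption for this sketch (a 16-bit little-endian count followed by 16-bit signed values for each field, in the order ENV-L, SP-L, ENV-H); the text does not describe the actual bit stream syntax, and the function name `multiplex` is hypothetical.

```python
import struct

def multiplex(env_l, sp_l, env_h):
    """Pack three lists of quantized integers into one length-prefixed byte stream.

    Assumed layout per field: uint16 count, then `count` int16 values,
    all little-endian. Values must fit in the int16 range.
    """
    stream = b""
    for field in (env_l, sp_l, env_h):
        stream += struct.pack("<H", len(field))
        stream += struct.pack(f"<{len(field)}h", *field)
    return stream
```

For example, `multiplex([12, 4], [4, -4, 4, 4], [2, 1])` yields 22 bytes: three 2-byte counts plus eight 2-byte values.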
FIG. 2 is a flowchart for explaining an encoding operation to be performed by the encoding apparatus 10 of FIG. 1. This encoding operation is started when an audio PCM signal is input to the encoding apparatus 10, for example.
In step S11 of FIG. 2, the MDCT unit 11 performs an MDCT on a PCM signal, which is an audio time-domain signal input to the encoding apparatus 10, and generates the spectrum SP that is a frequency domain signal. The MDCT unit 11 supplies the generated spectrum SP to the quantizing unit 12.
In step S12, the quantizing unit 12 extracts envelopes from the higher frequency spectrum SP-H that is the higher frequency components of the spectrum SP supplied from the MDCT unit 11, and from the lower frequency spectrum SP-L that is the lower frequency components of the spectrum SP.
In step S13, the quantizing unit 12 normalizes the lower frequency spectrum SP-L, using the lower frequency envelope ENV-L.
In step S14, the quantizing unit 12 performs quantization on the extracted higher frequency envelope ENV-H, the extracted lower frequency envelope ENV-L, and the normalized lower frequency spectrum SP-L. The quantizing unit 12 supplies the quantized higher frequency envelope ENV-H, lower frequency envelope ENV-L, and normalized lower frequency spectrum SP-L to the multiplexing unit 13.
In step S15, the multiplexing unit 13 multiplexes the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H, which are supplied from the quantizing unit 12. The multiplexing unit 13 outputs the resultant bit stream. This operation then comes to an end.
FIG. 3 is a block diagram showing an example structure of a decoding apparatus that decodes bit streams encoded by the encoding apparatus 10 of FIG. 1.
The decoding apparatus 30 of FIG. 3 includes a dividing unit 31, an inverse quantizing unit 32, an inverse MDCT unit 33, and a band extending unit 34.
The dividing unit 31, the inverse quantizing unit 32, and the inverse MDCT unit 33 of the decoding apparatus 30 decode only the lower frequency components of PCM signals, like a conventional transform decoding apparatus.
Specifically, the dividing unit 31 obtains a bit stream encoded by the encoding apparatus 10, and divides the bit stream into the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H. The dividing unit 31 then supplies the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H to the inverse quantizing unit 32.
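The division performed by the dividing unit 31 can be sketched as the inverse of a simple length-prefixed packing. The byte layout assumed here (for each of ENV-L, SP-L, and ENV-H in that order, a 16-bit little-endian count followed by that many 16-bit signed values) is an assumption for illustration only, as is the function name `divide`; the text does not describe the actual bit stream syntax.

```python
import struct

def divide(stream):
    """Split a length-prefixed byte stream into (ENV-L, SP-L, ENV-H) integer lists.

    Assumed layout per field: uint16 count, then `count` int16 values,
    all little-endian.
    """
    fields, offset = [], 0
    for _ in range(3):
        (count,) = struct.unpack_from("<H", stream, offset)
        offset += 2
        fields.append(list(struct.unpack_from(f"<{count}h", stream, offset)))
        offset += 2 * count
    env_l, sp_l, env_h = fields
    return env_l, sp_l, env_h
```

All three fields are still quantized at this point, which is why they are passed together to the inverse quantizing unit 32 before being routed to different units.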
The inverse quantizing unit 32 performs inverse quantization on the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H, which are supplied from the dividing unit 31. The inverse quantizing unit 32 then supplies the inversely-quantized lower frequency envelope ENV-L and lower frequency spectrum SP-L to the inverse MDCT unit 33, and supplies the higher frequency envelope ENV-H to the band extending unit 34.
Using the lower frequency envelope ENV-L supplied from the inverse quantizing unit 32, the inverse MDCT unit 33 denormalizes the lower frequency spectrum SP-L. The inverse MDCT unit 33 performs an inverse MDCT on the lower frequency spectrum SP-L, which is a denormalized frequency domain signal, and obtains a PCM signal that is a time domain signal. This PCM signal does not contain higher frequency components, and therefore sounds muffled. The inverse MDCT unit 33 supplies the PCM signal to the band extending unit 34.
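The two operations of the inverse MDCT unit 33 can be sketched as follows, using the textbook inverse MDCT definition. The function names, the per-band envelope layout with a fixed `band_size`, and the 1/N scaling convention are assumptions for illustration. The output of a single inverse MDCT contains time-domain aliasing; a practical decoder would window consecutive blocks and overlap-add them to cancel it, which this sketch omits.

```python
import math

def denormalize(norm_spectrum, envelope, band_size):
    """Multiply each normalized coefficient by the envelope value of its band."""
    return [c * envelope[i // band_size] for i, c in enumerate(norm_spectrum)]

def imdct(coeffs):
    """Direct O(N^2) inverse MDCT: N coefficients in, 2N aliased time samples out."""
    n_half = len(coeffs)  # N
    return [
        sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half)
        ) / n_half
        for n in range(2 * n_half)
    ]
```

Because only lower frequency coefficients are supplied, the resulting time-domain signal has no energy in the higher band, which is what the band extending unit 34 subsequently compensates for.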
The band extending unit 34 includes a band dividing filter 41, a higher frequency component generating unit 42, and a band combining filter 43. The band extending unit 34 extends the frequency band of the PCM signal that is obtained by the inverse MDCT unit 33 and does not contain higher frequency components. By doing so, the band extending unit 34 performs a band extending operation to improve the sound quality of the PCM signal.
Specifically, the band dividing filter 41 of the band extending unit 34 divides the PCM signal supplied from the inverse MDCT unit 33 into higher frequency components and lower frequency components. Since this PCM signal does not contain higher frequency components, the band dividing filter 41 discards the higher frequency components of the divided PCM signal. The band dividing filter 41 also supplies a lower frequency PCM signal BS-L, which is the lower frequency components of the divided PCM signal, to the higher frequency component generating unit 42 and the band combining filter 43.
Using the lower frequency PCM signal BS-L supplied from the band dividing filter 41 and the higher frequency envelope ENV-H supplied from the inverse quantizing unit 32, the higher frequency component generating unit 42 generates a pseudo higher frequency PCM signal BS-H. An example method of generating the pseudo higher frequency PCM signal BS-H is disclosed in Patent Document 1, which was filed by the applicant. The higher frequency component generating unit 42 supplies the pseudo higher frequency PCM signal BS-H to the band combining filter 43.
The band combining filter 43 combines the lower frequency PCM signal BS-L supplied from the band dividing filter 41 with the pseudo higher frequency PCM signal BS-H supplied from the higher frequency component generating unit 42, and outputs an entire-band PCM signal as the results of the decoding.
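The data flow through the band dividing filter 41, the higher frequency component generating unit 42, and the band combining filter 43 can be sketched in the time domain as shown below. This is not the method of Patent Document 1, whose details are not given here. As a stand-in, the sketch uses a crude two-tap low-pass filter, generates the pseudo higher band by spectral folding (multiplying by (-1)^n), and reduces ENV-H to a single target RMS level. All function names and these simplifications are assumptions for illustration.

```python
import math

def rms(signal):
    """Root-mean-square level of a non-empty signal."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def band_dividing_filter(pcm):
    """Keep the lower band with a 2-tap moving average; the higher band is discarded."""
    return [(pcm[n] + (pcm[n - 1] if n > 0 else 0.0)) / 2.0 for n in range(len(pcm))]

def generate_high_band(bs_l, target_rms):
    """Build a pseudo higher band BS-H from BS-L and a target level.

    Multiplying by (-1)^n moves a component at frequency f to fs/2 - f,
    so low-band content lands in the upper half of the spectrum.
    """
    folded = [s if n % 2 == 0 else -s for n, s in enumerate(bs_l)]
    level = rms(folded)
    gain = target_rms / level if level > 0.0 else 0.0
    return [s * gain for s in folded]

def band_combining_filter(bs_l, bs_h):
    """Combine the two bands into an entire-band signal by sample-wise addition."""
    return [low + high for low, high in zip(bs_l, bs_h)]
```

A real band extension stage would shape the higher band per sub-band according to ENV-H and use properly designed analysis and synthesis filters; the sketch only shows which signal feeds which unit.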
The sound corresponding to the entire-band PCM signal that is output in the above described manner is less muffled than the sound corresponding to the PCM signal not containing higher frequency components, and therefore has higher auditory quality.
FIG. 4 is a diagram for explaining the signals that are output from the inverse MDCT unit 33 and the band combining filter 43. In FIG. 4, the abscissa axis indicates frequency, and the ordinate axis indicates signal level. This also applies to FIGS. 7, 10, and 12 through 16, which will be described later.
The signal that is output from the inverse MDCT unit 33 is the PCM signal of the lower frequency spectrum SP-L denormalized by using the lower frequency envelope ENV-L, as shown in A in FIG. 4. The signal that is output from the band combining filter 43 is a PCM signal that contains lower frequency components as the PCM signal of the lower frequency spectrum SP-L denormalized by using the lower frequency envelope ENV-L, and higher frequency components as the pseudo higher frequency PCM signal BS-H generated from the higher frequency envelope ENV-H and the lower frequency PCM signal BS-L, as shown in B in FIG. 4.
FIG. 5 is a flowchart for explaining a decoding operation to be performed by the decoding apparatus 30 of FIG. 3. This decoding operation is started when a bit stream encoded by the encoding apparatus 10 is input to the decoding apparatus 30, for example.
In step S31 of FIG. 5, the dividing unit 31 divides the bit stream input to the decoding apparatus 30 into the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H. The dividing unit 31 then supplies the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H to the inverse quantizing unit 32.
In step S32, the inverse quantizing unit 32 performs inverse quantization on the lower frequency envelope ENV-L, the lower frequency spectrum SP-L, and the higher frequency envelope ENV-H, which are supplied from the dividing unit 31. The inverse quantizing unit 32 supplies the inversely-quantized lower frequency envelope ENV-L and lower frequency spectrum SP-L to the inverse MDCT unit 33. The inverse quantizing unit 32 supplies the higher frequency envelope ENV-H to the band extending unit 34.
In step S33, the inverse MDCT unit 33 denormalizes the lower frequency spectrum SP-L, using the lower frequency envelope ENV-L supplied from the inverse quantizing unit 32.
In step S34, the inverse MDCT unit 33 performs an inverse MDCT on the lower frequency spectrum SP-L, which is a denormalized frequency domain signal, and obtains a PCM signal that is a time domain signal. The inverse MDCT unit 33 supplies the PCM signal to the band extending unit 34.
In step S35, the band dividing filter 41 of the band extending unit 34 divides the PCM signal supplied from the inverse MDCT unit 33 into higher frequency components and lower frequency components. The band dividing filter 41 discards the higher frequency components of the divided PCM signal, and supplies the lower frequency PCM signal BS-L, which is the lower frequency components of the divided PCM signal, to the higher frequency component generating unit 42 and the band combining filter 43.
In step S36, the higher frequency component generating unit 42 generates the pseudo higher frequency PCM signal BS-H, using the lower frequency PCM signal BS-L supplied from the band dividing filter 41 and the higher frequency envelope ENV-H supplied from the inverse quantizing unit 32. The higher frequency component generating unit 42 supplies the pseudo higher frequency PCM signal BS-H to the band combining filter 43.
In step S37, the band combining filter 43 combines the lower frequency PCM signal BS-L supplied from the band dividing filter 41 with the pseudo higher frequency PCM signal BS-H supplied from the higher frequency component generating unit 42, to obtain the entire-band PCM signal. The band combining filter 43 outputs the entire-band PCM signal, and the operation comes to an end.
The above described band extension technique has already been used in HE-AAC (High-Efficiency Advanced Audio Coding), which is an international standard, and in the stereo high-quality mode of LPEC (trade name).
As described above, in the conventional band extension technique, the band extending operation is performed as post-processing after the decoding of the lower frequency spectrum SP-L. Accordingly, the degree of freedom in generating the pseudo higher frequency PCM signal BS-H can be made higher. That is, the pseudo higher frequency PCM signal BS-H can be generated not from the lower frequency spectrum SP-L, which is a frequency domain signal, but from the lower frequency PCM signal BS-L, which is a time domain signal.
The processing block sizes in the encoding operation and the decoding operation, and the processing block size in the band extending operation, can be set independently of each other, so as to optimize frequency analysis precision and time resolution.
In a case where the pseudo higher frequency PCM signal is generated by the technique disclosed in Patent Document 1, complicated procedures need to be carried out to generate a noise spectrum from the higher frequency envelope ENV-H, generate a tonic spectrum from the higher frequency envelope ENV-H and the lower frequency PCM signal BS-L, and compare the two spectra.
The process of generating the noise spectrum and the tonic spectrum is necessary to increase the matching accuracy between the lower frequency spectrum and the higher frequency spectrum and thereby generate sound with high auditory quality, and is also performed in the decoding apparatuses disclosed in Patent Documents 2 and 3.