A coding apparatus which codes multichannel audio signals can perform highly efficient coding by utilizing a relationship between channels. This coding includes, for example, intensity coding, M/S stereo coding and spatial coding. A coding apparatus which performs spatial coding downmixes an n channel audio signal into an m (m<n) channel audio signal and codes the signal, finds spatial parameters representing the inter-channel relationship upon downmixing and transmits the spatial parameters together with the coded data. A decoding apparatus which receives the spatial parameters and the coded data decodes the coded data, and restores the original n channel audio signal from the m channel audio signal obtained as a result of decoding using the spatial parameters.
This spatial coding is known as “binaural cue coding”. For the spatial parameters (hereinafter, referred to as “BC parameters”), for example, ILD (Inter-channel Level Difference), IPD (Inter-channel Phase Difference) and ICC (Inter-channel Correlation) are used. The ILD refers to a parameter indicating the ratio of the signal magnitudes between channels. The IPD refers to a parameter indicating an inter-channel phase difference, and the ICC refers to a parameter indicating an inter-channel correlation.
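As a non-limiting illustration, the three BC parameters can be sketched in Python for one frequency band. The sketch assumes that each channel is available as a list of complex subband samples, that the ILD is expressed in decibels, and that the ICC is taken as the magnitude of the normalized cross-correlation; these are common conventions but are assumptions, not definitions taken from the description above. The function name `bc_parameters` is hypothetical.

```python
import cmath
import math


def bc_parameters(left, right):
    """Return (ILD in dB, IPD in radians, ICC) for one band of complex samples.

    Assumes both bands carry non-zero energy; a silent band would need a guard.
    """
    power_left = sum(abs(l) ** 2 for l in left)
    power_right = sum(abs(r) ** 2 for r in right)
    # Cross-spectrum: its angle is the phase of left relative to right.
    cross = sum(l * r.conjugate() for l, r in zip(left, right))

    ild_db = 10.0 * math.log10(power_left / power_right)
    ipd = cmath.phase(cross)
    icc = abs(cross) / math.sqrt(power_left * power_right)
    return ild_db, ipd, icc


# Right channel = left channel at half amplitude, delayed in phase by pi/4.
left = [1 + 0j, 0 + 1j, -1 + 0.5j]
right = [0.5 * cmath.exp(-1j * math.pi / 4) * l for l in left]
ild_db, ipd, icc = bc_parameters(left, right)
print(ild_db, ipd, icc)
```

In this example the right channel is a scaled and phase-rotated copy of the left channel, so the ICC is 1, the ILD corresponds to an amplitude ratio of 2, and the IPD equals the applied rotation.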
FIG. 1 is a block diagram illustrating a configuration example of a coding apparatus which performs spatial coding.
Here, n=2 and m=1 are assumed for ease of description. That is, a coding target audio signal is a stereo audio signal (hereinafter, referred to as “stereo signal”), and coded data obtained as a result of coding is coded data of a monaural audio signal (hereinafter, referred to as “monaural signal”).
A coding apparatus 10 in FIG. 1 includes a channel downmix unit 11, a spatial parameter detection unit 12, an audio signal coding unit 13 and a multiplexing unit 14. The coding apparatus 10 receives an input of a stereo signal including a left audio signal XL and a right audio signal XR as a coding target, and outputs coded data of a monaural signal.
More specifically, the channel downmix unit 11 of the coding apparatus 10 downmixes the stereo signal input as the coding target, to the monaural signal XM. Further, the channel downmix unit 11 supplies the monaural signal to the spatial parameter detection unit 12 and the audio signal coding unit 13.
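The downmix rule itself is not specified above. A minimal sketch, assuming that the monaural signal XM is simply the sample-wise average of XL and XR (the function name `downmix_to_mono` is hypothetical), is:

```python
def downmix_to_mono(x_left, x_right):
    """Downmix a stereo pair to mono by averaging (an assumed, simple rule)."""
    if len(x_left) != len(x_right):
        raise ValueError("both channels must have the same number of samples")
    return [0.5 * (l + r) for l, r in zip(x_left, x_right)]


x_m = downmix_to_mono([1.0, 0.0, -0.5], [0.0, 1.0, -0.5])
print(x_m)
```

Averaging keeps the monaural signal within the amplitude range of the inputs; other weightings are equally possible.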
The spatial parameter detection unit 12 detects the BC parameters based on the monaural signal XM supplied from the channel downmix unit 11 and the stereo signal input as the coding target, and supplies the BC parameters to the multiplexing unit 14.
The audio signal coding unit 13 codes the monaural signal supplied from the channel downmix unit 11, and supplies resulting coded data to the multiplexing unit 14.
The multiplexing unit 14 multiplexes and outputs the coded data supplied from the audio signal coding unit 13 and the BC parameters supplied from the spatial parameter detection unit 12.
FIG. 2 is a block diagram illustrating a configuration example of the audio signal coding unit 13 in FIG. 1.
Here, the audio signal coding unit 13 in FIG. 2 is configured to perform coding according to, for example, the MPEG-2 AAC LC (Moving Picture Experts Group phase 2 Advanced Audio Coding Low Complexity) profile. The configuration is illustrated in simplified form in FIG. 2 for ease of description.
The audio signal coding unit 13 in FIG. 2 includes an MDCT (Modified Discrete Cosine Transform) unit 21, a spectrum quantization unit 22, an entropy coding unit 23 and a multiplexing unit 24.
The MDCT unit 21 performs MDCT of the monaural signal supplied from the channel downmix unit 11, and transforms the monaural signal which is a time domain signal, into an MDCT coefficient which is a frequency domain coefficient. The MDCT unit 21 supplies the MDCT coefficient obtained as a result of transform, to the spectrum quantization unit 22 as a frequency spectrum coefficient.
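For reference, the transform performed by an MDCT unit can be sketched directly from the textbook MDCT definition, which maps a block of 2N time domain samples to N coefficients. The sketch below is an unoptimized O(N²) evaluation without windowing or block switching; practical coders use windowed, overlapped blocks and fast algorithms. The function name `mdct` is hypothetical.

```python
import math


def mdct(block):
    """Direct MDCT: 2N time-domain samples in, N frequency coefficients out."""
    if len(block) % 2:
        raise ValueError("block length must be even (2N samples)")
    n_half = len(block) // 2
    return [
        sum(
            block[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half)
        )
        for k in range(n_half)
    ]


coefficients = mdct([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0])
print(len(coefficients))
```

Because each block yields only half as many coefficients as input samples, consecutive blocks are normally overlapped by 50% so that the sequence of blocks remains critically sampled.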
The spectrum quantization unit 22 quantizes the frequency spectrum coefficient supplied from the MDCT unit 21, and supplies the quantized frequency spectrum coefficient to the entropy coding unit 23. Further, the spectrum quantization unit 22 supplies quantization information which is information related to this quantization, to the multiplexing unit 24. The quantization information includes, for example, a scale factor and quantization bit information.
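The relationship between the quantized coefficients and the quantization information can be sketched as follows. This is a deliberately simplified uniform quantizer: the step-size rule 2^(scale_factor/4), the `bits` field and the function names are assumptions made for illustration, and the quantizer actually used by AAC is non-uniform.

```python
def quantize(coefficients, scale_factor):
    """Uniformly quantize coefficients; return (integers, quantization info)."""
    step = 2.0 ** (scale_factor / 4.0)  # assumed step-size rule
    quantized = [int(round(c / step)) for c in coefficients]
    peak = max((abs(q) for q in quantized), default=0)
    # Bits needed for the largest magnitude plus one sign bit (illustrative).
    info = {"scale_factor": scale_factor, "bits": peak.bit_length() + 1}
    return quantized, info


def dequantize(quantized, info):
    """Rebuild approximate coefficients from the integers and the scale factor."""
    step = 2.0 ** (info["scale_factor"] / 4.0)
    return [q * step for q in quantized]


q, info = quantize([4.0, -2.1, 0.3], scale_factor=4)
print(q, info, dequantize(q, info))
```

The decoder needs only the integers and the quantization information to rebuild an approximation of each coefficient, which is why the quantization information is multiplexed alongside the entropy-coded data.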
The entropy coding unit 23 performs entropy coding such as Huffman coding or arithmetic coding of the quantized frequency spectrum coefficient supplied from the spectrum quantization unit 22, and losslessly compresses the frequency spectrum coefficient. The entropy coding unit 23 supplies data obtained as a result of entropy coding, to the multiplexing unit 24.
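As an illustration of the lossless compression step, the following sketch builds a Huffman code table from the quantized coefficients of one block and round-trips them through a bit string. It is a generic textbook construction; the coders named above use predefined codebooks rather than a table built per block, and all function names here are hypothetical.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Build a prefix-code table {symbol: bit string} from symbol frequencies."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, unique tie-breaker, partial code table).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c0, _, t0 = heapq.heappop(heap)
        c1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t0.items()}
        merged.update({s: "1" + code for s, code in t1.items()})
        heapq.heappush(heap, (c0 + c1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, table):
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    inverse = {code: s for s, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out


quantized = [0, 0, 0, 0, 1, 0, -1, 0, 2, 0]
table = huffman_table(quantized)
bits = encode(quantized, table)
print(len(bits), decode(bits, table) == quantized)
```

Quantized spectra are typically dominated by small values, so assigning the shortest codes to the most frequent values shortens the bit string without losing information.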
The multiplexing unit 24 multiplexes the data supplied from the entropy coding unit 23 and the quantization information supplied from the spectrum quantization unit 22, and supplies resulting data to the multiplexing unit 14 (FIG. 1) as coded data.
FIG. 3 is a block diagram illustrating another configuration example of the audio signal coding unit 13 in FIG. 1.
Here, the audio signal coding unit 13 in FIG. 3 is configured to perform coding according to, for example, an MPEG-2 AAC SSR (Scalable Sample Rate) profile or MP3 (MPEG Audio Layer-3). The configuration is illustrated in simplified form in FIG. 3 for ease of description.
The audio signal coding unit 13 in FIG. 3 includes an analysis filter bank 31, MDCT units 32-1 to 32-N (N is an arbitrary integer), a spectrum quantization unit 33, an entropy coding unit 34 and a multiplexing unit 35.
The analysis filter bank 31 includes, for example, a QMF (Quadrature Mirror Filter) bank or a PQF (Poly-phase Quadrature Filter) bank. The analysis filter bank 31 divides the monaural signal supplied from the channel downmix unit 11, into N groups by frequency. The analysis filter bank 31 supplies N subband signals obtained as a result of division, to the MDCT units 32-1 to 32-N.
The MDCT units 32-1 to 32-N each perform MDCT of the subband signal supplied from the analysis filter bank 31, and transform the subband signal which is a time domain signal, into an MDCT coefficient which is a frequency domain coefficient. Further, the MDCT units 32-1 to 32-N each supply the MDCT coefficient of each subband signal to the spectrum quantization unit 33 as the frequency spectrum coefficient.
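The band-splitting operation can be illustrated with the simplest two-channel quadrature mirror pair, the two-tap Haar filters, which corresponds to N=2. This is only a sketch of the principle: practical QMF and PQF banks use much longer filters and more bands, and the function name `analysis_two_band` is hypothetical.

```python
import math


def analysis_two_band(x):
    """Split x into a low band and a high band, each decimated by two."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    scale = 1.0 / math.sqrt(2.0)
    low = [scale * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [scale * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high


low, high = analysis_two_band([1.0, 1.0, 2.0, 0.0])
print(low, high)
```

Each subband holds half as many samples as the input, so the total number of samples passed to the subsequent MDCT units is unchanged.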
The spectrum quantization unit 33 quantizes each of the N frequency spectrum coefficients supplied from the MDCT units 32-1 to 32-N, and supplies the N frequency spectrum coefficients to the entropy coding unit 34. Further, the spectrum quantization unit 33 supplies quantization information about this quantization, to the multiplexing unit 35.
The entropy coding unit 34 performs entropy coding such as Huffman coding or arithmetic coding of each of the quantized N frequency spectrum coefficients supplied from the spectrum quantization unit 33, and losslessly compresses the N frequency spectrum coefficients. The entropy coding unit 34 supplies N items of data obtained as a result of entropy coding, to the multiplexing unit 35.
The multiplexing unit 35 multiplexes the N items of data supplied from the entropy coding unit 34 and the quantization information supplied from the spectrum quantization unit 33, and supplies resulting data to the multiplexing unit 14 (FIG. 1) as coded data.
FIG. 4 is a block diagram illustrating a configuration example of a decoding apparatus which decodes coded data which is spatially coded by the coding apparatus 10 in FIG. 1.
A decoding apparatus 40 in FIG. 4 includes an inverse multiplexing unit 41, an audio signal decoding unit 42, a generation parameter calculation unit 43 and a stereo signal generation unit 44. The decoding apparatus 40 decodes the coded data supplied from the coding apparatus 10 in FIG. 1, and generates a stereo signal.
More specifically, the inverse multiplexing unit 41 of the decoding apparatus 40 inversely multiplexes the multiplexed coded data supplied from the coding apparatus 10 in FIG. 1, and obtains the coded data and the BC parameter. The inverse multiplexing unit 41 supplies the coded data to the audio signal decoding unit 42, and supplies the BC parameter to the generation parameter calculation unit 43.
The audio signal decoding unit 42 decodes the coded data supplied from the inverse multiplexing unit 41, and supplies the resulting monaural signal XM which is a time domain signal, to the stereo signal generation unit 44.
The generation parameter calculation unit 43 calculates generation parameters which are parameters for generating a stereo signal from a monaural signal which is a decoding result of the multiplexed coded data, using the BC parameter supplied from the inverse multiplexing unit 41. The generation parameter calculation unit 43 supplies these generation parameters to the stereo signal generation unit 44.
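How the generation parameters are derived from the BC parameters is not detailed above. One possible derivation, given purely as an assumption, maps an ILD (in dB) and an ICC to a 2x2 mixing matrix applied to the monaural signal and to a signal uncorrelated with it. If those two signals have equal power, the resulting left and right outputs have an amplitude ratio equal to the ILD and a normalized correlation equal to the ICC. The function name `generation_parameters` is hypothetical.

```python
import math


def generation_parameters(ild_db, icc):
    """Return [[h11, h12], [h21, h22]] for L = h11*M + h12*D, R = h21*M + h22*D."""
    ratio = 10.0 ** (ild_db / 20.0)  # left/right amplitude ratio
    gain_right = math.sqrt(2.0 / (1.0 + ratio * ratio))
    gain_left = ratio * gain_right  # gain_left**2 + gain_right**2 == 2
    # cos(2 * angle) == icc, so the two output channels have correlation icc.
    angle = 0.5 * math.acos(max(-1.0, min(1.0, icc)))
    return [
        [gain_left * math.cos(angle), gain_left * math.sin(angle)],
        [gain_right * math.cos(angle), -gain_right * math.sin(angle)],
    ]


print(generation_parameters(0.0, 1.0))
print(generation_parameters(6.0, 0.5))
```

With an ILD of 0 dB and an ICC of 1, the matrix passes the monaural signal unchanged to both channels and ignores the uncorrelated signal.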
The stereo signal generation unit 44 generates the left audio signal XL and the right audio signal XR from the monaural signal XM supplied from the audio signal decoding unit 42 using the generation parameters supplied from the generation parameter calculation unit 43. The stereo signal generation unit 44 outputs the left audio signal XL and the right audio signal XR as stereo signals.
FIG. 5 is a block diagram illustrating a configuration example of the audio signal decoding unit 42 in FIG. 4.
In addition, the audio signal decoding unit 42 in FIG. 5 employs a configuration where coded data coded according to, for example, the MPEG-2 AAC LC profile is input to the decoding apparatus 40. That is, the audio signal decoding unit 42 in FIG. 5 decodes the coded data coded by the audio signal coding unit 13 in FIG. 2.
The audio signal decoding unit 42 in FIG. 5 includes an inverse multiplexing unit 51, an entropy decoding unit 52, a spectrum inverse quantization unit 53 and an IMDCT unit 54.
The inverse multiplexing unit 51 inversely multiplexes the coded data supplied from the inverse multiplexing unit 41 in FIG. 4, and obtains the quantized and entropy-coded frequency spectrum coefficient and the quantization information. The inverse multiplexing unit 51 supplies the quantized and entropy-coded frequency spectrum coefficient to the entropy decoding unit 52, and supplies the quantization information to the spectrum inverse quantization unit 53.
The entropy decoding unit 52 performs entropy decoding such as Huffman decoding or arithmetic decoding of the frequency spectrum coefficient supplied from the inverse multiplexing unit 51, and restores the quantized frequency spectrum coefficient. The entropy decoding unit 52 supplies this frequency spectrum coefficient to the spectrum inverse quantization unit 53.
The spectrum inverse quantization unit 53 inversely quantizes the quantized frequency spectrum coefficient supplied from the entropy decoding unit 52 based on the quantization information supplied from the inverse multiplexing unit 51, and restores the frequency spectrum coefficient. Further, the spectrum inverse quantization unit 53 supplies the frequency spectrum coefficient to the IMDCT (Inverse Modified Discrete Cosine Transform) unit 54.
The IMDCT unit 54 performs IMDCT of the frequency spectrum coefficient supplied from the spectrum inverse quantization unit 53, and transforms the frequency spectrum coefficient into the monaural signal XM which is a time domain signal. The IMDCT unit 54 supplies this monaural signal XM to the stereo signal generation unit 44 (FIG. 4).
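The way an IMDCT returns the coefficients to a time domain signal can be sketched with a forward and inverse transform pair and 50% overlap-add. The sketch assumes a sine window applied at both analysis and synthesis and a 2/N scaling of the inverse transform, which is one common convention; quantization is omitted so that the round trip is exact up to rounding error. All function names are hypothetical.

```python
import math


def _basis(n, k, n_half):
    return math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))


def mdct(block):
    """2N windowed time samples in, N coefficients out."""
    n_half = len(block) // 2
    return [sum(block[n] * _basis(n, k, n_half) for n in range(2 * n_half))
            for k in range(n_half)]


def imdct(coefficients):
    """N coefficients in, 2N time-aliased samples out (2/N scaling assumed)."""
    n_half = len(coefficients)
    return [(2.0 / n_half)
            * sum(coefficients[k] * _basis(n, k, n_half) for k in range(n_half))
            for n in range(2 * n_half)]


def round_trip(signal, n_half):
    """MDCT then IMDCT each overlapping frame and overlap-add the results."""
    if len(signal) % n_half:
        raise ValueError("signal length must be a multiple of n_half")
    window = [math.sin(math.pi * (n + 0.5) / (2 * n_half))
              for n in range(2 * n_half)]
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        frame = [padded[start + n] * window[n] for n in range(2 * n_half)]
        restored = imdct(mdct(frame))
        for n in range(2 * n_half):
            out[start + n] += restored[n] * window[n]  # aliasing cancels here
    return out[n_half:n_half + len(signal)]


signal = [1.0, -2.0, 3.0, 0.5, 0.0, 4.0, -1.0, 2.0]
print(round_trip(signal, 4))
```

A single inverse-transformed frame contains time-domain aliasing; it is the addition of the overlapping halves of neighbouring frames that cancels the aliasing and restores the signal.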
FIG. 6 is a block diagram illustrating another configuration example of the audio signal decoding unit 42 in FIG. 4.
In addition, the audio signal decoding unit 42 in FIG. 6 employs a configuration where coded data coded according to, for example, the MPEG-2 AAC SSR profile or a method such as MP3 is input to the decoding apparatus 40. That is, the audio signal decoding unit 42 in FIG. 6 decodes the coded data coded by the audio signal coding unit 13 in FIG. 3.
The audio signal decoding unit 42 in FIG. 6 includes an inverse multiplexing unit 61, an entropy decoding unit 62, a spectrum inverse quantization unit 63, IMDCT units 64-1 to 64-N and a synthesis filter bank 65.
The inverse multiplexing unit 61 inversely multiplexes the coded data supplied from the inverse multiplexing unit 41 in FIG. 4, and obtains the quantized and entropy-coded frequency spectrum coefficients of the N subband signals and the quantization information. The inverse multiplexing unit 61 supplies the quantized and entropy-coded frequency spectrum coefficients of the N subband signals to the entropy decoding unit 62, and supplies the quantization information to the spectrum inverse quantization unit 63.
The entropy decoding unit 62 performs entropy decoding such as Huffman decoding or arithmetic decoding of the frequency spectrum coefficients of the N subband signals supplied from the inverse multiplexing unit 61, and supplies the frequency spectrum coefficients to the spectrum inverse quantization unit 63.
The spectrum inverse quantization unit 63 inversely quantizes each of the frequency spectrum coefficients of the N subband signals which are supplied from the entropy decoding unit 62 and which are obtained as a result of entropy decoding, based on the quantization information supplied from the inverse multiplexing unit 61. By this means, the frequency spectrum coefficients of the N subband signals are restored. The spectrum inverse quantization unit 63 supplies the restored frequency spectrum coefficients of the N subband signals to the IMDCT units 64-1 to 64-N one by one.
The IMDCT units 64-1 to 64-N each perform IMDCT of the frequency spectrum coefficient supplied from the spectrum inverse quantization unit 63, and transform the frequency spectrum coefficient into a subband signal which is a time domain signal. The IMDCT units 64-1 to 64-N each supply the subband signal obtained as a result of transform, to the synthesis filter bank 65.
The synthesis filter bank 65 includes, for example, an inverse PQF or an inverse QMF. The synthesis filter bank 65 synthesizes the N subband signals supplied from the IMDCT units 64-1 to 64-N, and supplies the resulting signal to the stereo signal generation unit 44 (FIG. 4) as the monaural signal XM.
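The recombination of subband signals can be illustrated for the two-band case with the two-tap Haar synthesis pair. The sketch assumes that the subbands were produced by the matching Haar analysis, in which the low band holds the scaled sum and the high band the scaled difference of each pair of input samples; practical synthesis banks use longer filters and more bands. The function name `synthesis_two_band` is hypothetical.

```python
import math


def synthesis_two_band(low, high):
    """Merge a low band and a high band back into one full-rate signal."""
    if len(low) != len(high):
        raise ValueError("both subbands must have the same length")
    scale = 1.0 / math.sqrt(2.0)
    out = []
    for lo, hi in zip(low, high):
        out.append(scale * (lo + hi))  # even-indexed output sample
        out.append(scale * (lo - hi))  # odd-indexed output sample
    return out


root2 = math.sqrt(2.0)
print(synthesis_two_band([root2, root2], [0.0, root2]))
```

The output has twice as many samples as each subband, restoring the original sampling rate.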
FIG. 7 is a block diagram illustrating a configuration example of the stereo signal generation unit 44 in FIG. 4.
The stereo signal generation unit 44 in FIG. 7 includes a reverb signal generation unit 71 and a stereo synthesis unit 72.
The reverb signal generation unit 71 generates a signal XD which is uncorrelated with this monaural signal XM using the monaural signal XM supplied from the audio signal decoding unit 42 in FIG. 4. For the reverb signal generation unit 71, a comb filter or an all pass filter is generally used. In this case, the reverb signal generation unit 71 generates a reverb signal of the monaural signal XM as the signal XD.
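A single all pass section of the kind mentioned above can be sketched as follows. The sketch uses the classic Schroeder form y[n] = -g·x[n] + x[n-D] + g·y[n-D]; the delay D and gain g are free parameters chosen here only for illustration, and the function name `allpass` is hypothetical. One section alone only approximates decorrelation, and several sections with different delays are usually cascaded in practice.

```python
def allpass(x, delay, gain):
    """Schroeder all-pass section: flat magnitude response, dispersed phase."""
    y = []
    for n, sample in enumerate(x):
        delayed_in = x[n - delay] if n >= delay else 0.0
        delayed_out = y[n - delay] if n >= delay else 0.0
        y.append(-gain * sample + delayed_in + gain * delayed_out)
    return y


# Impulse response for D = 2, g = 0.5.
print(allpass([1.0, 0.0, 0.0, 0.0, 0.0], delay=2, gain=0.5))
```

The impulse response is a decaying train of echoes spaced D samples apart, which is why the output is perceived as a reverberated version of the input while its spectrum magnitude is unchanged.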
In addition, for the reverb signal generation unit 71, a feedback delay network (FDN) is used in some cases (see, for example, Patent Document 1).
The reverb signal generation unit 71 supplies the generated signal XD to the stereo synthesis unit 72.
The stereo synthesis unit 72 synthesizes the monaural signal XM supplied from the audio signal decoding unit 42 in FIG. 4 and the signal XD supplied from the reverb signal generation unit 71 using the generation parameters supplied from the generation parameter calculation unit 43 in FIG. 4. Further, the stereo synthesis unit 72 outputs the left audio signal XL and the right audio signal XR obtained as a result of synthesis as stereo signals.
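The synthesis step can be sketched as a 2x2 mix of the monaural signal XM and the signal XD. The sketch assumes that the generation parameters take the form of a matrix [[h11, h12], [h21, h22]]; the description above does not fix their form, and the function name `synthesize_stereo` is hypothetical.

```python
def synthesize_stereo(x_m, x_d, h):
    """Mix XM and XD into (XL, XR): XL = h11*XM + h12*XD, XR = h21*XM + h22*XD."""
    if len(x_m) != len(x_d):
        raise ValueError("XM and XD must have the same number of samples")
    (h11, h12), (h21, h22) = h
    x_l = [h11 * m + h12 * d for m, d in zip(x_m, x_d)]
    x_r = [h21 * m + h22 * d for m, d in zip(x_m, x_d)]
    return x_l, x_r


x_l, x_r = synthesize_stereo([1.0, 2.0], [0.2, -0.4], [[1.0, 0.5], [1.0, -0.5]])
print(x_l, x_r)
```

Adding the signal XD with opposite signs to the two channels lowers the correlation between XL and XR, while the coefficients applied to XM set their relative levels.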
FIG. 8 is a block diagram illustrating another configuration example of the stereo signal generation unit 44 in FIG. 4.
The stereo signal generation unit 44 in FIG. 8 includes an analysis filter bank 81, subband stereo signal generation units 82-1 to 82-P (P is an arbitrary number) and a synthesis filter bank 83.
In addition, when the stereo signal generation unit 44 in FIG. 4 employs the configuration illustrated in FIG. 8, the spatial parameter detection unit 12 of the coding apparatus 10 in FIG. 1 detects the BC parameter per subband signal.
More specifically, for example, the spatial parameter detection unit 12 has two analysis filter banks. Further, in the spatial parameter detection unit 12, one analysis filter bank divides the stereo signal according to a frequency, and the other analysis filter bank divides the monaural signal from the channel downmix unit 11 according to a frequency. The spatial parameter detection unit 12 detects the BC parameter per subband signal based on the subband signal of the stereo signal and the subband signal of the monaural signal obtained as a result of division. Further, the generation parameter calculation unit 43 in FIG. 4 receives a supply of the BC parameter of each subband signal from the inverse multiplexing unit 41, and generates generation parameters per subband signal.
The analysis filter bank 81 includes, for example, a QMF (Quadrature Mirror Filter) bank. The analysis filter bank 81 divides the monaural signal XM supplied from the audio signal decoding unit 42 in FIG. 4 into P groups according to a frequency. The analysis filter bank 81 supplies P subband signals obtained as a result of division, to the subband stereo signal generation units 82-1 to 82-P.
The subband stereo signal generation units 82-1 to 82-P each include a reverb signal generation unit and a stereo synthesis unit. The configuration of each of the subband stereo signal generation units 82-1 to 82-P is the same, and therefore only the subband stereo signal generation unit 82-B, which processes the B-th subband signal, will be described.
The subband stereo signal generation unit 82-B includes a reverb signal generation unit 91 and a stereo synthesis unit 92. The reverb signal generation unit 91 generates a signal XDB which is uncorrelated with this subband signal XmB using the subband signal XmB of the monaural signal supplied from the analysis filter bank 81, and supplies the signal XDB to the stereo synthesis unit 92.
The stereo synthesis unit 92 synthesizes the subband signal XmB supplied from the analysis filter bank 81 and the signal XDB supplied from the reverb signal generation unit 91 using the generation parameters of the subband signal XmB supplied from the generation parameter calculation unit 43 in FIG. 4. Further, the stereo synthesis unit 92 supplies the left audio signal XLB and the right audio signal XRB obtained as a result of synthesis, to the synthesis filter bank 83 as subband signals of the stereo signals.
The synthesis filter bank 83 synthesizes, across all subbands, the left and right subband signals of the stereo signals supplied from the subband stereo signal generation units 82-1 to 82-P. The synthesis filter bank 83 outputs the resulting left audio signal XL and right audio signal XR as stereo signals.
In addition, the configuration of the stereo signal generation unit 44 in FIG. 8 is disclosed in, for example, Patent Document 2.
Further, a coding apparatus which performs intensity coding mixes the frequency spectrum coefficients of the channels of the input stereo signal at frequencies equal to or higher than a predetermined frequency, and generates the frequency spectrum coefficient of a monaural signal. Further, the coding apparatus outputs the frequency spectrum coefficient of this monaural signal and the level ratio of the inter-channel frequency spectrum coefficients as a coding result.
More specifically, the coding apparatus which performs intensity coding performs MDCT with respect to the stereo signal, and mixes and shares the frequency spectrum coefficient of each channel at a frequency equal to or more than a predetermined frequency band among resulting frequency spectrum coefficients of channels. Further, the coding apparatus which performs intensity coding quantizes and entropy-codes the shared frequency spectrum coefficient, and multiplexes resulting data and quantization information as coded data. Furthermore, the coding apparatus which performs intensity coding finds the level ratio of the inter-channel frequency spectrum coefficients, and multiplexes and outputs the level ratio and the coded data.
Still further, a decoding apparatus which performs intensity decoding inversely multiplexes the coded data on which the level ratio of the inter-channel frequency spectrum coefficients is multiplexed, entropy-decodes resulting coded data and inversely quantizes the decoded data based on the quantization information. Moreover, the decoding apparatus which performs intensity decoding restores the frequency spectrum coefficient of each channel based on the frequency spectrum coefficient obtained as a result of inverse quantization and the level ratio of the inter-channel frequency spectrum coefficients multiplexed on the coded data. Moreover, the decoding apparatus which performs intensity decoding performs IMDCT of the restored frequency spectrum coefficient of each channel, and obtains a stereo signal at frequencies equal to or higher than the predetermined frequency.
Although such intensity coding is usually used to improve coding efficiency, a high band frequency spectrum coefficient of a stereo signal is monaural-coded and represented only by an inter-channel level difference, and therefore the original stereophonic effect is slightly lost.
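The intensity coding and decoding steps described above can be sketched on real-valued frequency spectrum coefficients, with quantization and entropy coding omitted. The sketch assumes a single level ratio for the whole high band, an averaging rule for the shared coefficient, and a non-silent right-channel high band; practical coders compute a ratio per scale factor band. The function names and the `cutoff` index are hypothetical.

```python
import math


def intensity_encode(spec_left, spec_right, cutoff):
    """Keep bins below cutoff per channel; share the bins at and above it."""
    high_left, high_right = spec_left[cutoff:], spec_right[cutoff:]
    shared = [0.5 * (l + r) for l, r in zip(high_left, high_right)]
    energy_left = sum(l * l for l in high_left)
    energy_right = sum(r * r for r in high_right)  # assumed non-zero
    level_ratio = math.sqrt(energy_left / energy_right)
    return spec_left[:cutoff], spec_right[:cutoff], shared, level_ratio


def intensity_decode(low_left, low_right, shared, level_ratio):
    """Rebuild both channels' high band from the shared bins and the ratio."""
    gain_right = 2.0 / (1.0 + level_ratio)  # gains average to 1
    gain_left = level_ratio * gain_right
    left = list(low_left) + [gain_left * s for s in shared]
    right = list(low_right) + [gain_right * s for s in shared]
    return left, right


left = [1.0, -1.0, 2.0, 4.0]
right = [0.5, 3.0, 1.0, 2.0]
coded = intensity_encode(left, right, cutoff=2)
print(intensity_decode(*coded))
```

In this example the high band of the left channel is exactly twice that of the right channel, so the decoder recovers both channels. When the channels differ in more than level, for instance in phase, that difference cannot be represented by the shared coefficient and the level ratio.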