In recent years, a technology called Spatial Codec has been developed. It is designed to compress and encode the sense of realism of multichannel audio using an extremely small amount of information. For example, the AAC method, a multichannel codec already widely used as an audio method for digital television, requires a bit rate such as 512 kbps or 384 kbps for 5.1 channels. The Spatial Codec, by contrast, aims to compress and encode multichannel signals at an extremely low bit rate such as 128 kbps, 64 kbps, or even 48 kbps. International standardization activities toward this aim are under way at the MPEG audio standardization conference, which has disclosed the so-called Reference Model Zero (hereafter also referred to as “RM0”), a basic processing method for the spatial audio codec (see Non-patent document 1).
Here, an explanation is given as to a basic principle of the Spatial Codec.
FIG. 1 is a diagram for explaining the basic principle of the Spatial Codec in the case of two channels of L and R as an example.
In an encoding process, a spatial audio encoder obtains a down-mixed signal S (S=(L+R)/2), a level difference c, and a phase difference θ through complex calculations based on acoustic signals from the two channels of L and R, as shown in FIG. 1(a). The down-mixed signal S is then encoded, together with the level difference c and the phase difference θ, by an encoding apparatus conforming to a standard such as MPEG AAC.
In a decoding process, a decorrelated signal D, which is orthogonal to the down-mixed signal S and carries reverberations, is generated as shown in FIG. 1(b).
Then, as shown in FIG. 1(c), the down-mixed signal S and the decorrelated signal D are mixed so that acoustic signals of the two channels of L and R that satisfy the relationship of a parallelogram shown in FIG. 1(a) are generated on the basis of the decoded level difference c and the decoded phase difference θ.
The explanation above covers the case where two channels are down-mixed to one channel and the one channel is expanded back to two channels. By applying this principle repeatedly, 5.1 channels can be down-mixed to two channels, and the two channels can be expanded back to 5.1 channels, for example.
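The encoding side of this principle can be sketched in Python as follows. Only the down-mix S=(L+R)/2 comes from the description above. The function name `spatial_encode`, the definition of the level difference c as an amplitude ratio, and the definition of the phase difference θ as the angle of the cross-correlation are assumptions made for illustration, not the definitions used in RM0.

```python
import numpy as np

def spatial_encode(left, right, eps=1e-12):
    """Down-mix two channels and extract one level and one phase difference.

    left, right: complex samples of one frequency band (hypothetical input format).
    """
    left = np.asarray(left, dtype=complex)
    right = np.asarray(right, dtype=complex)

    # Down-mixed signal, S = (L + R) / 2, as stated in the text.
    s = (left + right) / 2.0

    # Assumption: level difference c taken as the amplitude ratio |L| / |R|.
    energy_l = np.sum(np.abs(left) ** 2)
    energy_r = np.sum(np.abs(right) ** 2)
    c = np.sqrt((energy_l + eps) / (energy_r + eps))

    # Assumption: phase difference theta taken as the angle of the
    # cross-correlation between L and R.
    theta = np.angle(np.sum(left * np.conj(right)))
    return s, c, theta
```

Only S, c, and θ would then be passed on to the core encoder, which is why the side information stays small compared with coding both channels separately.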
Next, an explanation is given as to a signal flow in the case of RM0.
FIG. 2 is a block diagram showing a functional structure of an acoustic signal processing apparatus 900 which converts two-channel signals to five-channel signals, the conversion being an example of a basic signal flow in the case of RM0.
Note that the two-channel inputs are down-mixed from the original five-channel signals, and that the five-channel outputs are a restoration of those original signals. Note also that the two-channel signals are those usually output from the front left and right speakers, and that the five-channel signals are those usually output from the front left and right speakers, the rear left and right speakers, and a front center speaker.
As shown in FIG. 2, the acoustic signal processing apparatus 900 includes a pre-mixing matrix M1 (901), decorrelators (also described as “De correlators” or “Decorrelators”) 902 and 903, and a post-mixing matrix M2 (904).
The pre-mixing matrix M1 (901) converts the inputs of an input 1 and an input 2 to five-channel signals through a process whereby matrix arithmetic related to gain control is performed on the inputs. Out of the five-channel signals, signals of two channels are respectively converted to incoherent signals through processes performed by the decorrelators 902 and 903. The post-mixing matrix M2 (904) generates the outputs of the five-channel signals through a process whereby matrix arithmetic related to phase control is performed on signals of five channels in total, including the signals of the two channels converted by the decorrelators 902 and 903 and the unconverted signals of the remaining three channels.
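The signal flow just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the choice of which two channels are decorrelated (`decorr_rows`), and the use of a pure delay as a stand-in decorrelator are all hypothetical and are not taken from RM0.

```python
import numpy as np

def delay_decorrelator(x, delay):
    """Placeholder decorrelator: a pure delay (an assumption, not the RM0 filter)."""
    y = np.zeros_like(x)
    if delay < len(x):
        y[delay:] = x[:len(x) - delay]
    return y

def upmix_2_to_5(inputs, m1, m2, decorr_rows=(3, 4), delays=(3, 7)):
    """Two-channel to five-channel flow: pre-mix, decorrelate, post-mix.

    inputs: array of shape (2, N); m1: shape (5, 2); m2: shape (5, 5).
    decorr_rows and delays are hypothetical illustration parameters.
    """
    v = m1 @ inputs                      # pre-mixing matrix M1: 2 -> 5 channels
    w = v.copy()
    for row, d in zip(decorr_rows, delays):
        w[row] = delay_decorrelator(v[row], d)   # two channels are decorrelated
    return m2 @ w                        # post-mixing matrix M2: 5 -> 5 channels
```

The three channels not listed in `decorr_rows` pass to the post-mixing stage unchanged, matching the "unconverted signals of the remaining three channels" in the description.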
FIG. 3 is a block diagram showing a more detailed functional structure of the acoustic signal processing apparatus 900. It should be noted that although FIG. 2 shows the signals flowing from left to right, FIG. 3 shows them flowing from right to left. Because the pre-mixing matrix M1 (901) and the post-mixing matrix M2 (904) are defined by matrix arithmetic, FIG. 3 draws the signal flow from right to left solely so that the flow agrees with the order of the matrix arithmetic expressions. The diagram is thus essentially the same as that of FIG. 2.
In addition to the pre-mixing matrix M1 (901), the decorrelators 902 and 903, and the post-mixing matrix M2 (904) described above, the acoustic signal processing apparatus 900 further includes two determinant generation units 905 and 907, and two interpolation units 906 and 908.
As shown in FIG. 3, the signal processing for the pre-mixing matrix M1 (901) is realized by a five-row, two-column matrix. In general, the matrix shown below as Equation (1) is defined as an example of the pre-mixing matrix M1 (901).
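[Equation 1]

$$
R_1^{l,m} = \gamma^{l,m}\,\frac{1}{3}
\begin{bmatrix}
\alpha^{l,m}+2 & \beta^{l,m}-1 & 1 \\
\alpha^{l,m}-1 & \beta^{l,m}+2 & 1 \\
(1-\alpha^{l,m})\sqrt{2} & (1-\beta^{l,m})\sqrt{2} & -\sqrt{2} \\
\alpha^{l,m}+2 & \beta^{l,m}-1 & 1 \\
\alpha^{l,m}-1 & \beta^{l,m}+2 & 1
\end{bmatrix},
\qquad (1)
$$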
In Equation (1), α and β are values obtained from acoustic spatial coefficients called CPC (Channel Prediction Coefficients), and γ is a value obtained from an acoustic spatial coefficient called an ICC (Inter Channel Correlation).
Additionally, the superscript l indicates that the data comes from the l-th parameter set (an aggregate of compressed and encoded parameters), and the superscript m indicates that the data comes from the m-th frequency band. Details of their respective meanings are omitted here since they are not related to the scope of the present invention.
Equation (1) is a five-row, three-column matrix, in which the third column has a meaning only when so-called Residual Coding described in Non-patent document 1 is performed. In most cases, Residual Coding is not performed, in view of the restriction on the bit rate and the reduction in the decoding arithmetic load. In such a case, Equation (1) can be considered as Equation (2) below.
[Equation 2]

$$
R_1^{l,m} = \gamma^{l,m}\,\frac{1}{3}
\begin{bmatrix}
\alpha^{l,m}+2 & \beta^{l,m}-1 \\
\alpha^{l,m}-1 & \beta^{l,m}+2 \\
(1-\alpha^{l,m})\sqrt{2} & (1-\beta^{l,m})\sqrt{2} \\
\alpha^{l,m}+2 & \beta^{l,m}-1 \\
\alpha^{l,m}-1 & \beta^{l,m}+2
\end{bmatrix}
\qquad (2)
$$
To be more specific, Equation (2) corresponds to the matrix shown on the right-hand part of FIG. 3. When Residual Coding is performed, the matrix shown on the right-hand part of FIG. 3 becomes the five-row, three-column matrix of Equation (1), and a Residual Signal is added as an input signal so that there are three input channels.
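As an illustration of how Equations (1) and (2) relate, the following Python sketch builds the pre-mixing matrix for one parameter set and one frequency band. The function name `premix_matrix` and the `residual` flag are hypothetical, the factor in the third row is taken here to be √2, and the derivation of α, β, and γ from the CPC and ICC parameters is not shown because it is not described in this section.

```python
import numpy as np

def premix_matrix(alpha, beta, gamma, residual=False):
    """Build the pre-mixing matrix for one (l, m) pair.

    alpha, beta: values derived from the CPC parameters (derivation not shown).
    gamma: value derived from the ICC parameter (derivation not shown).
    residual: if True, return the 5x3 form of Equation (1);
              otherwise return the 5x2 form of Equation (2).
    """
    s2 = np.sqrt(2.0)  # assumption: third-row factor read as sqrt(2)
    full = np.array([
        [alpha + 2.0,        beta - 1.0,        1.0],
        [alpha - 1.0,        beta + 2.0,        1.0],
        [(1.0 - alpha) * s2, (1.0 - beta) * s2, -s2],
        [alpha + 2.0,        beta - 1.0,        1.0],
        [alpha - 1.0,        beta + 2.0,        1.0],
    ])
    m = (gamma / 3.0) * full
    # Without Residual Coding the third column has no meaning and is dropped.
    return m if residual else m[:, :2]
```

Dropping the third column is the only difference between the two forms, so a decoder that does not support Residual Coding can work with the five-row, two-column matrix throughout.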
Out of the five-channel signals generated as described so far, signals of two channels are respectively converted to incoherent signals through processes performed by the decorrelators 902 and 903. The signals of the five channels in total, including the signals of the two channels converted in this way and the unconverted signals of the remaining three channels, are converted through the process of the post-mixing matrix M2 (904), so that the five-channel signals are generated as outputs. This signal processing is realized by a five-row*five-column matrix arithmetic expression.
For the sake of simplification, a five-row, five-column matrix arithmetic expression is given as one example here. Note that this is intended for the case of five channels including front two channels, rear two channels, and a center channel. Thus, when an LFE channel is added, this matrix would have six rows and five columns. Moreover, when a decorrelator is used for a so-called Ttt Element described in Non-patent document 1, this matrix would have six rows and six columns since one channel is added to the input side of the present matrix arithmetic.
Here, the elements (coefficients) of each matrix used in the matrix arithmetic are generated on the basis of parameters encoded from the channel level differences, the inter-channel correlations (phase differences), and the channel prediction coefficients among the original five-channel signals.
First, information of the encoded channel level differences, inter-channel correlations (phase differences), and channel prediction coefficients is decoded, so as to obtain the channel level differences, the inter-channel phase differences, and the prediction coefficients which are required when the determinant generation units 905 and 907 divide the two-channel signals into the five-channel signals.
These encoded signals are updated for each frame, which is a predetermined time interval. For this reason, the interpolation units 906 and 908 perform smoothing on the values of the level difference and the phase difference in order to smooth out variations between a current frame and a preceding frame. In this way, each element of the matrix arithmetic expressions of the pre-mixing matrix M1 (901) and the post-mixing matrix M2 (904) is determined. The process of determining each element of the matrix arithmetic expressions is not particularly related to the scope of the present invention and, therefore, the detailed explanation is omitted here.
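The smoothing between a preceding frame and a current frame can be pictured with the following Python sketch. Linear interpolation across the time slots of a frame is an assumption chosen for simplicity, and the function name `interpolate_matrices` and the parameter `num_slots` are hypothetical; the section does not specify the interpolation rule actually used by the interpolation units 906 and 908.

```python
import numpy as np

def interpolate_matrices(prev, curr, num_slots):
    """Linearly interpolate matrix elements from the previous frame to the current one.

    prev, curr: matrices of identical shape (e.g. 5x2 for M1 or 5x5 for M2).
    num_slots: number of time slots within the frame (hypothetical parameter).
    Returns an array of shape (num_slots, rows, cols); the last slot equals curr.
    """
    prev = np.asarray(prev, dtype=float)
    curr = np.asarray(curr, dtype=float)
    # Weights 1/num_slots, 2/num_slots, ..., 1 move gradually from prev to curr.
    weights = np.arange(1, num_slots + 1) / num_slots
    return prev[None, :, :] + weights[:, None, None] * (curr - prev)[None, :, :]
```

Each time slot then uses its own interpolated matrix, so the coefficients change gradually instead of jumping at the frame boundary.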
Moreover, Non-patent document 1 describes that the processing performed by the decorrelators 902 and 903 generates a signal that is incoherent with the input signal in terms of temporal characteristics while maintaining the frequency characteristics of the input signal, and also describes that lattice all-pass filters are used for this purpose.
Non-patent document 1: J. Herre, et al., “The Reference Model Architecture for MPEG Spatial Audio Coding”, 118th AES Convention, Barcelona, May 28-31, 2005, Audio Engineering Society Convention Paper 6447.