The parametric stereophonic encoding technique is adopted in the high-efficiency advanced audio coding (HE-AAC) version 2 standard (hereinafter referred to as “HE-AAC v2”), one of the MPEG-4 Audio standards. As an audio compression technique, parametric stereophonic encoding substantially improves the coding efficiency of low-bit-rate stereophonic signals, and is well suited to applications in mobile devices, broadcasting, and the Internet.
FIG. 16 illustrates a model for stereophonic recording. In this model, two microphones #1 and #2, namely, microphones 16011 and 16012, pick up a sound emitted from a sound source x(t). Here, c1x(t) represents a direct wave reaching the microphone 16011 and c2h(t)*x(t) represents a reflected wave reaching the microphone 16011 after being reflected off walls of a room. Here, t is time, and h(t) is an impulse response representing transfer characteristics of the room. The symbol “*” represents a convolution operation, and c1 and c2 represent gain. Similarly, c3x(t) represents a direct wave reaching the microphone 16012 and c4h(t)*x(t) is a reflected wave reaching the microphone 16012. Let l(t) and r(t) represent respectively the signals picked up by the microphone 16011 and the microphone 16012, and l(t) and r(t) are linear sums of the direct wave and the reflected wave as below:

$$l(t) = c_1 x(t) + c_2 h(t) * x(t) \tag{1}$$

$$r(t) = c_3 x(t) + c_4 h(t) * x(t) \tag{2}$$
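The recording model of equations (1) and (2) can be illustrated with a minimal sketch. The source samples, the room impulse response `h`, and the gain values below are purely illustrative assumptions, and the helper names `convolve` and `record` are hypothetical; the convolution is truncated to the length of the source for simplicity.

```python
def convolve(h, x):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            out[i + j] += hv * xv
    return out


def record(x, h, c_direct, c_reflect):
    """One microphone signal: c_direct*x(t) + c_reflect*(h*x)(t)."""
    reflected = convolve(h, x)[:len(x)]  # truncate to the source length
    return [c_direct * xv + c_reflect * rv for xv, rv in zip(x, reflected)]


x = [1.0, 0.0, 0.0, 0.0]    # toy source: a unit impulse (assumption)
h = [0.0, 0.5, 0.25]        # hypothetical room impulse response
l = record(x, h, 1.0, 0.5)  # c1 = 1.0, c2 = 0.5 (illustrative gains)
r = record(x, h, 0.8, 0.6)  # c3 = 0.8, c4 = 0.6 (illustrative gains)
```

Both channels share the same source and the same room response and differ only in their gains, which is the property the decoder-side approximation relies on.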
Since an HE-AAC v2 decoder cannot obtain a signal equivalent to the sound source x(t) illustrated in FIG. 16, a stereophonic signal is approximately derived from a monophonic signal s(t). The first term and the second term of the following equations (3) and (4) approximate a direct wave and a reflected wave (reverberation component), respectively:

$$l'(t) = c'_1 s(t) + c'_2 h'(t) * s(t) \tag{3}$$

$$r'(t) = c'_3 s(t) + c'_4 h'(t) * s(t) \tag{4}$$
A variety of methods of producing the reverberation component are available. For example, a parametric stereophonic (hereinafter referred to as PS) decoder complying with the HE-AAC v2 standard decorrelates (orthogonalizes) a monophonic signal s(t) in order to generate a reverberation signal d(t) and generates a stereophonic signal in accordance with the following equations:

$$l'(t) = c'_1 s(t) + c'_2 d(t) \tag{5}$$

$$r'(t) = c'_3 s(t) + c'_4 d(t) \tag{6}$$
For convenience of explanation, the process described above has been presented in the time domain. The PS decoder, however, performs the pseudo-stereophonic operation in the time-frequency domain (quadrature mirror filter bank (QMF) coefficient domain). Equations (5) and (6) are thus represented by the following equations (7) and (8), respectively:

$$l'(b,t) = h_{11} s(b,t) + h_{12} d(b,t) \tag{7}$$

$$r'(b,t) = h_{21} s(b,t) + h_{22} d(b,t) \tag{8}$$
where b is an index representing frequency, and t is an index representing time.
A method of producing a reverberation signal d(b,t) from a monophonic signal s(b,t) is described below. A variety of techniques are available to generate the reverberation signal d(b,t). The PS decoder complying with the HE-AAC v2 standard decorrelates (orthogonalizes) the monophonic signal s(b,t) as illustrated in FIG. 17 into the reverberation signal d(b,t) using an infinite impulse response (IIR) type all-pass filter.
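As a rough illustration of an IIR all-pass filter, the sketch below implements a generic first-order real-valued all-pass section. This is an assumption made for illustration only: it is not the filter structure specified by HE-AAC v2, it operates on real samples rather than on QMF coefficients, and the function name `allpass_first_order` and the coefficient `a = 0.5` are hypothetical.

```python
def allpass_first_order(x, a):
    """Filter x with H(z) = (-a + z^-1) / (1 - a*z^-1), where |a| < 1.

    The magnitude response is 1 at every frequency, so only the phase
    of the input is altered.
    """
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for xn in x:
        yn = -a * xn + x_prev + a * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


impulse = [1.0] + [0.0] * 63  # toy input: a unit impulse
d = allpass_first_order(impulse, a=0.5)
energy_in = sum(v * v for v in impulse)
energy_out = sum(v * v for v in d)  # approximately equal to energy_in
```

The output energy matches the input energy up to truncation of the infinite impulse response, while the impulse is smeared over time; this energy-preserving phase change is what makes an all-pass structure a natural candidate for generating a reverberation-like signal.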
FIG. 18 illustrates a relationship among an input signal (L, R), a monophonic signal s, and a reverberation signal d. As illustrated in FIG. 18, let α represent the angle made between the monophonic signal s and each of the input signal L and the input signal R, and cos(2α) is defined as a similarity. An HE-AAC v2 encoder encodes α as similarity information. The similarity information represents a similarity between the L channel input signal and the R channel input signal.
For simplicity of explanation, the lengths of L and R are equal to each other in FIG. 18. Considering the case in which the lengths (norms) of L and R are different from each other, the norm ratio of L to R is defined as an intensity difference. The encoder thus encodes the norm ratio as intensity difference information. The intensity difference information thus represents the power ratio of the L channel input signal to the R channel input signal.
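The two parameters can be sketched numerically as follows. This is a minimal sketch under stated assumptions: L and R are treated as real-valued vectors, the monophonic signal is assumed to bisect the angle between them so that this angle equals 2α, and the function name `similarity_and_intensity` is hypothetical. Quantization, which an actual encoder would apply, is not shown.

```python
import math


def similarity_and_intensity(l, r):
    """Return (cos(2*alpha), alpha, norm ratio of L to R) for two vectors."""
    dot = sum(a * b for a, b in zip(l, r))
    norm_l = math.sqrt(sum(a * a for a in l))
    norm_r = math.sqrt(sum(b * b for b in r))
    similarity = dot / (norm_l * norm_r)         # cos(2*alpha)
    clipped = max(-1.0, min(1.0, similarity))    # guard against rounding
    alpha = 0.5 * math.acos(clipped)
    return similarity, alpha, norm_l / norm_r


# Orthogonal channels of equal length: similarity 0, alpha = pi/4.
sim_a, alpha_a, ratio_a = similarity_and_intensity([1.0, 0.0], [0.0, 1.0])

# Identical direction, L twice as long as R: similarity 1, alpha = 0.
sim_b, alpha_b, ratio_b = similarity_and_intensity([2.0, 0.0], [1.0, 0.0])
```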
A method by which the decoder generates a stereophonic signal from the monophonic signal s(b,t) and the reverberation signal d(b,t) is described below. Referring to FIG. 19, S represents a decoded input signal, D represents a reverberation signal obtained at the decoder, and Cl represents a scale factor of the L channel signal calculated from the intensity difference. The decoded L channel signal is the vector obtained by combining the projection of the monophonic signal scaled by Cl at an angle of α with the projection of the reverberation signal scaled by Cl at an angle of (π/2−α). The process is expressed by equation (9). Similarly, the R channel signal is generated in accordance with equation (10) using a scale factor Cr, the decoded input signal S, the reverberation signal D, and the angle α. Cl and Cr are related as Cl+Cr=2:
$$
\begin{aligned}
L'(b,t) &= C_l\, s(b,t)\cos\alpha + C_l\, d(b,t)\cos\left[\pi/2 - \alpha\right] \\
        &= C_l\, s(b,t)\cos\alpha + C_l\, d(b,t)\sin\alpha
\end{aligned}
\tag{9}
$$

$$
\begin{aligned}
R'(b,t) &= C_r\, s(b,t)\cos(-\alpha) - C_r\, d(b,t)\cos\left[\pi/2 - \alpha\right] \\
        &= C_r\, s(b,t)\cos(-\alpha) + C_r\, d(b,t)\sin(-\alpha)
\end{aligned}
\tag{10}
$$
Equations (9) and (10) are combined as equations (11) and (12):
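$$
\begin{bmatrix} L'(b,t) \\ R'(b,t) \end{bmatrix}
=
\begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix}
\begin{bmatrix} s(b,t) \\ d(b,t) \end{bmatrix}
\tag{11}
$$

$$
H = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix},
\qquad
\begin{aligned}
h_{11} &= C_l \cos\alpha, & h_{12} &= C_l \sin\alpha, \\
h_{21} &= C_r \cos(-\alpha), & h_{22} &= C_r \sin(-\alpha)
\end{aligned}
\tag{12}
$$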
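Equations (11) and (12) can be sketched for a single coefficient as follows. This is only an illustration under assumptions: real-valued scalars stand in for the QMF coefficients s(b,t) and d(b,t), Cr is derived from Cl through the relation between the two scale factors stated above, and the function names `ps_matrix` and `upmix` are hypothetical.

```python
import math


def ps_matrix(alpha, c_l):
    """Build H = [[h11, h12], [h21, h22]] following equation (12)."""
    c_r = 2.0 - c_l  # from the stated relation Cl + Cr = 2
    return (
        (c_l * math.cos(alpha), c_l * math.sin(alpha)),
        (c_r * math.cos(-alpha), c_r * math.sin(-alpha)),
    )


def upmix(h, s, d):
    """Apply equation (11) to one monophonic/reverberation coefficient pair."""
    (h11, h12), (h21, h22) = h
    return h11 * s + h12 * d, h21 * s + h22 * d


# alpha = 0 with equal scale factors: both outputs equal the monophonic input.
l_same, r_same = upmix(ps_matrix(0.0, 1.0), s=1.0, d=0.3)

# alpha = pi/4: the reverberation term is added on L and subtracted on R.
l_wide, r_wide = upmix(ps_matrix(math.pi / 4, 1.0), s=1.0, d=1.0)
```

With α = 0 the reverberation signal does not contribute at all, whereas with α = π/4 it contributes with opposite signs to the two channels, which is how the matrix widens the stereo image.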
A parametric stereophonic decoding apparatus operating on the above-described principle is described below. FIG. 20 illustrates a basic structure of the parametric stereophonic decoding apparatus. A data separator 2001 separates encoded core data and PS data from received input data.
A core decoder 2002 decodes the encoded core data and outputs a monophonic audio signal S(b,t). Here, b represents an index of a frequency band. The core decoder 2002 may be based on a known audio encoding and decoding technique such as an advanced audio coding (AAC) system or a spectral band replication (SBR) system.
The monophonic audio signal S(b,t) and the PS data are input to a parametric stereophonic (PS) decoder 2003. The PS decoder 2003 converts the monophonic audio signal S(b,t) into stereophonic decoded signals L(b,t) and R(b,t) in the frequency domain in accordance with the information of the PS data.
Frequency-time converters 2004(L) and 2004(R) convert an L channel frequency-domain decoded signal L(b,t) and an R channel frequency-domain decoded signal R(b,t) into an L channel time-domain decoded signal L(t) and an R channel time-domain decoded signal R(t), respectively.
FIG. 21 illustrates a structure of the PS decoder 2003 of FIG. 20 in the related art. Based on the principle discussed with reference to FIGS. 16-19, a delay adder 2101 adds a delay to the monophonic audio signal S(b,t) and a decorrelator 2102 decorrelates the delay-added monophonic audio signal S(b,t). A reverberation signal D(b,t) is thus generated.
A PS analyzer 2103 analyzes the PS data, thereby extracting a similarity and an intensity difference from the PS data. As previously discussed with reference to FIG. 18, the similarity is the similarity between the L channel signal and the R channel signal. The similarity is calculated from the L channel input signal and the R channel input signal and then quantized on the encoder. The intensity difference is a power ratio of the L channel signal to the R channel signal. The intensity difference is likewise calculated and then quantized on the encoder.
A coefficient calculator 2104 calculates a coefficient matrix H from the similarity and the intensity difference in accordance with the above-described equation (12). A stereophonic signal generator 2105 generates the stereophonic signals L(b,t) and R(b,t) based on the monophonic audio signal S(b,t), the reverberation signal D(b,t), and the coefficient matrix H in accordance with the above-described equations (11) and (12). The time index t is omitted in FIG. 21 and in equation (13):

$$
\begin{aligned}
L(b) &= h_{11} S(b) + h_{12} D(b) \\
R(b) &= h_{21} S(b) + h_{22} D(b)
\end{aligned}
\tag{13}
$$
In one case, the above-described parametric stereophonic system of the related art may receive audio signals having no substantial correlation between an L channel input signal and an R channel input signal, such as two different language voices in encoded form.
In the parametric stereophonic system, a stereophonic signal is generated from a monophonic signal S on a decoder side. As understood from the above-described equation (13), the property of the monophonic signal S affects the output signals L′ and R′.
For example, if an original L channel input signal is completely different from an original R channel input signal (with the similarity being zero), the output audio signal from the PS decoder 2003 of FIG. 20 is calculated in accordance with equation (14):

$$
\begin{aligned}
L'(b) &= h_{11} S(b) \\
R'(b) &= h_{21} S(b)
\end{aligned}
\tag{14}
$$
In other words, a component of the monophonic signal S appears in both of the output signals L′ and R′. FIG. 22 diagrammatically illustrates how the component of the monophonic signal S appears. The monophonic signal S is the sum of an L channel input signal L and an R channel input signal R. Equation (14) therefore means that the signal of one channel leaks into the other channel.
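The leakage can be made concrete with a toy example. The sketch below assumes two short, non-overlapping real-valued input sequences and illustrative coefficient values h11 = h21 = 0.5; the function names `downmix` and `upmix_without_reverb` are hypothetical, and the upmix uses only the monophonic term, as in equation (14).

```python
def downmix(l_in, r_in):
    """Monophonic signal S as the sum of the L and R input signals."""
    return [a + b for a, b in zip(l_in, r_in)]


def upmix_without_reverb(s, h11, h21):
    """Outputs L' = h11*S and R' = h21*S, following equation (14)."""
    return [h11 * v for v in s], [h21 * v for v in s]


l_in = [1.0, 0.0, 0.0, 0.0]  # L channel active only at index 0
r_in = [0.0, 0.0, 1.0, 0.0]  # R channel active only at index 2
s = downmix(l_in, r_in)
l_out, r_out = upmix_without_reverb(s, h11=0.5, h21=0.5)

# l_out is nonzero at index 2 although l_in was silent there,
# and r_out is nonzero at index 0 although r_in was silent there.
```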
The parametric stereophonic decoding apparatus of the related art emits similar sounds from the left and right if the output signals L′ and R′ are heard at the same time. The user may hear the similar sound as an echo, with the sound quality degraded.