MPEG Surround (MPS) is an audio codec for multi-channel signals, such as 5.1-channel and 7.1-channel audio. It is an encoding and decoding technique that compresses and transmits a multi-channel signal at a high compression ratio. MPS is constrained by backward compatibility in its encoding and decoding processes. Thus, a bitstream compressed via MPS and transmitted to a decoder must remain reproducible in a mono or stereo format even by an earlier audio codec.
Accordingly, even when the number of input channels forming a multi-channel signal increases, the bitstream transmitted to a decoder needs to include an encoded mono signal or stereo signal. The decoder may further receive additional information for upmixing the mono signal or stereo signal transmitted through the bitstream, and may reconstruct the multi-channel signal from the mono signal or stereo signal using that additional information.
Ultimately, audio compressed in the MPS format is represented in the mono or stereo format and thus, owing to backward compatibility, is reproducible even by a general audio codec rather than only by an MPS decoder.
In recent years, audio-video (AV) equipment has been required to process ultrahigh-quality audio. Accordingly, a novel technology for compressing and transmitting ultrahigh-quality audio is needed. For ultrahigh-quality audio, faithful rendering of the sound quality and sound field of the original audio is more important than backward compatibility. For instance, 22.2-channel audio, which is intended to reproduce an ultrahigh-quality sound field, needs a high-quality multi-channel coding technique that lets the decoder render the sound quality and sound field effects of the original audio as they are, rather than a compression and transmission technique that provides backward compatibility, such as MPS.
MPS is an audio coding technique that is basically designed to process 5.1-channel audio while providing backward compatibility. Thus, MPS downmixes and analyzes a multi-channel signal to produce a mono signal or stereo signal. The additional information obtained in the analysis process is a spatial cue, and the decoder may upmix the mono signal or stereo signal using the spatial cue to reconstruct the original multi-channel signal.
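The downmix-and-analyze step may be illustrated with a minimal sketch. The equal downmix weights, the broadband level ratio used as a stand-in for a spatial cue, and the function names `downmix_to_mono` and `level_cues_db` are assumptions made for illustration only; an actual MPS encoder derives its cues per time and frequency band according to the standard.

```python
import math

def downmix_to_mono(channels):
    """Average equally long channels sample by sample.

    Equal weights are an assumption of this sketch, not the MPS downmix rule.
    """
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels.values())]

def level_cues_db(channels, mono, eps=1e-12):
    """Energy of each channel relative to the downmix, in dB.

    A hypothetical broadband cue standing in for a spatial cue.
    """
    mono_energy = sum(x * x for x in mono) + eps
    return {
        name: 10.0 * math.log10((sum(x * x for x in samples) + eps) / mono_energy)
        for name, samples in channels.items()
    }

# Toy two-sample signals for the five full-band channels of a 5.1 layout.
channels = {
    "L":  [1.0, 2.0],
    "R":  [3.0, 4.0],
    "Ls": [0.5, 0.5],
    "Rs": [0.5, 0.5],
    "C":  [2.0, 2.0],
}
m0 = downmix_to_mono(channels)      # signal carried in the bitstream
cues = level_cues_db(channels, m0)  # side information sent alongside it
```

In this sketch the decoder would receive only `m0` and `cues`, which is why the compression ratio is high while the individual channel waveforms are lost.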
Here, the decoder generates a decorrelated audio signal during upmixing so as to reproduce the sound field rendered by the original multi-channel signal. The decoder may reproduce the sound field effect of the multi-channel signal using the decorrelated audio signal, which is necessary for reproducing the width or depth of the sound field of the original multi-channel signal. The decorrelated audio signal may be generated by applying a filtering operation to the downmixed signal in the mono or stereo format transmitted from the encoder.
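The filtering operation may be sketched as follows. A time-domain Schroeder all-pass filter is substituted here for the decorrelation filters actually specified for MPS, and the function name `allpass_decorrelate`, the delay values, and the gain are hypothetical. The sketch only shows that a signal with the same magnitude spectrum but a different phase can be derived from the downmix alone.

```python
def allpass_decorrelate(x, delay, gain=0.5):
    """Schroeder all-pass filter: y[n] = -g*x[n] + x[n-D] + g*y[n-D].

    Illustrative stand-in for a decorrelator; D (delay) and g (gain)
    are assumed values, not parameters taken from the MPS standard.
    """
    y = []
    for n, sample in enumerate(x):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y.append(-gain * sample + x_delayed + gain * y_delayed)
    return y

# Four decorrelated versions of one mono downmix, each with a different
# (assumed) delay so that the outputs also differ from one another.
m0 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
decorrelated = [allpass_decorrelate(m0, d) for d in (2, 3, 5, 7)]
```

Every output in `decorrelated` is computed from `m0` only, so no information beyond the downmix is added by this step.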
The process by which the decoder reconstructs 5.1-channel audio using MPS upmixing is described below. Equation 1 shows the upmixing matrix.
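\[
\begin{bmatrix}
L_{synth} \\ R_{synth} \\ Ls_{synth} \\ Rs_{synth} \\ C_{synth}
\end{bmatrix}
=
\underbrace{
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\
a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \\
a_{51} & 0 & 0 & 0 & 0
\end{bmatrix}
}_{\text{upmixing matrix}}
\begin{bmatrix}
m_0 \\ d_{m_0}^{0} \\ d_{m_0}^{1} \\ d_{m_0}^{2} \\ d_{m_0}^{3}
\end{bmatrix}
\qquad \text{[Equation 1]}
\]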
In Equation 1, the upmixing matrix may be generated based on a spatial cue transmitted from the encoder. Inputs to the upmixing matrix include a downmixed signal m0, generated from {L, R, Ls, Rs, C}, and signals dm0i decorrelated from the downmixed signal. That is, the original multi-channel signals may be reconstructed as {Lsynth, Rsynth, Lssynth, Rssynth, Csynth} by applying the upmixing matrix in Equation 1 to the downmixed signal m0 and the decorrelated signals dm0i.
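The application of the upmixing matrix may be sketched per sample as below. The function name `upmix` and the matrix values are placeholders chosen for illustration; in MPS the coefficients would be derived from the transmitted spatial cue. Only the structure follows Equation 1: five output channels, five inputs, and a center row that uses the downmixed signal alone.

```python
def upmix(matrix, m0, decorrelated):
    """Apply a 5x5 upmixing matrix to [m0, d0, d1, d2, d3] for every sample."""
    names = ("L", "R", "Ls", "Rs", "C")
    out = {name: [] for name in names}
    for inputs in zip(m0, *decorrelated):
        for name, row in zip(names, matrix):
            out[name].append(sum(a * v for a, v in zip(row, inputs)))
    return out

# Placeholder coefficients (assumed values, not derived from a real spatial cue).
# The last row has zeros for every decorrelated input, as in Equation 1.
matrix = [
    [1.0, 1.0, 0.0, 0.0, 0.0],  # L
    [1.0, 0.0, 1.0, 0.0, 0.0],  # R
    [1.0, 0.0, 0.0, 1.0, 0.0],  # Ls
    [1.0, 0.0, 0.0, 0.0, 1.0],  # Rs
    [0.5, 0.0, 0.0, 0.0, 0.0],  # C
]
m0 = [1.0, 2.0]
decorrelated = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
synth = upmix(matrix, m0, decorrelated)
```

With these placeholder values, the four non-center outputs differ from one another only through their decorrelated inputs, which illustrates how strongly the reconstructed sound field can depend on those signals.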
Here, a problem may arise when the sound field effects of the original multi-channel signals are reproduced through MPS. Specifically, as described above, the decoder uses decorrelated signals to reproduce the sound field effects of a multi-channel signal. However, since the decorrelated signals are artificially generated from the downmixed signal m0 in the mono format, the more the sound field effects of the multi-channel signals depend on the decorrelated signals, the more the sound quality of the reconstructed multi-channel signals may deteriorate.
In particular, when the multi-channel signals are reconstructed by MPS, a plurality of decorrelated signals is needed. When the downmixed signal transmitted from the encoder is in a mono format, a plurality of decorrelated signals is necessarily used to render the sound field of the original multi-channel signals from the downmixed signal. Thus, when the original multi-channel signals are reconstructed from a mono downmix, compression efficiency is achieved and the sound field is reproduced to a certain level, but sound quality may deteriorate.
That is, the conventional MPS method is limited in its ability to reconstruct an ultrahigh-quality multi-channel signal. To overcome this limit, the encoder may transmit a residual signal to the decoder so that the residual signal replaces a decorrelated signal. However, transmitting a residual signal is less efficient in terms of compression than transmitting the original channel signal.