Multi-channel signals are widely used in scenarios such as telephone conferencing and games, and increasing emphasis is being placed on the encoding and decoding of multi-channel signals. When encoding a multi-channel signal, conventional encoders based on waveform encoding, such as Moving Picture Experts Group (MPEG) Audio Layer II, Moving Picture Experts Group Audio Layer III (MP3) and Advanced Audio Coding (AAC), all encode each channel independently. This encoding method can restore the multi-channel signal well, but the required bandwidth and encoding bit rate are several times those required for a mono-channel signal.
One stereo or multi-channel encoding technology is parameter stereo encoding, which uses only a small amount of bandwidth to reconstruct a multi-channel signal whose auditory impression is the same as that of the original signal. The basic idea of parameter stereo encoding is as follows. At the encoding end, the multi-channel signal is down-mixed into a mono-channel signal, and the mono-channel signal is encoded independently; meanwhile, channel parameters between the channels are extracted and then encoded. At the decoding end, the down-mixed mono-channel signal is decoded first, then the channel parameters between the channels are decoded, and finally these channel parameters are used together with the down-mixed mono-channel signal to synthesize a multi-channel signal.
In parameter stereo encoding, the channel parameters generally used to describe the interrelations between channels include an inter-channel time difference parameter (that is, a channel delay parameter), an inter-channel amplitude difference parameter and an inter-channel correlation parameter. The channel delay parameter represents the delay relationship between channels and plays an important role in locating the position of a speaker.
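The three kinds of channel parameters named above can be illustrated with a minimal sketch. It assumes time-domain estimation over a single frame, a delay found as the lag that maximizes the cross-correlation, an amplitude difference expressed as an energy ratio in decibels, and a correlation normalized by the channel energies. The function names, the sign convention for the delay and the `max_lag` search range are hypothetical and are not taken from any particular encoder.

```python
import math


def cross_correlation(left, right, lag):
    """Sum of left[n] * right[n + lag] over the samples where both exist."""
    n = len(left)
    total = 0.0
    for i in range(n):
        j = i + lag
        if 0 <= j < n:
            total += left[i] * right[j]
    return total


def estimate_channel_parameters(left, right, max_lag):
    """Return (delay_in_samples, level_difference_db, correlation).

    Assumed convention: a positive delay means the right channel lags
    the left channel. Both channels are assumed to be non-silent.
    """
    # Channel delay parameter: the lag with the largest cross-correlation.
    delay = max(range(-max_lag, max_lag + 1),
                key=lambda lag: cross_correlation(left, right, lag))
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    # Amplitude difference parameter, expressed as an energy ratio in dB.
    level_db = 10.0 * math.log10(energy_left / energy_right)
    # Correlation parameter, normalized and evaluated at the estimated delay.
    correlation = (cross_correlation(left, right, delay)
                   / math.sqrt(energy_left * energy_right))
    return delay, level_db, correlation


left = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0]
# Right channel: left channel at half amplitude, delayed by 3 samples.
right = [0, 0, 0, 0, 0, 0.5, 1, 0.5, 0, 0, 0, 0]
delay, level_db, correlation = estimate_channel_parameters(left, right, 5)
```

In this example the right channel is a scaled and delayed copy of the left channel, so the estimated delay is 3 samples, the energy ratio is 4 (about 6 dB), and the normalized correlation is 1.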
Taking a stereo signal as an example, a solution for transmitting a multi-channel signal in the prior art is as follows. A channel delay parameter between the left channel and the right channel is extracted by utilizing the correlation between the stereo left channel signal and the stereo right channel signal. At the encoding end, delay adjustment is performed, by utilizing the channel delay parameter, on the left and right channel signals of the stereo signal to be transmitted, thereby eliminating the delay difference between the two channels. Then, the delay-adjusted left and right channel signals are added in the time domain to obtain a down-mixed M signal (sum signal), and are subtracted from each other in the time domain to obtain a down-mixed S signal (side signal).
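The delay adjustment and time-domain down-mixing described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any particular prior-art encoder: the delay is an integer number of samples, only the right channel is shifted, samples shifted in from outside the frame are set to zero, and the sum and difference are left unscaled, as in the description (a scaling factor such as 0.5 is a common variant). The names `shift` and `downmix` are hypothetical.

```python
def shift(signal, delay):
    """Delay a signal by `delay` samples (advance it if negative), zero-padded."""
    n = len(signal)
    return [signal[i - delay] if 0 <= i - delay < n else 0.0 for i in range(n)]


def downmix(left, right, delay):
    """Align the right channel to the left one, then form the M and S signals.

    `delay` is the assumed channel delay parameter: the number of samples
    by which the right channel lags the left channel.
    """
    aligned_right = shift(right, -delay)  # remove the inter-channel delay
    m = [l + r for l, r in zip(left, aligned_right)]  # sum signal
    s = [l - r for l, r in zip(left, aligned_right)]  # side signal
    return m, s


left = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]  # left delayed by 2 samples
m, s = downmix(left, right, 2)
```

Because the two channels are identical once the delay is removed, the M signal here is twice the left channel and the S signal is zero.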
Then, other channel parameters, such as an energy ratio between the left channel and the right channel or an inter-channel amplitude difference parameter, are extracted according to the M signal and the S signal. At the encoding end, the channel parameters are encoded for transmission, and the M signal is encoded for transmission in the mono-channel manner. At the decoding end, the M signal is reconstructed first, and then, according to the received channel delay parameter, a delay operation that is the reverse of the one performed at the encoding end is performed on the M signal for each channel, so as to reconstruct the transmitted stereo signal. Therefore, on the basis of transmitting a mono-channel signal, a stereo signal can be reconstructed at the decoding end as long as a small amount of bit rate is provided to transmit the channel parameters.
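The decoding-end reconstruction described above can be sketched under simplifying assumptions that the description does not itself state: the M signal is the unscaled sum of the delay-adjusted channels, the two channels are fully correlated, the received parameters are a left-to-right energy ratio and an integer delay, and the delay is re-applied to the right channel only. The function names and the gain formulas are hypothetical and follow from these assumptions.

```python
import math


def shift(signal, delay):
    """Delay a signal by `delay` samples (advance it if negative), zero-padded."""
    n = len(signal)
    return [signal[i - delay] if 0 <= i - delay < n else 0.0 for i in range(n)]


def upmix(m, energy_ratio, delay):
    """Rebuild left/right from the M signal and two channel parameters.

    Assumes M = left + aligned_right with aligned_right = a * left, so that
    energy_ratio = E_left / E_right = 1 / a**2.
    """
    a = 1.0 / math.sqrt(energy_ratio)  # right amplitude relative to left
    left = [x / (1.0 + a) for x in m]
    aligned_right = [a * x / (1.0 + a) for x in m]
    # Reverse of the encoder's delay adjustment: put the delay back.
    right = shift(aligned_right, delay)
    return left, right


# M signal of a left channel [0, 1, 2, 1, ...] and a half-amplitude right channel.
m = [0.0, 1.5, 3.0, 1.5, 0.0, 0.0, 0.0, 0.0]
left, right = upmix(m, energy_ratio=4.0, delay=2)
```

With an energy ratio of 4 and a delay of 2 samples, the sketch returns the left channel at full amplitude and the right channel at half amplitude, shifted two samples later.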
In implementing the present invention, the inventors found that at least the following problem exists in the prior art. A comb filtering effect may occur in the processed signals obtained after the down-mixing processing (that is, the M signal and the S signal): the frequency domain amplitude of at least one of the M signal and the S signal is greatly attenuated in certain frequency bands and strengthened in certain other frequency bands. The comb filtering effect degrades the quality of the processed signal, thereby affecting the quality of the reconstructed multi-channel signal.
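As a general illustration of a comb filtering effect, the sketch below evaluates the amplitude gain that results when a pure tone is added to, or subtracted from, a copy of itself delayed by d samples. The gains are |1 + e^(-jwd)| and |1 - e^(-jwd)|, where w is the normalized angular frequency. This only shows how such an effect can arise in principle; the exact conditions under which it occurs in the prior art are not stated here, and the sample rate and delay values are arbitrary assumptions.

```python
import cmath
import math


def sum_gain(freq_hz, delay_samples, sample_rate):
    """Amplitude gain when a tone is added to its delayed copy."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    return abs(1.0 + cmath.exp(-1j * w * delay_samples))


def difference_gain(freq_hz, delay_samples, sample_rate):
    """Amplitude gain when the delayed copy is subtracted from the tone."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    return abs(1.0 - cmath.exp(-1j * w * delay_samples))


SAMPLE_RATE = 16000  # Hz, assumed for illustration
DELAY = 8            # samples, i.e. 0.5 ms at 16 kHz

# Gains of the sum and the difference at a few frequencies.
gains = {f: (sum_gain(f, DELAY, SAMPLE_RATE),
             difference_gain(f, DELAY, SAMPLE_RATE))
         for f in (500, 1000, 1500, 2000)}
```

With these values the sum is cancelled at 1000 Hz and doubled at 2000 Hz, while the difference behaves in the opposite way. The alternating notches and peaks along the frequency axis are what give the effect its comb-like shape.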