As quality of life improves, the demand for high-quality audio keeps increasing. Compared with mono audio, stereo audio conveys a sense of orientation and a sense of distribution for each acoustic source, and can improve the clarity, intelligibility, and sense of presence of the information. Therefore, stereo audio is highly favored.
A time domain stereo encoding and decoding technology is a common stereo encoding and decoding technology in the prior art. In the existing time domain stereo encoding technology, an input signal is usually downmixed into two mono signals in the time domain, for example, by using a Mid/Side (M/S) encoding method. First, a left channel and a right channel are downmixed into a mid channel and a side channel. The mid channel is 0.5*(L+R) and represents information about the correlation between the two channels. The side channel is 0.5*(L−R) and represents information about the difference between the two channels. Here, L represents a left channel signal, and R represents a right channel signal. Then, the mid channel signal and the side channel signal are separately encoded using a mono encoding method. The mid channel signal is usually encoded using a relatively large quantity of bits, and the side channel signal is usually encoded using a relatively small quantity of bits.
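The M/S downmix described above can be illustrated with a minimal Python sketch. The function names ms_downmix and ms_upmix are hypothetical and chosen only for illustration. The inverse step is not described in the text above; it is included as an assumption that follows algebraically from the two downmix formulas (L = M + S and R = M − S). The mono encoding of each channel and the bit allocation are omitted.

```python
def ms_downmix(left, right):
    """Downmix left/right sample sequences into mid and side channels.

    mid  = 0.5 * (L + R)  (what the two channels have in common)
    side = 0.5 * (L - R)  (how the two channels differ)
    """
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side


def ms_upmix(mid, side):
    """Recover left/right from mid/side (assumed inverse: L = M + S, R = M - S)."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Illustrative sample values only; they are not taken from any real signal.
left = [1.0, 0.5, -0.25]
right = [0.5, 0.5, 0.25]
mid, side = ms_downmix(left, right)
restored_left, restored_right = ms_upmix(mid, side)
```

Where the two channels carry the same sample value, the side channel is zero. This is consistent with the side channel signal usually being encoded using a relatively small quantity of bits.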
When a stereo audio signal is encoded using the existing stereo encoding method, the signal type of the stereo audio signal is not considered. Consequently, the sound image of the synthesized stereo audio signal obtained after encoding is unstable, a drift phenomenon occurs, and encoding quality needs to be improved.