As quality of life improves, personal demand for high-quality audio is increasing. Compared with monaural audio, stereo audio improves the definition and intelligibility of information, and is therefore popular.
When stereo audio is processed in the prior art, an input stereo audio signal is first parsed to obtain an inter-channel level difference (ICLD) value for each sub-band in the frame that carries the stereo audio signal, and the obtained ICLD value is then compared with the ICLD values obtained for previous frames. When the difference between the ICLD value and the ICLD values of the previous frames is large, the stereo audio signal carried by the frame is classified as Transient; otherwise, it is classified as Normal. For Transient, two frames are used for transmission, that is, the ICLDs of the odd-numbered sub-bands and the ICLDs of the even-numbered sub-bands are transmitted separately. For Normal, four frames are used for transmission, that is, each frame transmits the ICLDs of a quarter of the sub-bands. To keep the quantity of bits consistent, refinement processing is further performed on Normal.
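The prior-art flow described above can be illustrated with a minimal Python sketch. It is not taken from any particular codec: the function names `classify_frame` and `split_sub_bands`, the mean-absolute-difference measure, the 3.0 threshold, and the contiguous grouping of sub-bands into quarters are all assumptions made for illustration, and the refinement processing applied to Normal is not modeled.

```python
from typing import List


def classify_frame(icld: List[float],
                   history: List[List[float]],
                   threshold: float = 3.0) -> str:
    """Label a frame 'Transient' or 'Normal' from its per-sub-band ICLD values.

    Assumption: the "difference" is the mean absolute deviation from the
    per-sub-band average of the previous frames, and 3.0 is an arbitrary
    illustrative threshold.
    """
    if not history:
        return "Normal"  # nothing to compare against yet
    n = len(icld)
    reference = [sum(frame[b] for frame in history) / len(history)
                 for b in range(n)]
    diff = sum(abs(icld[b] - reference[b]) for b in range(n)) / n
    return "Transient" if diff > threshold else "Normal"


def split_sub_bands(num_sub_bands: int, mode: str) -> List[List[int]]:
    """Return the sub-band numbers (1-based) carried by each transmission frame."""
    bands = range(1, num_sub_bands + 1)
    if mode == "Transient":
        # Two frames: odd-numbered sub-bands, then even-numbered sub-bands.
        return [[b for b in bands if b % 2 == 1],
                [b for b in bands if b % 2 == 0]]
    # Normal: four frames, each carrying about a quarter of the sub-bands.
    # Assumption: quarters are contiguous ranges of sub-band numbers.
    bounds = [i * num_sub_bands // 4 for i in range(5)]
    return [list(range(bounds[i] + 1, bounds[i + 1] + 1)) for i in range(4)]


if __name__ == "__main__":
    previous = [[0.0] * 8, [0.0] * 8]
    mode = classify_frame([10.0] * 8, previous)
    print(mode, split_sub_bands(8, mode))
```

With eight sub-bands, a Transient frame spreads its ICLDs over two frames and a Normal frame over four, so a decoder has a complete ICLD set only after all frames of the group have arrived.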
However, when stereo audio is processed using the prior art, the frame that carries the stereo audio is relatively long. For example, when each frame carries 10 milliseconds (ms) of stereo audio and Normal is processed using four frames, an ICLD is updated only every 40 ms (4 × 10 ms), which cannot ensure the quality of the decoded stereo audio when the signal changes quickly or when packets are lost. In addition, if the ICLD is instead transmitted frame by frame, low bit-rate transmission of the stereo audio signal cannot be implemented.
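The update-interval arithmetic can be checked with a short sketch. The helper name `icld_update_interval_ms` is hypothetical, and the 10 ms frame duration is the example value used in the text.

```python
def icld_update_interval_ms(frame_ms: int, frames_per_update: int) -> int:
    """Time between complete ICLD updates when one update spans several frames."""
    return frame_ms * frames_per_update


if __name__ == "__main__":
    # Normal spreads one ICLD update over four frames, Transient over two.
    print("Normal:", icld_update_interval_ms(10, 4), "ms")
    print("Transient:", icld_update_interval_ms(10, 2), "ms")
```

Under these assumptions, a single lost packet in Normal mode leaves a quarter of the sub-bands without a fresh ICLD for a further 40 ms.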