Transmission of voice (also referred to as speech) and music by digital techniques has become widespread and has been incorporated into a wide range of devices, including wireless communication devices, personal digital assistants (PDAs), laptop computers, desktop computers, mobile and/or satellite radio telephones, and the like. An exemplary field is wireless communications. The field of wireless communications has many applications including, e.g., cordless telephones, paging, wireless local loops, wireless telephony such as cellular and PCS telephone systems, mobile Internet Protocol (IP) telephony, and satellite communication systems.
In telecommunications networks, information is transferred in an encoded form between a transmitting communication device and a receiving communication device. The transmitting communication device encodes original information, such as voice signals and/or music signals, into encoded information and sends it to the receiving communication device. The receiving communication device decodes the received encoded information to recreate the original information. The encoding and decoding are performed using codecs. The encoding of voice signals and/or music signals is performed in a codec located in the transmitting communication device, and the decoding is performed in a codec located in the receiving communication device.
In modern codecs, multiple coding modes are included to handle different types of input sources, such as speech, music, and mixed content. For best performance, the optimal coding mode for each frame of the input signal should be selected and used. Accurate classification of each frame is necessary for selecting the most efficient coding scheme and achieving the lowest data rate.
This classification can be carried out in an open-loop manner to reduce complexity. In this case, the optimal mode classifier should take major features of the various coding modes into account. Some modes (such as speech coding modes like algebraic code excited linear prediction (ACELP)) contain an adaptive codebook (ACB) that exploits correlation between the past and current frames. Some other modes (such as modified discrete cosine transform (MDCT) coding modes for music/audio) may not contain such a feature. Thus, it is important to ensure that input frames having high correlation with the previous frame are classified into a mode that has an ACB or that includes other inter-frame correlation modeling techniques.
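By way of illustration only, the following Python sketch shows one way an open-loop decision could take inter-frame correlation into account. It is not taken from any particular codec. The function names, the lag search range, and the threshold value are hypothetical and chosen purely for the example, and a practical classifier would typically combine several features rather than rely on a single correlation measure.

```python
import math


def max_normalized_correlation(prev_frame, cur_frame, min_lag, max_lag):
    """Return the largest normalized correlation between the current frame
    and the signal delayed by each candidate lag (hypothetical helper)."""
    history = list(prev_frame) + list(cur_frame)
    start = len(prev_frame)
    # A lag cannot reach further back than the samples that are available.
    max_lag = min(max_lag, start)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        num = energy_cur = energy_past = 0.0
        for n in range(len(cur_frame)):
            a = history[start + n]        # current-frame sample
            b = history[start + n - lag]  # sample delayed by `lag`
            num += a * b
            energy_cur += a * a
            energy_past += b * b
        den = math.sqrt(energy_cur * energy_past)
        if den > 0.0:
            best = max(best, num / den)
    return best


def select_mode(prev_frame, cur_frame, threshold=0.8, min_lag=20, max_lag=160):
    """Pick a coding mode label for the current frame.

    The threshold and lag range are illustrative assumptions only.
    """
    corr = max_normalized_correlation(prev_frame, cur_frame, min_lag, max_lag)
    # High correlation with the past favors a mode with an adaptive codebook.
    return "ACELP" if corr >= threshold else "MDCT"


if __name__ == "__main__":
    # A strongly periodic signal spanning two 160-sample frames.
    tone = [math.sin(2.0 * math.pi * n / 40.0) for n in range(320)]
    print(select_mode(tone[:160], tone[160:]))
```

In this sketch, a frame that closely repeats the preceding signal at some lag is routed to the mode labeled "ACELP", while a frame showing little correlation with the past is routed to the mode labeled "MDCT".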
Previous solutions have used closed-loop mode decisions (e.g., AMR-WB+, USAC) or various types of open-loop decisions (e.g., AMR-WB+, EVRC-WB), but these solutions are either complex or prone to errors.