Communication devices, such as smartphones, tablets, laptops, desktops, wearables, TVs, and set-top boxes, employ one or more video and/or audio calling applications that allow users of the communication devices to carry out a video/audio call in a one-to-one, one-to-many, or many-to-many manner. Communication networks used with such applications include wireless wide area networks (e.g., cellular networks), IP networks (e.g., the Internet), wireless local area networks (e.g., Wi-Fi networks), and wired networks. The calling user experience over IP networks and other networks is challenged by varying network conditions. The video and/or audio applications need to be able to adapt to changing network conditions on the fly.
Audio network adaptation (ANA) techniques attempt to update one audio codec control parameter, for example, a codec bit rate, in real time according to network condition changes. For example, the video and/or audio calling application receives network condition data, such as bandwidth estimation (BWE), packet loss rate information, and round trip time information, either by testing the network or by receiving the network condition information from the network, and outputs changes in the codec control parameter, for example, to vary the bit rate during a voice call. In a typical video and/or audio calling application, other audio encoding modes, e.g., the discontinuous transmission (DTX) mode, stay the same during a call. Techniques for controlling multiple modes of the encoder or audio codec to adapt optimally to real-time network conditions are desired. If encoder control is incorrect, jerky motion in video calls, improper synchronization of audio with video, and loss of conversation information can occur.
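By way of illustration only, the single-parameter adaptation described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular calling application: the names NetworkConditions, EncoderConfig, and adapt_bit_rate, the 80 percent bandwidth headroom, the 5 percent loss threshold, and the bit rate limits are all hypothetical values chosen for the example. The sketch shows the limitation noted above: only the bit rate changes, while the DTX mode is carried through unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkConditions:
    """Network condition data received by the calling application."""
    bandwidth_estimate_bps: int   # bandwidth estimation (BWE)
    packet_loss_rate: float       # fraction of packets lost, 0.0 to 1.0
    round_trip_time_ms: float     # carried for completeness; unused by this simple rule


@dataclass(frozen=True)
class EncoderConfig:
    """Audio codec control parameters."""
    bit_rate_bps: int
    dtx_enabled: bool


# Hypothetical limits and thresholds, assumed for this example only.
MIN_BIT_RATE_BPS = 6_000
MAX_BIT_RATE_BPS = 32_000
HEADROOM_PERCENT = 80           # use at most 80% of the estimated bandwidth
LOSS_THRESHOLD = 0.05           # back off further above 5% packet loss
LOSS_BACKOFF_PERCENT = 75


def adapt_bit_rate(config: EncoderConfig, net: NetworkConditions) -> EncoderConfig:
    """Return a new config whose bit rate follows the network conditions.

    Only the bit rate is adapted; the DTX mode is copied unchanged,
    mirroring the single-parameter behavior described in the text.
    """
    # Integer arithmetic keeps the result exact and reproducible.
    target = net.bandwidth_estimate_bps * HEADROOM_PERCENT // 100
    if net.packet_loss_rate > LOSS_THRESHOLD:
        target = target * LOSS_BACKOFF_PERCENT // 100
    # Clamp to the range the (hypothetical) encoder supports.
    target = max(MIN_BIT_RATE_BPS, min(MAX_BIT_RATE_BPS, target))
    return EncoderConfig(bit_rate_bps=target, dtx_enabled=config.dtx_enabled)


if __name__ == "__main__":
    current = EncoderConfig(bit_rate_bps=24_000, dtx_enabled=False)
    poor = NetworkConditions(bandwidth_estimate_bps=20_000,
                             packet_loss_rate=0.10,
                             round_trip_time_ms=300.0)
    print(adapt_bit_rate(current, poor))
```

In this sketch, a controller that also adapted the DTX mode or other encoding modes would return a config in which more than one field changes, which is the capability the paragraph above identifies as desired.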