With the popularization of voice communication over the Internet, voice communication is becoming an indispensable part of users' daily lives. For example, conversations in an online chat room or during a game, and live voice broadcasting over a network, all rely on network voice communication technology.
To achieve network voice communication, the following process is performed on the side of the voice acquisition device.
1. A voice signal is acquired. This step captures the voice of a user. The voice signal may be acquired via a device such as a microphone.
2. Digital signal processing (DSP) is performed on the voice signal to obtain an encoded voice packet. This step processes the acquired voice signal and may include echo cancellation, noise suppression and so on.
In a case where multiple channels of voice signals are acquired, voice mixing may be performed before the encoded voice packet is obtained. Other sound-effect processing may also be performed on the voice before the encoded voice packet is obtained.
3. The obtained encoded voice packet is transmitted to the receiving end.
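The three steps above can be sketched in code. The following is a minimal illustration only, not the processing described in any particular system: the function names (`acquire_voice`, `suppress_noise`, `mix_channels`, `encode_packet`, `send_voice`) and the 8000 Hz sample rate are hypothetical, the microphone is replaced by a synthesized tone, noise suppression is reduced to a simple amplitude gate, and "encoding" is plain 16-bit PCM packing rather than a real voice codec.

```python
import math
import struct
from typing import Callable, List, Sequence

SAMPLE_RATE = 8000  # Hz; assumed value for illustration only


def acquire_voice(freq_hz: float, n_samples: int, amplitude: float = 0.5) -> List[float]:
    """Step 1 stand-in: synthesize a sine tone in [-1, 1] instead of reading a microphone."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


def suppress_noise(samples: Sequence[float], threshold: float = 0.01) -> List[float]:
    """Step 2 (part): toy noise gate that zeroes samples quieter than the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def mix_channels(channels: Sequence[Sequence[float]]) -> List[float]:
    """Step 2 (optional): mix channels by summing samples and clamping to [-1, 1]."""
    n = min(len(c) for c in channels)
    return [max(-1.0, min(1.0, sum(c[i] for c in channels))) for i in range(n)]


def encode_packet(samples: Sequence[float]) -> bytes:
    """Step 2 (end): pack samples as little-endian 16-bit PCM (placeholder for a codec)."""
    ints = [int(round(s * 32767)) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def send_voice(channels: Sequence[Sequence[float]],
               transmit: Callable[[bytes], None]) -> bytes:
    """Run the DSP stage on every channel, mix if needed, encode, then transmit (step 3)."""
    processed = [suppress_noise(c) for c in channels]
    mixed = mix_channels(processed) if len(processed) > 1 else processed[0]
    packet = encode_packet(mixed)
    transmit(packet)  # hypothetical transport hook, e.g. a socket send
    return packet


if __name__ == "__main__":
    outbox: List[bytes] = []  # stands in for the network
    channels = [acquire_voice(440.0, 160), acquire_voice(220.0, 160)]
    send_voice(channels, outbox.append)
    print(len(outbox), len(outbox[0]))
```

Note that every stage in this sketch runs with fixed parameters regardless of how the voice will be used, which is the uniform style of processing discussed next.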
At present, voice streams are processed with a uniform processing method regardless of the application scenario. Hence, in a scenario with a high requirement on voice quality, the requirement cannot be met; and in a scenario with a low requirement on voice quality, resources are wasted because a large amount of system resources is occupied unnecessarily. As a result, the current solution, in which voice streams are processed with a uniform processing method, cannot adapt to the differing voice requirements of multiple scenarios.