With the rapid development of computer and communication technologies, communication has increasingly shifted from one-way transmission to multi-directional, interactive exchange. Networks that support such interaction are widely used and have attracted considerable attention in digital communication applications, in which analog signals are converted into digital signals. Digital audio coding and speech synthesis in particular have become increasingly important in recent years.
Audio mixing is essential to network meetings. Since digital audio coding is standard for voice over Internet protocol (VOIP), both small-scale and large-scale enterprises make extensive use of VOIP to digitally code audio for network meetings. Unfortunately, waveform coding requires a direct coding procedure to be executed in order to complete the audio mixing, which remains a disadvantage for audio transmission over the network.
FIG. 1 shows a network meeting system using half-duplex voice transmission in the prior art. The network meeting system has a computer server 100, serving as a multi-point control unit (MCU), which acts as the control center for meeting procedures. During the network meeting, every speaker 102a–102d talks one-way over a network connection through a microphone, and each speaker must wait for another speaker to finish speaking. That is, the speech of each speaker is transmitted to the computer server 100 only by half-duplex voice transmission through communication equipment 104a–104d, such as a client server, a microphone or network devices.
The computer server 100 then controls the network meeting. An interrupt or polling procedure is used to process the audio from all speakers. The audio of each speaker must be completely decoded in the computer server 100 before the audios can be mixed. Finally, the decoded audios are entirely encoded again. Therefore, to restore the original format of the audio, the computer server 100 must perform extensive and highly complex computation before transmitting the audio to the client computers.
However, since the audio is conveyed in half-duplex, a speaker 102a can talk only in one period, and a participant 102b can answer the speaker only in the next period. As a result, a voice transmission delay always occurs, which reduces the efficiency of the network meeting, and the communication is not live.
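The turn-taking delay described above can be illustrated with a minimal sketch. It assumes a single shared channel that only one speaker may hold at a time; the function name, the tuple layout and the millisecond timings are hypothetical and are not part of the prior-art system.

```python
from typing import List, Tuple

def half_duplex_schedule(
    utterances: List[Tuple[str, int, int]]
) -> List[Tuple[str, int, int]]:
    """Serialize utterances on one half-duplex channel.

    Each input item is (speaker, ready_ms, duration_ms), listed in the
    order in which the server grants the channel (a hypothetical policy).
    Each output item is (speaker, start_ms, wait_ms).
    """
    schedule = []
    channel_free_at = 0  # time at which the previous speaker finishes
    for speaker, ready_ms, duration_ms in utterances:
        # A speaker cannot start before the channel is released.
        start_ms = max(ready_ms, channel_free_at)
        schedule.append((speaker, start_ms, start_ms - ready_ms))
        channel_free_at = start_ms + duration_ms
    return schedule

# Hypothetical meeting: 102b and 102c want to answer while 102a is talking.
result = half_duplex_schedule([
    ("102a", 0, 3000),
    ("102b", 1000, 2000),
    ("102c", 1500, 1000),
])
```

In this example, participant 102b is ready to answer at 1000 ms but cannot start until 3000 ms, and participant 102c waits even longer, which is the delay that keeps the communication from being live.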
FIG. 2 shows a block diagram of a network meeting system using full-duplex voice transmission in the prior art. The network meeting system has a total decoder 200, a mixer 202 and an audio compression device 204. Each received audio is completely decoded by the total decoder 200. The resulting plurality of decoded audios is then synthesized into a mixed audio by the mixer 202, which executes a superposition. Finally, the mixed audio is entirely encoded into a mixed audio stream and conveyed to all participants.
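The decode, superpose and re-encode sequence of FIG. 2 can be sketched as follows. This is only an illustration under stated assumptions: 16-bit little-endian PCM packing stands in for the actual audio codec, clipping to the 16-bit range is one possible way to handle overflow, and all function names are hypothetical.

```python
import struct
from typing import List

INT16_MIN, INT16_MAX = -32768, 32767

def decode(stream: bytes) -> List[int]:
    """Stand-in for the total decoder: unpack 16-bit little-endian samples."""
    count = len(stream) // 2
    return list(struct.unpack(f"<{count}h", stream[:count * 2]))

def encode(samples: List[int]) -> bytes:
    """Stand-in for the audio compression device: pack samples back to bytes."""
    return struct.pack(f"<{len(samples)}h", *samples)

def mix(decoded: List[List[int]]) -> List[int]:
    """Superpose the decoded audios sample by sample, clipping to 16 bits."""
    length = max((len(samples) for samples in decoded), default=0)
    mixed = []
    for i in range(length):
        total = sum(samples[i] for samples in decoded if i < len(samples))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed

def mix_streams(streams: List[bytes]) -> bytes:
    """Fully decode every stream, mix, then encode the mixed audio once."""
    return encode(mix([decode(stream) for stream in streams]))

# Two hypothetical participants; the last sample overflows and is clipped.
stream_a = encode([1000, -2000, 30000])
stream_b = encode([500, 500, 10000])
mixed_samples = decode(mix_streams([stream_a, stream_b]))
```

Every incoming stream passes through `decode` before `mix` can run, so the server cannot avoid a full decode per participant in this arrangement.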
For the network meeting system with full-duplex voice transmission, each received audio must be decoded into individual audio data before audio mixing can be performed. Because a total decoder is used, the decoding and encoding time increases with the number of participants. The resulting computational complexity and transmission delay make the network meeting inefficient. The total decoder also increases the overall cost of the network meeting system.
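How the processing time grows with the number of participants can be shown with a rough cost model. It assumes one full decode per participant, a mixing cost proportional to the number of streams and a single encode of the mixed audio; the per-operation costs in microseconds are hypothetical placeholders, not measured values.

```python
def per_frame_cost_us(
    participants: int,
    decode_us: int = 1000,         # hypothetical cost of one full decode
    mix_us_per_stream: int = 100,  # hypothetical cost of adding one stream
    encode_us: int = 2000,         # hypothetical cost of one full encode
) -> int:
    """Estimate server time per audio frame for decode, mix and encode."""
    decode_total = participants * decode_us
    mix_total = participants * mix_us_per_stream
    return decode_total + mix_total + encode_us

cost_4 = per_frame_cost_us(4)
cost_8 = per_frame_cost_us(8)
```

Under these assumed costs, the decode term dominates and grows linearly with the number of participants, which is consistent with the inefficiency noted above.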