In current solutions for realizing a multipoint conference, a speaker does not receive his/her own voice, while the other participants can hear the speaker.
In a specific implementation, a multipoint conference is realized by using a multipoint control unit (MCU) that serves as a core component. The MCU functions as a switch, but it is different from a switch configured in a common telephone network. The MCU switches image, audio, and data signals, i.e., switches data streams, rather than analog signals.
The MCU processes video signals by direct distribution, data signals by broadcasting, and audio signals according to one of the following two circumstances. When there is only one speaker, the MCU switches the audio signal of the speaker to the other participants. When there are a plurality of speakers, the MCU mixes audio signals of all the speakers, selects an audio signal with the highest level, and then switches the audio signal to all the other participants except the speaker corresponding to the highest level. Currently, the MCU supports mixing the voices of at most six speakers. When there are more than six speakers, the six speakers with the loudest voices, i.e., the highest levels, are selected, and the MCU mixes the six voices and sends the mixed voices to the participants, so that each participant receives the voices of the other participants but not his/her own.
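The selection and mixing just described can be sketched as follows. This is only an illustrative reading, not the MCU's actual implementation: the function names, the use of RMS as the "level" measure, the 16-bit PCM sample format, and the clipping step are all assumptions introduced for the example. The limit of six comes from the text.

```python
import math

MAX_MIX = 6  # the text states that at most six speakers are mixed


def level(frame):
    """Assumed level measure: RMS of one frame of 16-bit PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def loudest(frames, max_mix=MAX_MIX):
    """Return the IDs of up to max_mix participants with the highest levels."""
    return sorted(frames, key=lambda p: level(frames[p]), reverse=True)[:max_mix]


def mix_for(listener, frames, max_mix=MAX_MIX):
    """Mix the selected speakers for one listener, leaving out the listener's own voice."""
    sources = [p for p in loudest(frames, max_mix) if p != listener]
    length = max((len(frames[p]) for p in sources), default=0)
    mixed = [0] * length
    for p in sources:
        for i, sample in enumerate(frames[p]):
            mixed[i] += sample
    # Assumed post-processing: clip the sum to the 16-bit sample range.
    return [max(-32768, min(32767, s)) for s in mixed]
```

For instance, with hypothetical frames for three participants and `max_mix=2`, `mix_for("A", ...)` contains only the second-loudest speaker, because A's own voice is removed from the selected set.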
In the above implementation, after the MCU establishes communication with each participant, code stream channels are opened between the MCU and each of the participants. Currently, the basic code stream channels include audio code stream channels and video code stream channels. Code streams are transmitted bi-directionally in these channels. The internal audio processing of the MCU is divided into three parts, namely, a decoding part, an audio mixing part, and an encoding part. The code streams of all participants are first transmitted to the decoding part to be decoded, and then to the audio mixing part to be mixed. Afterwards, the mixed code streams are transmitted to the encoding part to be encoded, and the encoded code streams are sent to the corresponding participants. The decoding processing includes calculating the volume, i.e., the level, of each audio code stream while generating the code stream used for the audio mixing. The audio mixing part selects the audio code streams to be mixed according to the volumes of the code streams. Taking the network structure shown in FIG. 1 as an example, and assuming that the MCU in this example supports the mixing of at most three audio code streams and that the volumes of the audio code streams satisfy A>B>C>D>E, the corresponding relation of audio mixing is shown in Table 1.
TABLE 1
Participant of voice destination    Participant of voice source    Mixing result
A                                   BCDE                           BCD
B                                   ACDE                           ACD
C                                   ABDE                           ABD
D                                   ABCE                           ABC
E                                   ABCD                           ABC
For each voice destination participant, the audio mixing part mixes the three loudest participants among the corresponding voice source participants, according to the relations listed in Table 1, to generate an audio code stream, and then sends the audio code stream to the encoding part. The encoding part encodes it into an audio code stream for the voice destination participant and sends the encoded audio code stream to that participant. Finally, the results received by the participants are as follows: A hears the voices of BCD, B hears the voices of ACD, C hears the voices of ABD, D hears the voices of ABC, and E hears the voices of ABC.
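The relation between destinations, sources, and mixing results in this example can be reproduced with a short sketch. The function name `mixing_table` and the numeric volume values are hypothetical; only the ordering A>B>C>D>E and the limit of three mixed streams come from the example. The selection is made per destination, among that destination's sources.

```python
def mixing_table(volumes, max_mix=3):
    """For each destination, list its sources and the max_mix loudest of them."""
    ranked = sorted(volumes, key=volumes.get, reverse=True)
    table = {}
    for dest in ranked:
        # A destination never receives its own voice.
        sources = [p for p in ranked if p != dest]
        table[dest] = (sources, sources[:max_mix])
    return table


# Arbitrary values chosen only so that A > B > C > D > E holds.
volumes = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}

for dest, (sources, mix) in sorted(mixing_table(volumes).items()):
    print(dest, "".join(sources), "".join(mix))
```

Each printed row gives the destination, its sources, and the mixing result. E is never part of any mix, because three louder sources are always available.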
As seen from the above analysis, in the above technology, the audio code streams supported by the MCU are mixed first, and then the mixed audio code streams are sent to all the conference participants other than the participants who are the sources of those audio code streams. As a result, a part of the participants in the conference cannot communicate privately without being known by the other participants, because an audio code stream selected for the audio mixing is either received by all participants or received by none of them. Therefore, with the above technology, a part of the participants cannot hold a small-group communication that is not received by the other participants and that does not affect the ongoing multipoint conference.
As one improvement, when a part of the participants intend to have an internal discussion, the conference may be divided into several group meetings, which are recombined into one conference after the discussion. As shown in FIG. 2, A, B, C, D, and E participate in a conference. During the conference, when a certain issue requires group discussion, the participants A, B, and E are classified into one group meeting, and the participants C and D are classified into the other group meeting. The MCU mixes the audio code streams of each group separately, so that the audio code streams of members in one group are not mixed into the audio code streams of members in the other group. The two group meetings do not affect each other, and voices of one group meeting are not received by the participants of the other group meeting. After the group discussion is finished, the two groups are recombined into one conference.
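The per-group mixing in this example can be sketched as below. The function name `group_mix_plan`, the volume values, and the limit of three mixed streams per group are assumptions made for illustration; the grouping of A, B, E and C, D is taken from the example.

```python
def group_mix_plan(groups, volumes, max_mix=3):
    """Map each participant to the sources mixed for them, restricted to their own group."""
    plan = {}
    for members in groups:
        ranked = sorted(members, key=volumes.get, reverse=True)
        for dest in members:
            # Only members of the same group are candidates; the destination is excluded.
            plan[dest] = [p for p in ranked if p != dest][:max_mix]
    return plan


# Arbitrary values chosen only so that A > B > C > D > E holds.
volumes = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
groups = [["A", "B", "E"], ["C", "D"]]

plan = group_mix_plan(groups, volumes)
```

In the resulting plan no participant receives any source from the other group, which is the isolation described above. It also means that neither group can follow what the other group is discussing.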
Although the above improvement enables a part of the participants to have an internal discussion, it still fails to achieve a private session without affecting the original multipoint conference, because the original multipoint conference is interrupted by the internal discussion. Furthermore, in the above solution, the different groups cannot know the content discussed by each other. In addition, because this approach can only classify all the participants as a whole into different groups, the group classification cannot be performed unless all the participants agree to organize the group discussion, so a private discussion among only a part of the participants still cannot be achieved.
In conclusion, the conventional art cannot realize a private session for a part of participants in a multipoint conference, during which the participants can continue to hear the content of the multipoint conference, and the content of the private session is not sent to those who do not participate in the private session.