1. Field of the Invention
The present invention relates to a video implementation method for a three-party video conference.
2. Description of the Related Art
Most three-party conference functions supported by products on the current market are audio conferences. Because the algorithms of a three-party video conference place high demands on hardware, video conference facilities generally have a total cost much higher than that of audio conference facilities.
In traditional three-party conference facilities, it is known that a host party receives real-time transport protocol (RTP) data transmitted from the other two parties, mixes the data, and transmits the mixed data to the two parties to realize the three-party conference. The audio and video realization methods differ from each other and are described as follows.
Realization Method for Traditional Audio Conference
As shown in FIG. 1, no direct data transmission is formed between the two parties ‘A’ and ‘C’. The host party ‘B’ serves as an intermediate station: the data of the host party ‘B’ is mixed with the data of one participant party and transmitted to the other participant party, so the host party ‘B’ plays a central role in processing the data. Furthermore, because the processing of audio data is relatively simple, two threads are basically enough for the data mixing process. The three-party audio conference of the host party ‘B’ includes the steps of:
in step (1), constructing sockets of RTP receive-in ports for the two parties ‘A’ and ‘C’ and performing monitoring and waiting by the host party ‘B’;
in step (2), using the host party ‘B’ to sample audio data thereof;
in step (3), receiving RTP audio data transmitted from the party ‘A’ by the host party ‘B’, decoding the received RTP audio data and outputting the decoded audio data through a speaker, mixing the decoded audio data with the audio data of the host party ‘B’, and transmitting an RTP data packet packed from the mixed data to the RTP receive-in ports of the party ‘C’ while negotiating network parameters;
in step (4), which is performed simultaneously with the step (3), receiving RTP audio data transmitted from the party ‘C’ by the host party ‘B’, decoding the received RTP audio data and outputting the decoded audio data through the speaker, mixing the decoded audio data with the audio data of the host party ‘B’, and transmitting an RTP data packet packed from the mixed data to the RTP receive-in ports of the party ‘A’ while negotiating network parameters; and
in step (5), forming a three-party audio conference, in which the party ‘A’ hears the mixed voice of the two parties ‘B/C’, the party ‘B’ hears the mixed voice of the two parties ‘A/C’, and the party ‘C’ hears the mixed voice of the two parties ‘A/B’.
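The mixing performed by the host party ‘B’ in steps (3) to (5) can be illustrated with a minimal sketch. It assumes that the decoded audio is 16-bit signed PCM, that frames are of equal length, and that mixing is done by saturating addition; none of these details is stated above, and the names `mix_pcm16` and `host_mix_step` are hypothetical. RTP packing, sockets, and codecs are omitted.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(a, b):
    """Mix two equal-length frames of 16-bit PCM samples.

    Assumption: mixing is plain addition, clamped to the int16 range
    so that loud passages saturate instead of wrapping around.
    """
    if len(a) != len(b):
        raise ValueError("frames must have the same number of samples")
    return array(
        "h",
        (max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)),
    )


def host_mix_step(local_b, decoded_a, decoded_c):
    """Return, for one frame, what host B plays and what it sends out.

    local_b   -- samples captured at the host party B (step 2)
    decoded_a -- samples decoded from party A's RTP audio (step 3)
    decoded_c -- samples decoded from party C's RTP audio (step 4)
    """
    return {
        # B hears A and C; modelled here as one mixed frame for simplicity.
        "play_on_b": mix_pcm16(decoded_a, decoded_c),
        # A receives C mixed with B (step 4).
        "send_to_a": mix_pcm16(decoded_c, local_b),
        # C receives A mixed with B (step 3).
        "send_to_c": mix_pcm16(decoded_a, local_b),
    }
```

In this sketch each outgoing stream leaves out the recipient's own audio, which matches step (5): each party hears only the other two.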
Realization Method for Traditional Video Conference
Regardless of hardware cost, a video conference is feasible using the method of the above-described audio conference, so that each party can observe both the other participant party and the host party. However, because a three-party video conference places high demands on hardware, the majority of video telephone sets do not provide this effect. Instead, in the specification of common telephone sets, each of the three parties can hear the voices of the other two parties, each of the conference-participant parties can observe the video of the host party, and the host party can observe the video of one of the conference-participant parties.
In the three-party video conference, the data mixing process is the most resource-consuming and complicated part. As shown in FIG. 2, the host party ‘B’ eliminates the video data mixing process and processes the data of the two parties ‘A/C’ in different ways. In a video call, the data receiving process consumes relatively less CPU resource than the data decoding process, so reducing decoding is the most effective way to save resources. The decoding of the data of the party ‘C’, which is not currently activated, is therefore eliminated, and only the critical frame is preserved so that decoding proceeds normally when that video is viewed later. The CPU resource consumption is then the sum of a three-party audio conference, a single-channel pure video call, and a single-channel RTP data receiving process; compared with a single-channel audio/video call, basically the audio mixing resource consumption is added.
The three-party video conference of the host party ‘B’ includes the following steps (for the audio processing steps, refer to the above-described implementation method of the audio conference):
in step (1), constructing sockets of RTP receive-in ports for the two parties ‘A’ and ‘C’ and performing monitoring and waiting by the host party ‘B’;
in step (2), using the host party ‘B’ to sample a video picture thereof and pack it into an RTP data packet to be transmitted to the RTP receive-in ports of each of the parties ‘A’ and ‘C’ while negotiating network parameters;
in step (3), receiving RTP data transmitted from the party ‘A’ by the host party ‘B’ and, if the host party ‘B’ has configured the video of the party ‘A’ as the currently displayed main video, decoding the RTP data and displaying it on the FrameBuffer of the monitor to be observed by the client;
in step (4), which is performed simultaneously with the step (3), receiving RTP data transmitted from the party ‘C’ by the host party ‘B’ and merely preserving the I frame data of the critical frame without decoding the received RTP data, in which the I frame data is utilized to perform decoding compensation when the video of the party ‘C’ is switched to the main video, so as to prevent a mosaic phenomenon from appearing when the video of the party ‘C’ is initially displayed; and
in step (5), if the host party ‘B’ configures the video of the party ‘C’ as the main video, requesting the party ‘C’ to resend an I frame by utilizing a communication control protocol (the SIP protocol or a self-defined protocol), updating the video picture, and switching the processing of the data of the parties ‘A’ and ‘C’.
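The handling of the main and non-main video in steps (3) to (5) can be illustrated with a minimal sketch. The class `HostVideo`, the `Frame` record, and the `decode` and `request_key_frame` callbacks are hypothetical names introduced only for illustration; the real decoder, FrameBuffer output, and SIP or self-defined signaling are assumed to sit behind those callbacks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Frame:
    party: str      # sender of the frame, e.g. "A" or "C"
    is_key: bool    # True for an I frame (critical frame)
    payload: bytes  # encoded video data taken from the RTP packets


@dataclass
class HostVideo:
    main: str                                 # party shown as main video
    decode: Callable[[Frame], None]           # hypothetical decode-and-display hook
    request_key_frame: Callable[[str], None]  # hypothetical signaling hook
    last_key: Dict[str, Frame] = field(default_factory=dict)

    def on_frame(self, frame: Frame) -> None:
        if frame.party == self.main:
            # Step (3): only the main video is decoded and displayed.
            self.decode(frame)
        elif frame.is_key:
            # Step (4): keep the latest I frame of the other party, undecoded.
            self.last_key[frame.party] = frame
        # Non-key frames of the non-main party are received but dropped.

    def switch_main(self, party: str) -> None:
        # Step (5): make `party` the main video.
        self.main = party
        cached = self.last_key.pop(party, None)
        if cached is not None:
            # Decoding compensation from the preserved I frame.
            self.decode(cached)
        # Ask the new main party to resend an I frame.
        self.request_key_frame(party)
```

A short usage example, with lists standing in for the decoder and the signaling channel:

```python
decoded, requested = [], []
host = HostVideo("A", lambda f: decoded.append(f.party), requested.append)
host.on_frame(Frame("C", True, b""))  # cached, not decoded
host.switch_main("C")                 # decodes the cached I frame, requests a new one
```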
Although the above-described measure can solve the problem of resource shortage, several problems still exist as follows:
(A) The most important principle cannot be achieved: although the three parties can hear each other, one participant party cannot observe the video information of the other participant party.
(B) The host party generally can observe the video of only one of the participant parties, and must perform an inconvenient switching process to observe the video of the other participant party.