A conventional videoconferencing system exchanges video and audio based on standard protocols such as H.323, H.320, and the Session Initiation Protocol (SIP), and typically uses one channel of video/audio input and one channel of video/audio output. With the development of the technology, the telepresence conference system has been put forward, which generally implements transmission of multiple video streams by binding multiple videoconferencing terminals. FIG. 1 is a schematic structural diagram of a typical telepresence site in a telepresence conference system in the prior art. As shown in FIG. 1, a camera group includes, from left to right, a camera 101, a camera 102, and a camera 103; generally, the cameras correspond one-to-one to the terminals, and each terminal is connected to a display for a corresponding user area. For example, the camera 101, a terminal 104, a display 107, and a user area 110 may form a first corresponding group; the camera 102, a terminal 105, a display 108, and a user area 111 may form a second corresponding group; and the camera 103, a terminal 106, a display 109, and a user area 112 may form a third corresponding group. That is, the devices in each group send and receive, encode and decode, and display the images corresponding to that group.
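For illustration only, the one-to-one grouping described with reference to FIG. 1 may be sketched as a simple lookup structure. The names SiteGroup, SITE_GROUPS, and group_for_camera are hypothetical and are not part of the prior-art system; only the reference numerals are taken from FIG. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteGroup:
    """One corresponding group at a telepresence site (hypothetical model)."""
    camera: int
    terminal: int
    display: int
    user_area: int


# Reference numerals from FIG. 1, listed left to right.
SITE_GROUPS = (
    SiteGroup(camera=101, terminal=104, display=107, user_area=110),
    SiteGroup(camera=102, terminal=105, display=108, user_area=111),
    SiteGroup(camera=103, terminal=106, display=109, user_area=112),
)


def group_for_camera(camera: int) -> SiteGroup:
    """Return the group whose camera has the given reference numeral."""
    for group in SITE_GROUPS:
        if group.camera == camera:
            return group
    raise KeyError(f"no group contains camera {camera}")
```

In this sketch, each video stream stays within its own group: an image from the camera 102 is handled by the terminal 105 and is associated with the display 108 and the user area 111, and no group refers to the devices of another group.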
During the implementation of the present invention, the inventor finds that the prior art has at least the following disadvantages. In the existing multi-display telepresence conference system, transmission of multiple video streams by binding multiple videoconferencing terminals is limited to a point-to-point scenario; as multipoint telepresence conference systems become popular, the problem of controlling a telepresence conference among multiple points arises. The conference control manner of the existing multipoint telepresence conference system is simple. In particular, a telepresence site can only control itself and cannot control other sites in the telepresence conference system, and users at the site cannot arbitrarily choose to view other sites through the terminal. In addition, in a hybrid system of the conventional videoconferencing system and the telepresence conference system, cross-conference control cannot be implemented between the two conference systems.