A video conferencing service is a multimedia communication service that uses video conferencing terminals and communication networks to hold a conference, and can exchange images, speech, and data between two sites or among multiple sites simultaneously. A terminal at a site compresses and encodes the image signals captured by a local camera and the voice signals of participants collected by a microphone in a participant area, and transmits the encoded signals to a remote site via a transmission network. At the same time, the terminal receives digital signals transmitted from the remote site via the transmission network, and decodes them to obtain the images and voice signals of participants at the remote site. With the development of video conferencing, the site has evolved from one that includes a single camera, a single monitor, and a single participant area to one that includes multiple cameras, multiple monitors, and multiple participant areas. The multiple cameras, multiple monitors, and multiple participant areas at the same site are associated with one another through a physical or logical relationship.
In a video conference, participants sometimes need to discuss the conference content, so a conference content image needs to be displayed on the monitor of a site. As shown in FIG. 1, site A sends a conference content image to the terminals of remote sites B and C through auxiliary stream channels, and the terminals of site B and site C display the conference content image on their local monitors upon reception. The prior art provides a token-based auxiliary stream transmission mode. Specifically, one conference has only one auxiliary stream token. The site that obtains the token sends an auxiliary stream, and participants at all the sites watch the auxiliary stream image of the site holding the token.
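The token-based mode described above can be illustrated with a minimal sketch. The class name `Conference`, its methods, and the stream labels are hypothetical and exist only for this illustration. The sketch also assumes that a request made while the token is held is refused; the prior art as described here does not say whether such a request is refused or preempts the current holder.

```python
from typing import Dict, Iterable, Optional


class Conference:
    """Hypothetical model of a conference with exactly one auxiliary stream token."""

    def __init__(self, sites: Iterable[str]) -> None:
        self.sites = set(sites)
        self.token_holder: Optional[str] = None  # site currently holding the token
        self.stream: Optional[str] = None        # the single stream bound to the token

    def request_token(self, site: str, stream: str) -> bool:
        """Grant the token only if it is free, and bind it to one stream."""
        if site not in self.sites:
            raise ValueError(f"unknown site: {site}")
        if self.token_holder is not None:
            # Assumption: a request is refused while the token is held.
            return False
        self.token_holder = site
        self.stream = stream
        return True

    def release_token(self, site: str) -> None:
        """Free the token if the given site is the current holder."""
        if self.token_holder == site:
            self.token_holder = None
            self.stream = None

    def displayed(self) -> Dict[str, Optional[str]]:
        """Every site sees the token holder's auxiliary stream, or nothing."""
        return {site: self.stream for site in sorted(self.sites)}


conference = Conference(["A", "B", "C"])
first = conference.request_token("A", "conference content image")
second = conference.request_token("A", "object projection image")
print(first, second, conference.displayed())
```

In this sketch the second request fails even though it comes from the same site, because the single token is already bound to one auxiliary stream.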
The prior art has the following disadvantage:
In some video conferences, users may need to view the projection of an object and the conference content image simultaneously. In this case, two auxiliary streams are required: one for transmitting the object projection image, and the other for transmitting the conference content image. However, in the prior art, one conference has only one auxiliary stream token, and the auxiliary stream token can be bound to only one auxiliary stream, so the participants at the sites cannot view the object projection image and the conference content image simultaneously.