The present invention relates to a television conference system that realizes a television conference by use of a plurality of terminal devices placed at multiple points, and in particular, to a video switching control technique employed when the video images displayed by the terminal devices are switched by identifying a speaker (the conference participant who is currently speaking).
In a technique used in a conventional television conference system, the speaker who is currently speaking is identified based on sounds picked up by microphones of the terminal devices respectively placed at multiple points, and the video images are switched to those of the speaker who is currently speaking. An example of the conventional television conference system is disclosed in Japanese Patent Provisional Publication No. HEI 05-111020.
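As an illustration only, sound-based speaker identification of this kind might be sketched as follows. The text above does not state how the speaker is derived from the microphone signals, so the loudest-level rule, the `threshold` value, the terminal names, and the function names `rms` and `identify_speaker` are all assumptions made for this sketch, not the method of the cited publication.

```python
import math
from typing import Dict, Optional, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def identify_speaker(frames: Dict[str, Sequence[float]],
                     threshold: float = 0.05) -> Optional[str]:
    """Return the terminal whose microphone frame is loudest.

    Assumption: the speaker is taken to be at the terminal with the highest
    RMS level, and levels below `threshold` are treated as silence.
    """
    levels = {terminal: rms(samples) for terminal, samples in frames.items()}
    if not levels:
        return None
    loudest = max(levels, key=levels.get)
    return loudest if levels[loudest] >= threshold else None


# Hypothetical frames picked up at three terminals during the same interval.
frames = {
    "terminal_1": [0.01, -0.02, 0.01],
    "terminal_2": [0.30, -0.40, 0.35],
    "terminal_3": [0.00, 0.00, 0.00],
}
current_speaker = identify_speaker(frames)
```

In this sketch the video displayed by every terminal would then be switched to that of `current_speaker`; a silent interval yields `None`, in which case no switching is requested.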
In general, as the number of participants in a conference increases, it becomes more difficult for each participant to identify the current speaker from the sound alone. Therefore, a video switching technique that switches the video images in response to the speech (remark, comment, response, etc.) of each speaker, and thereby enables the participants to easily grasp who is speaking, is extremely useful.
The above television conference system is provided with a time setting module for setting the timing at which the screen (video) is switched. When the screen is to be switched, the pre-switching state is held for the time period set by the time setting module.
Switching the images too frequently, however, is undesirable. The television conference system of the publication indicated above is configured to avoid excessively frequent video switching by maintaining the pre-switching state (the state before the switching) for a preset time period whenever the displayed image is to be switched. That is, the images are switched only when a new speaker has been identified and the preset time period has elapsed.
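The hold behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited publication: the class name `HoldTimeSwitcher`, its fields, and the rule that the hold timer restarts when a different new speaker is identified during the hold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HoldTimeSwitcher:
    hold_seconds: float              # preset period (set by the time setting module)
    displayed: Optional[str] = None  # terminal whose video is currently shown
    pending: Optional[str] = None    # newly identified speaker awaiting display
    pending_since: float = 0.0       # time at which `pending` was first identified

    def update(self, now: float, speaker: Optional[str]) -> Optional[str]:
        """Feed the currently identified speaker; return the terminal to display."""
        if speaker is None or speaker == self.displayed:
            # No new speaker: cancel any pending switch (assumption).
            self.pending = None
            return self.displayed
        if speaker != self.pending:
            # A new speaker was identified: start holding the pre-switching state.
            # Assumption: the timer restarts if the candidate changes mid-hold.
            self.pending = speaker
            self.pending_since = now
        if now - self.pending_since >= self.hold_seconds:
            # The preset period has elapsed: perform the switch.
            self.displayed = speaker
            self.pending = None
        return self.displayed


# Speaker "B" starts talking at t = 1.0 s while "A" is displayed.
switcher = HoldTimeSwitcher(hold_seconds=2.0, displayed="A")
timeline = [(0.0, "A"), (1.0, "B"), (2.0, "B"), (3.0, "B"), (4.0, "B")]
shown = [switcher.update(t, s) for t, s in timeline]
# shown == ["A", "A", "A", "B", "B"]
```

With a 2-second hold, the video of "B" first appears at t = 3.0 s even though "B" began speaking at t = 1.0 s, so a brief interjection shorter than the hold period never causes a switch.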
The television conference system disclosed in the above-indicated publication can therefore prevent excessive switching. However, because the image before switching is kept for the preset time period at every switching, the participants cannot view the image of the current speaker at the beginning of that speaker's speech.