A modern conference session, also referred to simply as a conference, can be established by a mixing unit in the form of a conference bridge or Media Streamer. The Media Streamer executes an application for controlling a conference, which can be defined as a program, in particular a computer program or application software, that allows an administrator to control the conference. When the application for controlling a conference is running on a computer, it is able to provide a mixture of the speech signals from the participants, also called users, of the conference. The application for controlling the conference can be installed on and/or run on a personal computer (PC). Such a PC is also referred to as the Media Streamer, a media server or an application server. In the following, the term “Media Streamer” denotes not only a computer on which the application for controlling the conference is installed, for example the media server or application server, but also the application for controlling the conference itself. To that extent, the term “Media Streamer”, which is also called “conference server”, is used equally for an implementation of the application for controlling the conference in software and for an implementation of this application in hardware. The Media Streamer is set up as a server to receive the respective audio/video signals from each of the communication terminals of the conference participants and to transmit the mixed audio/video signals to the communication terminals of the conference participants. A distinction is made here: for each active participant, the conference unit individually mixes all images/voices except that participant's own, whereas in streaming mode all passive participants receive the same images/voice.
The streaming mode is therefore advantageous in large conferences because the processing load on the conference unit is significantly lower than in a case where, for each participant, all images/voices except the participant's own are mixed individually by the conference unit. A telephone unit, an IP phone (IP: Internet Protocol) or a PC client may act as a communication terminal of a participant; another communication terminal, such as a mobile telephone or another server, is also possible.
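The difference between the individual mix for active participants and the shared mix for passive participants can be illustrated with a minimal sketch. It assumes that an audio frame is a plain list of integer samples and that mixing is a simple sample-wise sum without clipping or gain control; the names `Frame`, `mix`, `mix_for_active` and `mix_for_passive` are hypothetical and are not taken from any particular conference server.

```python
from typing import Dict, List

Frame = List[int]  # one audio frame as PCM samples (hypothetical representation)


def mix(frames: List[Frame]) -> Frame:
    """Sum frames sample by sample; an empty input yields an empty frame."""
    if not frames:
        return []
    return [sum(samples) for samples in zip(*frames)]


def mix_for_active(inputs: Dict[str, Frame]) -> Dict[str, Frame]:
    """Individual mix per active participant: all voices except the participant's own."""
    return {
        me: mix([frame for other, frame in inputs.items() if other != me])
        for me in inputs
    }


def mix_for_passive(inputs: Dict[str, Frame]) -> Frame:
    """Streaming mode: one shared mix of all active voices for every passive participant."""
    return mix(list(inputs.values()))


# Illustrative values only.
active = {"alice": [1, 2], "bob": [10, 20], "carol": [100, 200]}
per_active = mix_for_active(active)  # one separate mix per active participant
shared = mix_for_passive(active)     # a single mix, reused for any number of listeners
```

In this sketch the number of mixes for active participants grows with the number of active participants, while the passive side requires only one mix regardless of the size of the audience, which is the reason the streaming mode scales to large conferences.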
A conference session is understood in particular to be a conference in which at least two participants are not located at the same place, so that they cannot communicate with each other without the use of technical means. Instead, the participants communicate via the mixing unit, which mixes their voice signals; the conference can be configured, for example, as a teleconference or a videoconference. In a teleconference, the participants communicate only by exchanging speech, regardless of how their voice signals are transferred. Therefore, both a conference conducted over a landline and a conference in which one or more participants communicate over a cellular network are called a teleconference.
In addition, a conference in the form of a videoconference is possible, in which image signals of the participants are transmitted in real time to the other participants in addition to the exchange of voice signals. In the following, however, a conference is also meant to comprise application sharing, in which other media are exchanged between the participants in addition to voice and video data, for example in the form of a transfer of data between the participants. This data can be delayed in time with respect to the real-time voice and/or image signals of the participants and can be displayed on a screen, for example the screen of a personal computer. In general, the mixing unit in the form of a Media Streamer can be connected via a network, for example an intranet or the Internet, to the communication terminals of the participants of the conference. In this case, the voice and/or video and/or data signals are transferred in the form of data packets from one participant to another participant in the conference.
In a telephone conversation, for example in a conference session, participants often activate mute to prevent others from hearing background noise from their desk, or in order to discuss another issue while participating in the conference session. The mute mode is deactivated when the user presses an unmute or mute-off button. However, there are times when a user forgets to press the mute-off button, so that mute is still on when the participant starts to talk. The other participants of the conference are then unable to hear the talking participant until that participant realizes the mistake, unmutes and repeats what was already said. While there are mechanisms that detect voice activity and automatically switch off mute, these highly sophisticated mechanisms require some time before the unmute takes effect and the voice of the formerly muted participant is transmitted to the other participants. Even under optimal circumstances, this voice activity recognition and the subsequent automatic unmute may take on the order of 2 to 3 seconds, during which useful information from the muted talking participant may be lost. It is therefore desired to better reduce the loss of information that occurs when a user of a conference session forgets to unmute.
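To give a sense of scale for the 2 to 3 seconds mentioned above, the following sketch counts how many audio packets are spoken but never transmitted before the automatic unmute takes effect. The packet duration of 20 ms is an assumption chosen for illustration and is not stated in the text; the function name `lost_packets` is likewise hypothetical.

```python
def lost_packets(unmute_latency_s: float, packet_ms: int = 20) -> int:
    """Whole audio packets spoken while mute is still on.

    packet_ms is an assumed packetization interval, not a value from the text.
    """
    return int(unmute_latency_s * 1000) // packet_ms


low = lost_packets(2.0)   # lower bound of the 2 to 3 second range
high = lost_packets(3.0)  # upper bound of the 2 to 3 second range
```

Under this assumption, between 100 and 150 consecutive packets of speech would be missing for the other participants, typically the opening words of the contribution.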
A similar problem arises in a conference session in which one or a few participants are actively participating and a large group of other participants is passive, i.e. only listening to the active participants. Such a case may arise in panel discussions or webinars. Active participants may be served via a fast voice and/or video conferencing channel. Passive participants may be served via a different, slower voice and/or video conferencing channel, so that a passive participant receives the data later than an active participant receives the same data. The delay is not critical for the passive participant as long as the participant stays passive. However, a passive participant may want to make an utterance or temporarily take part in the discussion of the active participants, which is typically introduced by a so-called “raise hand” or similar indication. After this indication, the administrator or moderator may turn the passive participant into an active participant with a notification. The participant who asked to take part in the discussion is thus connected in real time at a point in the discussion that the participant is not yet aware of, because the streaming delay has prevented the participant from catching up with the real-time discussion. This situation is comparable to that described for a muted participant who forgets to unmute before starting to talk. It is thus desired to better reduce a loss of information when a passive user who receives delayed data with respect to an active user of a conference session is turned into an active user. It is therefore the object of the invention to provide a method which better reduces a loss of information when a passive user of a conference session is turned into an active user.
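The gap that a passive participant faces at the moment of being turned into an active participant can be made concrete with a small sketch. It only models the problem described above, not a solution; the streaming delay of 8 seconds and the names `Gap` and `promotion_gap` are hypothetical values and identifiers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    start_s: float  # last position heard on the delayed stream
    end_s: float    # live position at the moment of promotion

    @property
    def missed_s(self) -> float:
        """Length of the discussion the participant has not yet heard."""
        return self.end_s - self.start_s


def promotion_gap(live_position_s: float, streaming_delay_s: float) -> Gap:
    """Part of the discussion not yet heard when switching from the delayed stream to real time."""
    heard_until = max(0.0, live_position_s - streaming_delay_s)
    return Gap(start_s=heard_until, end_s=live_position_s)


# Illustrative values only: promotion 600 s into the session with an assumed 8 s delay.
gap = promotion_gap(live_position_s=600.0, streaming_delay_s=8.0)
```

With these assumed values, the promoted participant has heard the discussion only up to 592 s while the active participants are already at 600 s, so the missing interval equals the streaming delay.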