In current voice conferencing systems, a speaker selection algorithm in a conferencing bridge detects active speakers and creates an output stream by mixing the audio of those active speakers or participants. The mixed stream is then communicated to the participants on the conference call. However, the selection is limited to the three or four most active speakers, ranked by the energy levels of the voice communications received from the telephony endpoints where those speakers are located. While speech from the three or four active speakers is being received, all other speakers are excluded by the speaker selection algorithm.
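For illustration only, energy-based selection and mixing of this kind might be sketched as follows. This is a minimal sketch, not the algorithm of any particular bridge: the names `frame_energy`, `select_active`, and `mix`, the limit of three speakers, the silence threshold, and the use of mean squared amplitude as the energy measure are all assumptions made for the example.

```python
from typing import Dict, List

MAX_ACTIVE = 3            # assumed limit; the text says three or four
SILENCE_THRESHOLD = 1e-4  # assumed energy floor below which a frame counts as silence


def frame_energy(samples: List[float]) -> float:
    """Mean squared amplitude of one audio frame (an assumed energy measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def select_active(frames: Dict[str, List[float]],
                  max_active: int = MAX_ACTIVE,
                  threshold: float = SILENCE_THRESHOLD) -> List[str]:
    """Return up to max_active endpoint ids, highest frame energy first."""
    energies = {ep: frame_energy(f) for ep, f in frames.items()}
    talking = [ep for ep, e in energies.items() if e > threshold]
    talking.sort(key=lambda ep: energies[ep], reverse=True)
    return talking[:max_active]


def mix(frames: Dict[str, List[float]], active: List[str]) -> List[float]:
    """Sum the selected endpoints' frames sample by sample."""
    if not active:
        return []
    length = min(len(frames[ep]) for ep in active)
    return [sum(frames[ep][i] for ep in active) for i in range(length)]


# Hypothetical frames from five endpoints; "e" is silent.
frames = {
    "a": [0.5, 0.5],
    "b": [0.0625, 0.0625],
    "c": [0.25, 0.25],
    "d": [0.125, 0.125],
    "e": [0.0, 0.0],
}
active = select_active(frames)
output_stream = mix(frames, active)
```

In this sketch endpoint "b" is speaking but is left out of the mix because three louder endpoints already fill the available positions.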
Limiting the mix to three or four speakers, and excluding all other participants while those speakers are active, usually works well because three or four is the maximum number of simultaneous speakers that remains intelligible in a mix; more than this typically results in noise or unintelligible speech on the conference bridge. Thus, conventional speaker selection algorithms by design do not admit a new speaker until one of the existing speakers has been quiet for a while. Although this eliminates interruptions, it also denies new speakers the opportunity to speak for as long as the active speakers keep speaking. Only when an active speaker falls quiet does the speaker selection algorithm free up a slot, and the next person to speak gets the freed slot. That person is not necessarily the one who has been waiting the longest. A person who has been trying to speak for some time may thus still not be given an opportunity to do so.
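The slot behavior described above can be illustrated with a small sketch. The class `SlotSelector`, its `hangover_frames` parameter (how many consecutive silent frames count as "quiet for a while"), and the convention that the `talking` list is ordered by when speech was detected in the current frame are all hypothetical, chosen only to make the limitation concrete.

```python
from typing import Dict, List


class SlotSelector:
    """Conventional slot-based selection: a slot is held until its speaker goes quiet."""

    def __init__(self, max_slots: int = 3, hangover_frames: int = 2) -> None:
        self.max_slots = max_slots
        self.hangover_frames = hangover_frames
        # Maps each slot holder to its count of consecutive silent frames.
        self.slots: Dict[str, int] = {}

    def update(self, talking: List[str]) -> List[str]:
        """Process one frame; `talking` is assumed ordered by speech onset."""
        talking_set = set(talking)
        # Free the slot of any holder that has stayed silent long enough.
        for ep in list(self.slots):
            if ep in talking_set:
                self.slots[ep] = 0
            else:
                self.slots[ep] += 1
                if self.slots[ep] >= self.hangover_frames:
                    del self.slots[ep]
        # Hand free slots to whoever speaks next, with no memory of past attempts.
        for ep in talking:
            if len(self.slots) >= self.max_slots:
                break
            if ep not in self.slots:
                self.slots[ep] = 0
        return list(self.slots)


selector = SlotSelector(max_slots=3, hangover_frames=2)
selector.update(["a", "b", "c"])        # a, b, c take the three slots
selector.update(["a", "b", "c", "d"])   # d tries to speak but is excluded
selector.update(["a", "b", "d"])        # c pauses; its slot is not yet freed
holders = selector.update(["a", "b", "e", "d"])  # c's slot is freed; e speaks first
```

In this run the freed slot goes to "e", who has only just started speaking, while "d", who has been trying for three frames, remains excluded. Because the selector keeps no record of earlier attempts, nothing in it can favor the participant who has waited longest.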