This invention relates generally to teleconferencing systems and more particularly to systems for encoding speech of multiple speakers for transmission on a limited bandwidth such as ISDN or DSL telephone communication systems.
As is known in the art, teleconferencing systems are in wide use today to enable people at remote locations to communicate through the PSTN. The bandwidth of a normal POTS connection over the PSTN has been limited to about 4 kHz. Today, wider bandwidths are possible on the PSTN using ISDN or DSL, such as ADSL.
In many applications, it is desirable to have better than so-called “toll quality speech” (i.e., that which is provided on a POTS having a 4 kHz limited bandwidth). In these applications, the participants need to hear fricatives and plosive utterances that require frequencies above 4 kHz. Some examples would be where the speaker has a strong accent, or is speaking about technical topics in which accuracy is important, or is teaching a foreign language.
In accordance with the present invention, a teleconferencing system is provided having a predetermined, limited (i.e. available) bandwidth. The system includes a dominant speaker detector, for determining which of a plurality of participants is at least one dominant speaker. A bandwidth allocator is provided for allocating a first portion of the available bandwidth to the at least one determined dominant speaker and allocating a second portion of the available bandwidth to one or more of the non-dominant speaker participants.
In accordance with one embodiment of the invention, the first portion is wider than the second portion.
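The split performed by the bandwidth allocator can be illustrated with a minimal sketch. The names `Allocation` and `allocate_bandwidth`, the `dominant_share` parameter, its 0.75 default, and the use of bit rate as the measure of bandwidth are all assumptions made for illustration; the description above only requires that the first portion be wider than the second.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    dominant_bps: int       # first portion, for the dominant speaker(s)
    non_dominant_bps: int   # second portion, for the other participants


def allocate_bandwidth(available_bps: int, dominant_share: float = 0.75) -> Allocation:
    """Split the available bit rate so the dominant portion is the wider one.

    dominant_share is a hypothetical tuning parameter, not a value taken
    from the description.
    """
    if available_bps <= 0:
        raise ValueError("available_bps must be positive")
    if not 0.5 < dominant_share < 1.0:
        raise ValueError("dominant_share must be between 0.5 and 1.0 (exclusive)")
    dominant = int(available_bps * dominant_share)
    return Allocation(dominant, available_bps - dominant)


# Illustrative figure: 128 kbit/s, e.g. two 64 kbit/s ISDN B channels.
alloc = allocate_bandwidth(128_000)
```

With these illustrative numbers the two portions always sum to the available bit rate, so nothing is left unallocated.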
With such a system, the speech of multiple speakers is encoded for transmission on a limited bandwidth medium, such as, for example, ISDN or ADSL, with at least one selected speaker being transmitted on a separate portion of the limited bandwidth. The system encodes one or more “dominant” speakers on a first channel, and merges all other speakers into a second channel. A “dominant” speaker may be defined according to the needs of the application, but intuitively it is a speaker in a collaboration session who has the “floor”. Each speaker at any moment is a candidate to become “dominant”, replacing a current dominant speaker; “speaking” is distinguished from audible attempts to interrupt the current speaker. The utterances of non-dominant speakers may not be intelligible when merged together on a common second channel, but can still serve their conversational purpose of signaling that another speaker wants the “floor”. This approach is most suitable for conferences with multiple speakers who take turns, but can also cover cases in which there is more than one dominant speaker.
In one embodiment, the first speaker to break silence is placed on a wideband channel and, if someone else talks at the same time, the second speaker is placed on a new channel. The new channel may be another wideband channel or a narrowband channel. The process repeats until the available bandwidth is consumed.
In one embodiment, speakers who cannot be accommodated on individual wideband channels are summed onto a single narrowband channel.
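The channel assignment and summing described in the two preceding embodiments might be sketched as follows. This is only one possible reading: the function names, the 64 kbit/s and 16 kbit/s channel rates, the choice to always reserve one narrowband channel for overflow, and the use of 16-bit PCM frames are assumptions, and the sketch covers only the case where every new individual channel is wideband.

```python
def assign_channels(speakers_in_onset_order, available_bps,
                    wideband_bps=64_000, narrowband_bps=16_000):
    """Give wideband channels to speakers in the order they broke silence.

    One narrowband channel is reserved up front (an assumption); speakers
    who no longer fit on a wideband channel are merged onto it.
    Returns (wideband_speakers, merged_speakers).
    """
    budget = available_bps - narrowband_bps
    wideband, merged = [], []
    for speaker in speakers_in_onset_order:
        if budget >= wideband_bps:
            wideband.append(speaker)
            budget -= wideband_bps
        else:
            merged.append(speaker)
    return wideband, merged


def mix_narrowband(frames):
    """Sum equal-length 16-bit PCM frames sample by sample, with clipping."""
    if not frames:
        return []
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*frames)]


wideband, merged = assign_channels(["A", "B", "C"], available_bps=128_000)
mixed = mix_narrowband([[100, -200, 30000], [50, 25, 10000]])
```

Here speaker "A" receives the only wideband channel that fits, while "B" and "C" share the narrowband channel. The clipping in `mix_narrowband` keeps the summed samples within the 16-bit range.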
In accordance with another feature of the invention, a teleconferencing system is provided having an available bandwidth. The system includes a plurality of microphones, each one being associated with a corresponding one of a plurality of participants. A dominant speaker detector is responsive to signals produced by the microphones and determines which of a plurality of participants is a dominant speaker. A bandwidth allocator allocates a wider portion of the available bandwidth to the detected dominant speaker and a narrower portion of the available bandwidth to a non-dominant speaker participant.
In one embodiment, the dominant speaker detector of the teleconferencing system responds to the speech from the one, or ones, of the speaking participants passed by the speaker detector, determines which of the detected speaking participants is a dominant speaker, and produces a signal indicating the participant determined to be the dominant speaker. The system transmits a speaker code to indicate which participant has been determined to be the dominant speaker. This speaker code can be inserted into the header of one or more of the audio packets transmitted on the medium, or transmitted in a separate speaker code channel. The speaker code can be used by a client program to produce a visual indication of the speaker.
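Carrying the speaker code in an audio packet header could look like the sketch below. The header layout (a one-byte speaker code followed by a two-byte payload length, in network byte order) and the function names are hypothetical; the description does not specify a packet format.

```python
import struct

# Hypothetical header: 1-byte speaker code, 2-byte payload length, big-endian.
HEADER = struct.Struct("!BH")


def pack_audio_packet(speaker_code: int, payload: bytes) -> bytes:
    """Prefix an encoded audio payload with the dominant-speaker code."""
    if not 0 <= speaker_code <= 255:
        raise ValueError("speaker_code must fit in one byte")
    return HEADER.pack(speaker_code, len(payload)) + payload


def unpack_audio_packet(packet: bytes):
    """Return (speaker_code, payload) so a client can show who is speaking."""
    speaker_code, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return speaker_code, payload


packet = pack_audio_packet(3, b"\x01\x02")
code, audio = unpack_audio_packet(packet)
```

A client program would read `code` from each received packet and update its visual indication of the speaker accordingly.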
In accordance with another feature of the invention, a method is provided for transmitting speech through a teleconferencing system having an available bandwidth. The method includes: determining which of a plurality of participants is a dominant speaker; and allocating a first portion of the available bandwidth to the detected dominant speaker and a second portion of the available bandwidth to a non-dominant participant.
In one embodiment, the method includes merging the non-dominant speakers among the participants into a common portion of the available bandwidth.