This invention relates to conferencing circuits, and more particularly to conference circuits for connection in a digital audio signal transmission system for the transmission of encoded audio data. Such conference circuits may be used in conventional telephony, or in voice over Internet Protocol audio transmission, or for the handling of audio data for video conferencing facilities.
Conferencing circuits are well known in the field of telephony. In general terms, a conference circuit enables three or more participants (or conferees) to talk to one another at the same time. Early conference circuits, employed in analogue telephone systems, provided conferencing by summing the signals of all the participants and transmitting the resultant signal to all the conferees, with the exception of the talker, who receives the resultant signal minus his own signal.
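By way of illustration only, the summing arrangement described above may be sketched as follows. The sketch assumes linear sample values held in lists and treats every conferee as a potential talker, so that each receives the total minus his own contribution; the function name `mix_minus` and the sample values are hypothetical.

```python
from typing import List


def mix_minus(signals: List[List[float]]) -> List[List[float]]:
    """Return, for each conferee, the sum of all signals minus his own.

    Assumption: every signal is a list of linear samples of equal length.
    """
    n_samples = len(signals[0])
    # Sum of all conferees' samples at each sample instant.
    total = [sum(sig[t] for sig in signals) for t in range(n_samples)]
    # Each conferee hears the total with his own contribution removed.
    return [[total[t] - sig[t] for t in range(n_samples)] for sig in signals]


# Three conferees, two samples each (illustrative values only).
outputs = mix_minus([[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]])
```

A conferee who is silent contributes zero, and therefore receives the full resultant signal.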
A known approach to conferencing with digital techniques involves converting the digital signals back to the analogue domain, performing an analogue conferencing function, and then re-converting the resultant analogue conference signal into a digital signal. Converting to analogue and re-converting to digital of course adds distortion to the signals involved. It is therefore preferred to perform the summing of signals directly within the digital domain. However, since the digital signals are generally not linear, but rather are encoded using nonlinear pulse code modulation (PCM), it has been proposed to first linearise the digital signals before the subsequent combination of signals, and then re-code those signals, but remaining in the digital domain.
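The linearise, combine and re-code sequence may be sketched as follows. The sketch assumes the continuous mu-law companding characteristic with mu = 255 as the nonlinear coding; this is an assumption made for illustration, since practical telephony PCM uses a segmented 8-bit approximation of such a curve. The function names are hypothetical.

```python
import math

MU = 255.0  # assumption: mu-law companding with mu = 255


def compress(x: float) -> float:
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y: float) -> float:
    """Inverse of compress: recover the linear sample (linearisation)."""
    return math.copysign(((1.0 + MU) ** abs(y) - 1.0) / MU, y)


def mix_companded(a: float, b: float) -> float:
    """Combine two companded samples by summing in the linear domain."""
    linear_sum = expand(a) + expand(b)
    # Clip to the representable range before re-coding.
    linear_sum = max(-1.0, min(1.0, linear_sum))
    return compress(linear_sum)


mixed = mix_companded(compress(0.1), compress(0.2))
```

Here `expand(mixed)` is approximately 0.3, whereas adding the two companded values directly would give a value greater than 1, which is why the signals are linearised before combination.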
U.S. Pat. No. 4,224,688 describes a digital conference circuit in which the audio signals from a number of conferees are combined directly in PCM format to obtain signals to be provided to the various participants of the conference.
In recent years, increasingly efficient encoding algorithms have been developed for encoding digital audio for transmission across networks, in order to reduce the bandwidth requirements of the network carrying the signals, for example in Voice over IP applications. Generally, these additional encoding algorithms give rise to some degradation of the quality of the voice signal due to the loss of information inherent in lossy compression algorithms. Various standards have evolved concerning the compression of voice data, for example the ITU G.729 voice compression standard which is applicable to video conferencing systems, and the ITU G.723.1 high compression ratio standard. Numerous other standards exist for different system requirements. Generally, these compression techniques are lossy and are very non-linear in nature, so that an analysis of the encoded data does not immediately reveal useful information concerning the sample which has been encoded.
Therefore, conferencing circuits for processing encoded digital audio conventionally decode the encoded voice stream back to the original format, for example to conventional PCM format, to enable comparison of the original signals so that the conferencing functions can be performed. The output of the conference circuit then needs to be re-encoded so that it provides a suitable input to the individual audio devices of the network (for example telephones or computers). The introduction of the conferencing circuit therefore gives rise to additional decoding and encoding, and with lossy compression techniques this so-called “double encoding” can give rise to a serious reduction in signal quality. This problem arises because the existing conference algorithms can only work in the linear domain for mixing the multiple voice streams.
According to the present invention, there is provided a conference system for connection in a digital audio signal transmission system, for receiving N encoded audio signals from N conferees, wherein N is a positive integer greater than or equal to three, the system comprising:
a decoding system for decoding the N audio signals;
a selection system for selecting one from the N decoded audio signals;
a switching system for switching the selected encoded audio signal from an input stage of the conference system to an output of the conference system, for transmission at least to the N−1 conferees from whom the selected audio signal did not originate.
In the conference system of the invention, decoding of the multiple input signals only needs to be performed to enable the desired comparison of the audio signals to be carried out by an audio selector. Once one audio signal has been selected, the encoded audio signal from an input side of the conference system can be routed to the output without any intermediate decoding and re-encoding, as carried out in the prior art. By “an input stage” is meant a part of the conference circuit where the input signals have not yet passed through a decoding stage.
As one example, the selection circuit may comprise means for measuring the volume of the decoded audio signals, so that a signal strength measurement dictates which audio signals are provided to the conferees.
These features of the invention enable a single audio signal to be selected for transmission to all participants of the conference (apart from the originator of the audio signal).
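The selection and switching described above may be sketched as follows, under stated assumptions. Volume is taken here to be the root-mean-square level of one decoded frame, which is only one possible measure. The decoder is passed in as a function because the codec is not specified; `toy_decode` is a hypothetical stand-in and not a real codec. The decoded samples are used only for the comparison, and the frame delivered to the other conferees is the original encoded frame, untouched.

```python
import math
from typing import Callable, Dict, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a frame (assumed volume measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def route_frame(
    encoded: Dict[str, bytes],
    decode: Callable[[bytes], Sequence[float]],
) -> Dict[str, bytes]:
    """Map each recipient to the encoded frame of the loudest conferee.

    Decoding is used only to measure volume; the selected frame is
    passed through without decoding and re-encoding.
    """
    volumes = {who: rms(decode(frame)) for who, frame in encoded.items()}
    selected = max(volumes, key=volumes.get)
    # The originator of the selected signal does not receive it back.
    return {who: encoded[selected] for who in encoded if who != selected}


def toy_decode(frame: bytes) -> Sequence[float]:
    """Hypothetical decoder: offset-binary bytes to signed samples."""
    return [b - 128 for b in frame]


frames = {"A": bytes([128, 129]), "B": bytes([200, 60]), "C": bytes([130, 126])}
routed = route_frame(frames, toy_decode)
```

In this example conferee B has the greatest volume, so conferees A and C each receive B's encoded frame exactly as it arrived.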
However, the circuit may additionally comprise:
a conferencing entity for combining selected ones of the decoded audio signals;
an encoding unit for re-encoding the combined decoded audio signals; and
means for switching the combined re-encoded audio signal to the output of the conference system, to be used in place of the single audio signal selected previously, for a period.
This additional feature enables a combination of audio signals to be provided to the participants of the conference, when this is appropriate. A control unit is preferably provided for determining whether a combination of re-encoded signals or a single encoded signal is appropriate, and this determination may be based on the relative volumes of the audio signals. For example, if there is a dialogue between two participants simultaneously and with approximately equal volume (for example an argument) it will be appropriate for all participants to hear both of these parties.
Informational tones (such as conference warning tones or barge-in tones, indicating that an operator has joined the call) or other tones would also require to be conferenced to all parties. A tone generation system may therefore provide data to the selection system for this purpose. The features above enable the conference system to revert to a conventional approach in such circumstances, whereas during the majority of the conference proceedings, a single audio signal may be provided to the participants, which has not been subjected to decoding and re-encoding.
To perform the functions described above, the control unit preferably comprises means for determining the audio signal with the greatest volume, and for measuring the volume difference between that audio signal and the remaining audio signals. If the volume difference between the two signals of greatest volume is below a threshold, the selected audio signals are combined by the conferencing entity, re-encoded, and passed to the output in place of the single encoded audio signal.
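The determination made by the control unit may be sketched as follows. The sketch assumes that volumes are expressed in decibels, that the threshold is 6 dB, and that the two signals of greatest volume are the ones combined; none of these choices is prescribed above, and the name `choose_output` is hypothetical.

```python
from typing import Dict, List, Tuple


def choose_output(
    volumes_db: Dict[str, float],
    threshold_db: float = 6.0,  # assumption: illustrative threshold value
) -> Tuple[str, List[str]]:
    """Decide between passing one encoded signal and mixing two.

    Returns ("pass_through", [loudest]) when one conferee dominates,
    or ("mix", [loudest, runner_up]) when the two signals of greatest
    volume differ by less than the threshold.
    """
    ranked = sorted(volumes_db, key=volumes_db.get, reverse=True)
    loudest, runner_up = ranked[0], ranked[1]
    if volumes_db[loudest] - volumes_db[runner_up] < threshold_db:
        # Near-equal volumes: decode, combine and re-encode.
        return ("mix", [loudest, runner_up])
    # One dominant talker: switch the encoded signal straight through.
    return ("pass_through", [loudest])


dialogue = choose_output({"A": -20.0, "B": -22.0, "C": -50.0})
single = choose_output({"A": -10.0, "B": -30.0, "C": -50.0})
```

In the first case the two loudest conferees are within 2 dB of one another and the combined path is chosen; in the second the 20 dB difference means a single encoded signal is passed through.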
The invention also provides a method of providing a conference facility for N conferees communicating over a digital audio signal transmission system, wherein N is a positive integer greater than or equal to three, the method comprising:
providing encoded audio signals from the N conferees to a conference system;
decoding the N audio signals;
selecting at least one of the audio signals by analysing the N decoded audio signals, wherein if only one audio signal is selected, the encoded audio signal for the selected signal is switched to the output of the conference system and transmitted to at least the N−1 conferees from whom the selected audio signal did not originate.
As described above, the step of selecting at least one audio signal may be based on the volume of the signals, and the relative volumes will dictate whether a single audio signal is switched to the output of the conference circuit or whether a combination of signals, having undergone decoding and subsequent re-encoding, is instead provided to the output.
There are, of course, various alternatives to conference algorithms based on volume determination, and these may equally be employed.