The invention relates to a conference system comprising: a plurality of speaker units to be arranged in one conference space, a central unit coupled to the speaker units, at least one of the speaker units comprising:
a microphone for generating a microphone signal,
a speech signal output for supplying a speech signal to the central unit in response to the microphone signal,
a listening signal input for receiving a common listening signal from the central unit, and
a loudspeaker for the acoustic reproduction of the common listening signal.

a differential stage having a first input for receiving the microphone signal and a second input for receiving a compensation signal, and having an output coupled to the speech signal output to supply a compensated microphone signal in response to the difference between the microphone signal and the compensation signal, and
an adaptive filter having a signal input for receiving the common listening signal, having a control input for receiving the compensated microphone signal, and having a signal output for supplying the compensation signal, the adaptive filter having an impulse response which is an estimate of the impulse response of a short echo path as a result of a direct acoustic coupling between the loudspeaker and the microphone of the relevant speaker unit and of an indirect acoustic coupling between the loudspeaker and the microphone of the relevant speaker unit via objects in the proximity of the speaker unit, in which estimate the impulse response of a long echo path as a result of an acoustic coupling between all the loudspeakers of all the speaker units and the microphone of the relevant speaker unit via the bounding surfaces of the conference space is ignored.

a status signal input for receiving from the central unit a status signal for signalling a speech status or a listening status to the speaker unit, and
first coupling means for coupling the speech signal output to the microphone signal when the status signal indicates the speech status and to the compensated microphone signal when the status signal indicates the listening status.

a first decimator for reducing the first sampling rate of the microphone signal to a second sampling rate which is a predetermined decimation factor lower than the first sampling rate, and for supplying a decimated microphone signal to the first input of the differential stage,
a second decimator for reducing the first sampling rate of the common listening signal to the second sampling rate and for supplying a decimated common listening signal to the signal input of the adaptive filter,
an interpolator for increasing the second sampling rate of the compensated microphone signal at the output of the differential stage and for supplying the compensated microphone signal with the first sampling rate.

a comparator for comparing a power value of the microphone signal with a power value of the compensated microphone signal and for supplying a switching signal if the power value of the compensated microphone signal exceeds the power value of the microphone signal,
second coupling means for replacing the compensated microphone signal by the microphone signal in response to the switching signal.
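The cooperation of the differential stage, the adaptive filter and the comparator listed above can be illustrated with a minimal sketch. The text does not name an adaptation algorithm, so the normalised LMS update used here is an assumption. The function names, the filter length of 8 taps, the step size and the three-tap test echo path are all hypothetical. The decimators, the interpolator and the status-controlled first coupling means are left out for brevity.

```python
import random

def nlms_cancel(listening, microphone, taps=8, mu=0.5, eps=1e-8):
    """Return (compensated microphone signal, final filter weights).

    Assumption: a normalised LMS update; the source does not specify one.
    """
    weights = [0.0] * taps   # estimate of the short echo path impulse response
    history = [0.0] * taps   # latest listening-signal samples, newest first
    compensated = []
    for x, d in zip(listening, microphone):
        history = [x] + history[:-1]
        # Adaptive filter output: the compensation signal.
        y = sum(w * h for w, h in zip(weights, history))
        # Differential stage: microphone signal minus compensation signal.
        e = d - y
        # The compensated signal acts as the control input of the filter.
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        compensated.append(e)
    return compensated, weights

def power(samples):
    """Mean square value of a block of samples."""
    return sum(s * s for s in samples) / len(samples)

def select_output(microphone, compensated):
    """Comparator and second coupling means: fall back to the microphone
    signal if the compensated signal carries more power than it."""
    return microphone if power(compensated) > power(microphone) else compensated

# Hypothetical test signals: noise as listening signal, a 3-tap short echo path.
random.seed(0)
listening = [random.uniform(-1.0, 1.0) for _ in range(2000)]
short_echo_path = [0.6, 0.3, -0.1]
microphone = [
    sum(h * listening[n - k] for k, h in enumerate(short_echo_path) if n - k >= 0)
    for n in range(len(listening))
]
compensated, weights = nlms_cancel(listening, microphone)
mic_tail, comp_tail = microphone[-200:], compensated[-200:]
chosen = select_output(mic_tail, comp_tail)
```

In this noise-free toy case the filter weights settle near the simulated echo path, so the compensated tail carries far less power than the microphone tail and the comparator leaves the compensated signal in place.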
Such a system is known from European Patent Specification EP 0,191,492. Such a conference system, also referred to as a congress system, meeting system or discussion system, serves to improve the intelligibility of speech of the participants in a meeting held in one space, for example a room or a hall. For this purpose, the participants are seated near the speaker units and speak into the microphone of the speaker unit. The microphone signal is available at the speech signal output of the speaker unit. The speaker units are coupled to the central unit, in which the speech signals from the speaker units can be selected and added to form the common listening signal, which is transferred to the loudspeakers of the speaker units. In order to obtain a maximal system gain, only the speech signals from those participants who are speaking are selected and added and, moreover, the transfer of the common listening signal to the loudspeakers of the relevant speaker units is interrupted to preclude acoustic feedback. In the prior-art conference system, selection is based on indication signals produced by means of push-buttons on the speaker units. Since the participants often forget to actuate the push-button, there is a need for an automatic speaker detection system.
In the central unit, it is possible to compare the signal levels of all the speech signals with the average speech signal level. A speaker is then detected when the level of his speech signal is higher than the average level. As a result of the direct acoustic coupling between the loudspeaker and the microphone of the non-speaking speaker units, this average level is comparatively high. Owing to this high average level as well as the required margin, the speech signal of a speaker must be fairly large to exceed the average level. As a result, the beginnings of sentences and words in particular are lost.
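The level comparison described above, and the way acoustic coupling makes it miss soft speech onsets, can be sketched as follows. The function name, the multiplicative form of the margin and all level values are assumptions chosen for illustration; the text only states that a margin is required.

```python
def detect_speakers(levels, margin=1.5):
    """Return the indices of units whose speech level exceeds the average
    level of all units by the given (hypothetical) multiplicative margin."""
    average = sum(levels) / len(levels)
    return [i for i, level in enumerate(levels) if level > margin * average]

# Without loudspeaker-to-microphone coupling the idle units are quiet,
# so the onset at unit 3 clears the threshold of 1.5 * 2.25 = 3.375.
quiet_room = detect_speakers([1.0, 1.0, 1.0, 6.0])

# With coupling the idle units pick up the listening signal, the average
# rises to 4.5 and the same onset stays below the threshold of 6.75.
coupled_room = detect_speakers([4.0, 4.0, 4.0, 6.0])
```

Here `quiet_room` is `[3]` while `coupled_room` is empty: the same speech level is detected only when the idle microphones do not raise the average.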
Another speaker-detection possibility is known from loudspeaking telephony. The speaker units are then located in different spaces. Such telephone conferencing systems also require speaker detection to control the so-called voice switch, which is necessary to prevent acoustic feedback. This is accomplished by the use of an echo canceller, which comprises a filter in which the listening signal is converted into a signal which is an estimate of the microphone signal. The microphone signal and the estimated signal are subtracted from one another. Speech is then detected when the actual microphone signal deviates from the estimated signal as a result of the contribution of the speaker's voice to the microphone signal. The filter is often an adaptive filter having an impulse response corresponding to the acoustic impulse response of the space in which the speaker unit is situated. This acoustic impulse response is unknown a priori and may change. A very complex adaptive filter having a long impulse response is therefore required to allow correct operation under all possible operating conditions. In this respect reference is made to: W. Armbruster, "High Quality Hands-Free Telephony using Voice Switching Optimised with Echo Cancellation", Signal Processing IV: Theories and Applications, Elsevier, EURASIP, 1988. However, the use of the known echo cancellers for the purpose of speaker detection in a conference system has the drawback that the complex adaptive filters are expensive, inter alia because they require a comparatively large chip area in the case of integration in a chip.
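The detection principle of such an echo canceller can be sketched as follows, assuming for simplicity that the filter has already converged to a fixed impulse response. The function names, the frame length, the power threshold and the test signals are hypothetical and are not taken from the cited reference.

```python
def echo_estimate(listening, impulse_response):
    """FIR filter: estimate the echo in the microphone signal by convolving
    the listening signal with the estimated echo path."""
    estimate = []
    for n in range(len(listening)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * listening[n - k]
        estimate.append(acc)
    return estimate

def detect_speech(listening, microphone, impulse_response, frame=4, threshold=0.01):
    """Flag each frame in which the microphone signal deviates from the
    echo estimate by more than the (assumed) power threshold."""
    estimate = echo_estimate(listening, impulse_response)
    residual = [m - e for m, e in zip(microphone, estimate)]
    flags = []
    for start in range(0, len(residual) - frame + 1, frame):
        block = residual[start:start + frame]
        flags.append(sum(r * r for r in block) / frame > threshold)
    return flags

# Hypothetical signals: echo only in the first frame, echo plus local
# speech in the second frame.
listening = [1.0, -1.0, 0.5, 0.0, 1.0, -0.5, 0.25, 0.0]
echo_path = [0.5, 0.25]
speech = [0.0, 0.0, 0.0, 0.0, 0.3, -0.3, 0.3, -0.3]
echo = echo_estimate(listening, echo_path)
microphone = [e + s for e, s in zip(echo, speech)]
flags = detect_speech(listening, microphone, echo_path)
```

The first frame contains only echo and leaves no residual, while the second frame is flagged because of the added local speech. The inner loop runs once per filter tap for every sample, which is why a long impulse response makes the filter costly.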