This invention relates to acoustic feedback in communications devices, specifically speakerphone station sets, and particularly to the reduction of singing caused by feedback of the speaker output into the station set microphone. It also relates, in general, to any system in which the audio output of a speaker may feed back into a microphone of the system, causing singing (positive feedback) to occur. It specifically concerns a method and apparatus for determining the level of acoustic energy at a microphone of the communication device that is due to the output of a speaker, and for distinguishing such feedback energy from that of the spoken input to the microphone.
The amount of acoustic energy output of a speaker being fed back into a microphone of a duplex acoustic system with gain (i.e., a device used for communication purposes) determines the system acoustic stability. Such stability is important to prevent the generation of “singing,” in which feedback of the speaker output onto the microphone causes reinforcement of sound from the loudspeaker and thus causes the speaker to emit a howl or similar high-pitched noise.
There are existing methods of preventing this singing effect that operate by inserting switched loss into either the speaker path or the microphone path to ensure system stability. The amount of switched loss to insert is determined by comparing the microphone signal level to the level of the speaker signal received from the network via a hybrid connected to the speakerphone. Examination of the relative levels of the two signals permits a determination as to which signal is presently active (i.e., speaker output or voice input). Loss is inserted in the path determined to be presently inactive, ensuring that the total electro-acoustic loop gain of the speakerphone and the network is less than one at the frequency at which zero degrees of loop phase shift is experienced. This criterion, known as the Nyquist stability criterion, determines how much loss must be present in the electro-acoustic loop consisting of the speakerphone and the network to prevent sustained oscillations and thereby ensure stability. In many arrangements, the overall loss inserted to maintain stability is the sum of the signal-dependent switched loss and a fixed loss amount, which provides “sing” margin to compensate for inaccuracies in determining the amount of loop loss necessary to prevent oscillations at specific frequencies.
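The loss-switching decision described above can be sketched as follows. This is a minimal illustration and not the circuit of any particular speakerphone: the function name, its parameters, and the 6 dB default margin are assumptions made for the example, and the loop gain is taken as already known, although measuring it is the difficult part in practice.

```python
def switched_loss_db(mic_level_db, speaker_level_db, open_loop_gain_db,
                     sing_margin_db=6.0):
    """Return (mic_path_loss_db, speaker_path_loss_db).

    All names and the 6 dB default margin are illustrative assumptions.
    """
    # Loss needed to pull the loop gain below 0 dB (gain < 1), plus sing margin.
    required = max(0.0, open_loop_gain_db + sing_margin_db)
    if mic_level_db > speaker_level_db:
        # Near-end talker is active, so the speaker path is the inactive one.
        return 0.0, required
    # Far-end speech is active, so the microphone path is the inactive one.
    return required, 0.0


# Loop gain of +4 dB, local talker 10 dB above the received signal.
mic_loss, speaker_loss = switched_loss_db(-20.0, -30.0, 4.0)
```

In this example the full 10 dB (4 dB of excess gain plus 6 dB of margin) goes into the speaker path, because the microphone signal is the stronger of the two.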
The difficulty with these implementations has been in determining the amount of coupling that exists between the speakerphone's speaker and its microphone (i.e., speaker output vs. voice input). The acoustic environment between speaker and microphone is often unstable, making it difficult to distinguish speaker feedback from voice input at the microphone. In another arrangement, it has been thought possible to have the relative signal levels determined at the hybrid connection of the speakerphone to the telephone network. It is theoretically possible to sample incoming and outgoing speech at the hybrid connecting the phone to the network to infer loop gain, but this method has difficulties due to the isolation loss of the hybrid and is often unsatisfactory.
In an exemplary embodiment of the invention, identification of signals (i.e., voice input or speaker output) in a process for reducing acoustic feedback in a communication device is accomplished by adding a signature noise (i.e., an identification mark) to output signals radiated by the speaker, enabling these signals to be separated from speech input to the microphone. Having identified the signal (i.e., speech output) likely to cause a “singing” phenomenon, appropriate insertion loss may be added to the appropriate speech path within the communication device to reduce the feedback and thus the probability of singing.
In the exemplary embodiment of the invention, the signature noise applied to the speech output comprises a pseudo-noise signal consisting of a digitally generated sequence (i.e., a PN sequence). The envelope of the speech signal fed to the loudspeaker modulates this PN sequence.
The “signature” (i.e., PN sequence) added to speech issuing from the loudspeaker distinguishes it from voice input to the microphone, allowing it to be used to assist in any loss-switching process. In creating the signature, the speech output of the loudspeaker is combined with a pseudo-noise signal waveform consisting of a digitally generated sequence. The envelope of the speech that is fed to the loudspeaker modulates the PN signal. As such, it represents a low-level, “background” pink noise signal whose amplitude is proportional to the envelope of the speech that issues from the loudspeaker.
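One way to picture the signature is the sketch below, which generates a PN sequence with a linear-feedback shift register and scales it by a running estimate of the speech envelope before adding it to the speaker signal. The register length (7 stages, period 127), the one-pole envelope follower, the `level` factor, and all function names are assumptions chosen for illustration; the text does not specify them, and no pink-noise spectral shaping is attempted here.

```python
def pn_sequence(length, taps=(7, 6), seed=1):
    """Return `length` chips of +1.0/-1.0 from a shift register.

    With feedback taps at stages 7 and 6 the sequence repeats every 127 chips.
    """
    nbits = max(taps)
    mask = (1 << nbits) - 1
    state = seed & mask
    chips = []
    for _ in range(length):
        chips.append(1.0 if state & 1 else -1.0)
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        state = ((state << 1) | feedback) & mask
    return chips


def envelope(samples, alpha=0.05):
    """One-pole follower on the rectified signal (an assumed detector)."""
    env, out = 0.0, []
    for x in samples:
        env += alpha * (abs(x) - env)
        out.append(env)
    return out


def add_signature(speech, level=0.1):
    """Add a PN signature whose amplitude tracks the speech envelope."""
    chips = pn_sequence(len(speech))
    env = envelope(speech)
    return [s + level * e * c for s, e, c in zip(speech, env, chips)]
```

Because the signature is scaled by the envelope, it vanishes during silence and stays a fixed fraction below the speech it accompanies.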
The speech input to the microphone is correlated with a version of the PN sequence, such that the correlated result is in direct proportion to the amount of speech sampled by the microphone issuing from the loudspeaker. Voice input to the microphone does not contain the PN sequence and its level may be separately ascertained. As part of the PN detection process the voice input speech is largely ignored so as to be independent from the PN correlation output. For wideband acoustic systems, the technique may be applied with pink noise “bands”, which utilize separate PN sequences. In such an embodiment, separate correlators may be used to adjust loss in various portions of the audio pass band to effect stability control, minimizing degradation of the entire program content due to feedback in only one portion of the pass band.
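The correlation step can be illustrated with a short sketch. It assumes a 127-chip PN sequence from a 7-stage shift register and a constant-envelope signature, neither of which is specified in the text; the function names and the coupling value of 0.5 are likewise hypothetical.

```python
def lfsr_chips(n, seed=1):
    """Return n chips of +1.0/-1.0 from a 7-stage shift register (period 127)."""
    state, chips = seed & 0x7F, []
    for _ in range(n):
        chips.append(1.0 if state & 1 else -1.0)
        feedback = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | feedback) & 0x7F
    return chips


def pn_correlation(mic, chips):
    """Average product of the microphone samples and the reference chips."""
    return sum(m * c for m, c in zip(mic, chips)) / len(chips)


chips = lfsr_chips(127)
signature = [0.1 * c for c in chips]      # signature as radiated by the speaker
coupled = [0.5 * s for s in signature]    # assumed speaker-to-mic coupling of 0.5
feedback_level = pn_correlation(coupled, chips)
```

The correlator output scales with the coupling, while a signal that does not carry the PN sequence contributes little to it, which is the property the loss-switching decision relies on. In a multi-band arrangement, one such correlator per band, each with its own sequence, would follow the same pattern.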
A second PN sequence may also be used to characterize the acoustic coupling path between the speaker and microphone. This second PN sequence would be made orthogonal to the first PN sequence in order to avoid interference between the two, and would be sent at a constant level through the loudspeaker. This second PN sequence would then be received by the microphone and correlated against the transmitted sequence to determine the impulse response of the acoustic path. This impulse response is then used to control an acoustic echo canceller. The advantage of using a PN sequence in addition to human speech in an acoustic echo canceller is that the PN sequence is a broadband signal and, hence, more accurately probes the acoustic environment.
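As a rough illustration of how correlating the received probe against the transmitted sequence recovers the path response, the sketch below sends one period of a 127-chip sequence through a made-up acoustic path and cross-correlates the result with the chips. The sequence generator, the circular (steady-state, repeating) model of the path, and the example response `path` are all assumptions for the example; the orthogonality to the first sequence and the echo canceller itself are not modelled.

```python
def lfsr_chips(n, seed=1):
    """Return n chips of +1.0/-1.0 from a 7-stage shift register (period 127)."""
    state, chips = seed & 0x7F, []
    for _ in range(n):
        chips.append(1.0 if state & 1 else -1.0)
        feedback = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | feedback) & 0x7F
    return chips


def circular_convolve(x, h):
    """Steady-state response of path h to a probe x that repeats forever."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(len(h))) for i in range(n)]


def estimate_impulse_response(received, chips, taps):
    """Cross-correlate the received signal with the probe at lags 0..taps-1."""
    n = len(chips)
    return [sum(received[(i + k) % n] * chips[i] for i in range(n)) / n
            for k in range(taps)]


probe = lfsr_chips(127)
path = [0.0, 0.6, 0.3, 0.0, -0.1]         # hypothetical speaker-to-mic response
received = circular_convolve(probe, path)
estimate = estimate_impulse_response(received, probe, len(path))
```

The estimate approximates each tap of `path` with a small bias on the order of 1/127, which shrinks as the sequence gets longer. This near-impulse autocorrelation is what makes a broadband PN probe convenient for measuring the path.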