The present invention relates generally to speech or voice recognition and deals more specifically with speech detection in high noise environments to activate a voice operated switch. The invention deals more particularly with a method and related apparatus which distinguishes speech or voice from other sounds over a wide range of noise levels to activate a voice operated switch only in response to speech or voice signals.
A voice operated switch, commonly referred to in the trade as a VOX, is often used to activate a device or apparatus such as a telephone speakerphone amplifier and transmitter, a radio transmitter, an audio amplifier or the like. The VOX is designed to respond to a user's voice or some other sound to activate the device, allowing "handsfree" operation and thus freeing the user's hands for other tasks. Such voice operated switches or VOX's are particularly useful with radio communication devices, such as headphone radio transmitters of the type generally used at industrial, manufacturing and construction sites. Typically, such a VOX communication device includes a microphone, a radio transmitter/receiver and headphones to provide two-way audio communication between users who may be separated from one another by some distance, for example, between ground personnel and a crane operator whose operations they direct, the crane operator being located substantially above the ground and possibly out of visual contact with the activity site. Such VOX communication devices are also necessary in high ambient noise work environments to allow workers or supervisory personnel to communicate with one another in the presence of machine or other noise which would render normal voice communication, even at shouting levels, impossible. The utility of VOX communication devices is well known and understood by those in the art.
One problem generally associated with known VOX's is their inability, or difficulty, to readily discriminate between speech or voice and other sounds or environmental noise. A response delay is therefore deliberately built in to ensure that the detected input energy is likely to be voice or speech before the VOX is activated. This delay is the reason that the first portion of speech is often missing in communications utilizing VOX communication devices.
Another problem generally associated with known VOX's is the necessity to continually and manually reset the threshold setting of the VOX to suit the environmental noise level of each specific noise environment. This is a particular disadvantage if a user moves about between a number of different noise environments, particularly when moving from a high noise environment to a low noise environment. Unless the threshold is reset, the user must speak or shout loudly enough in the low noise environment to exceed the threshold level preset for the high noise environment in order to activate the VOX.
A yet further problem generally associated with known VOX's is that they become activated whenever the energy level of any audible sound exceeds the threshold setting for the VOX, thus causing the VOX communication device to become activated unexpectedly.
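The two threshold-related problems described above can be illustrated with a minimal software sketch of a fixed-threshold VOX. The sketch is offered for illustration only and does not represent the circuitry of any particular known VOX or of the present invention. The function name `fixed_vox`, the threshold value and the sound levels are hypothetical figures chosen solely for the example.

```python
def fixed_vox(level_db: float, threshold_db: float) -> bool:
    """Model a simple VOX: activate whenever the input level exceeds
    a manually preset threshold, regardless of what produced the sound."""
    return level_db > threshold_db


# Hypothetical threshold, set by hand for a high noise work site.
THRESHOLD_DB = 100.0

# Hypothetical input levels (dB), assumed only for this illustration.
events = {
    "normal speech in a quiet area": 65.0,
    "shout in a quiet area": 105.0,
    "machine clang (not speech)": 110.0,
}

results = {name: fixed_vox(level, THRESHOLD_DB) for name, level in events.items()}

for name, active in results.items():
    print(f"{name}: {'activated' if active else 'not activated'}")
```

Under these assumed values, normal speech in the quiet area fails to activate the switch until the user shouts, while a loud non-speech sound activates it unexpectedly, because the comparison considers only the energy level and not the character of the sound.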
It would be useful therefore to provide a VOX that automatically adjusts the threshold setting to permit operation over a wide range of noise levels without the necessity of manually resetting the threshold levels to accommodate changing noise levels.
It would also be useful to provide a VOX that discriminates between noise energy and voice energy so that the VOX only responds to speech or voice to prevent accidental activation in high noise environments.
It is a general aim of the present invention therefore to provide a VOX that has a self-adjusting threshold level for activation in environments having different noise levels and one which discriminates between speech or voice and other sounds including noise energy to prevent accidental activation of the VOX.
It is a further aim of the present invention to provide a VOX which is easy to use and operates reliably in high noise environments, typically 115 dB or higher.
It is a yet further aim of the present invention to provide a VOX which detects and discriminates between speech or voice and other sounds without the use of complicated and relatively expensive digital signal processing (DSP) techniques and circuitry.