Communications devices such as handsets (mobile and wired telephones) and headsets of all types typically rely on one of a number of sensors to detect the speech of a user. These sensors include acoustic microphones, physiological microphones, and accelerometers.
One device commonly used for detecting speech is an acoustic pressure sensor or microphone. One example of an acoustic pressure sensor is an electret condenser microphone, which can currently be found in numerous mobile communication devices. These electret condenser microphones have been miniaturized to fit into mobile devices such as cellular telephones and headsets. A typical device might have a diameter of 6 millimeters (mm) and a height of 3 mm. The problem with these electret condenser microphones is that, because they are designed to detect acoustic vibrations in the air, they generally detect ambient acoustic noise in addition to the speech signal of interest. The received speech signal therefore often includes noise (such as noise from engines, people, and wind), much of which cannot be removed without degrading the speech quality. This noise presents significant qualitative and functional problems for a variety of downstream speech processing applications of the host communication device, including, for example, basic voice services and speech recognition.
Another device used for detecting speech is a physiological microphone, also referred to as a “P-Mic”. The P-Mic detects body vibrations generated during speech through the use of a small gel-filled cushion coupled to a piezo-sensor. Since the gel cushion couples well to human flesh and poorly to the air, the P-Mic can accurately detect speech vibrations when placed against the skin, even in high noise environments. However, this solution requires firm contact between the gel cushion and the skin to work effectively, a requirement the consumer market is unlikely to accept. Further, at a size of approximately 1.5 inches on a side, the P-Mic is typically too large for deployment in many consumer communication products. Additionally, the P-Mic is too expensive to see widespread use in consumer products such as headsets. Finally, the P-Mic does not use a standard microphone electrical interface, so additional circuitry is required to connect the P-Mic to an analog-to-digital converter, increasing both size and implementation cost.
Yet another device commonly used for detecting speech, which is similar in principle to the P-Mic, is a Bone Conduction Microphone (BCM). The BCM includes an accelerometer used to measure skin/flesh vibrations generated by speech. The accelerometer of the BCM senses its own motion caused by speech vibrations. However, much like the P-Mic, accelerometers require good contact with the skin to work effectively and are currently too expensive and electronically cumbersome to be used in commercial communications products. As with the P-Mic, accelerometers cannot use a standard microphone electrical interface, so additional circuitry is required to connect the accelerometer to an analog-to-digital converter, thereby increasing both size and implementation cost.
In the drawings, the same reference numbers identify identical or substantially similar elements or acts. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the Figure number in which that element is first introduced (e.g., element 100 is first introduced and discussed with respect to FIG. 1).