This application claims the benefit of U.S. Provisional Application No. 60/009,553, filed Jan. 2, 1996, the disclosure of which is hereby incorporated herein by reference.
The present invention relates, in general, to vibration sensors, and more particularly to detectors for sensing acoustic and other vibrations and for producing corresponding electric output signals. Still more particularly, the invention relates to unique microelectromechanical (MEM) acoustic filters and to methods for fabricating such filters for vibration sensing and, in preferred embodiments, for sound recognition systems.
The frequency spectrum of vibrating systems contains information that can be used in many ways. This spectrum can be used for wear diagnostics, passive and active noise control, material characterization, structure identification, and for speech processing. In all of these applications, an analysis is performed on the vibration signature of a vibrating system. Thus, for example, a vibration is sensed by means of a preprocessing transducer, such as a microphone or an accelerometer. The preprocessed information is analyzed using, for example, a Fast Fourier Transform (FFT) to extract the frequency spectrum of the vibration. This spectrum is then processed according to the specific application.
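By way of illustration only (this sketch is not part of the disclosed invention), the FFT preprocessing step described above may be expressed in a few lines of Python. The sample rate and the two test tones are arbitrary assumptions chosen for the example:

```python
# Illustrative sketch: extract the frequency spectrum of a vibration
# signal with an FFT. The 8 kHz sample rate and the 1 kHz / 1.5 kHz
# test tones are assumed values for demonstration only.
import numpy as np

fs = 8000                                  # sample rate, Hz (assumed)
t = np.arange(0, 1.0, 1.0 / fs)            # one second of samples
# Synthetic "vibration signature": two tones of different amplitude.
signal = 1.0 * np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)

spectrum = np.abs(np.fft.rfft(signal))           # magnitude spectrum
freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)   # frequency bins, Hz

peak = freqs[np.argmax(spectrum)]                # dominant component
print(peak)  # → 1000.0
```

The resulting spectrum (here, a dominant 1 kHz component) is what an application-specific stage, such as wear diagnostics or speech processing, would then analyze.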
In the area of wear diagnostics, the frequency signature of a vibrating system is used to study the mechanical wear of the system. In such systems, the noise from a vibrating system is analyzed to determine its frequency spectrum. As a system wears, the spectrum changes and, depending on the change, it is possible to predict what part of the system needs to be replaced before the system fails. Such systems are important for applications requiring controlled shut-down, such as is required in electric power plants, nuclear power plants, aircraft, and the like.
In active noise control systems, a sound radiation frequency signature of a system is mapped, and loudspeakers are driven to produce controlled interference with the radiated sound. In addition, the frequency signature at various locations within the pattern of radiation is used to control the quality of the radiated sound. The radiation pattern of the system is detected by microphones, shaped polyvinylidene fluoride (PVDF) sensors, or accelerometers.
The elastic properties of a solid can be evaluated using its acoustical signature by exciting an object with ultrasonic waves. The signature is obtained by recording the acoustical signal reflected from the object at different distances. Materials can also be characterized by measuring the reflected time-frequency spectrum, and the velocity and/or the absorption spectrum of acoustic waves can be used to characterize the biochemical properties of liquids. It is also possible to identify a structure using the acoustic signature of the reflected and radiated sound wave. Since the structure signature is not only a function of the material, but also of the geometry of the structure, the spectrum changes when the structure is moved. Therefore, a change in the acoustic signature of a structure may signal a change in its mechanical properties or a change in its location.
Speech processing is one of the most complicated applications of acoustic signature detection and processing. Conventional speech processors extract information from sound using spectrograms created by Fast Fourier Transform (FFT) or Linear Predictive Coding (LPC) algorithms. Both of these methods present a frequency spectrum of an input acoustic signal to a processor which analyzes the signal according to a selected computational algorithm. However, this method of processing speech signals is distinguishable from, and is inferior to, the signal processing which occurs in the human ear. To improve on this method, speech processors based on physiological models of the ear have been developed. Such models are based on the mechanical response of the ear, including the response of the outer ear, the middle ear, and the inner ear which contains the cochlea. The cochlea contains elements, including the basilar membrane, which are responsible for converting the acoustical signals into electrical signals on the nerve fibers of the ear. The mechanical properties of the basilar membrane change along its length so that it is capable of resonance at frequencies from about 50 Hz up to about 20 kHz. When sound excites the cochlea, it locally vibrates the basilar membrane. The location of the vibration along the basilar membrane depends on the exciting frequency, so that the membrane acts as a mechanical bandpass filter.
Attempts have been made to replicate the functions of the ear in mechanical or electrical structures to produce an artificial ear, as for speech recognition systems and the like. Thus, a basic speech recognition system might include a transducer or microphone for converting the speech sound waves into electrical signals, an FFT or LPC algorithm for preprocessing the received signals, and a pattern matching and classification processing section, where a digitized version of the speech signal is compared with stored word patterns from a memory. By applying a decision rule processing algorithm to the pattern-matched speech signal and stored word patterns, an identification output corresponding to the speech sound waves is provided. The pattern matching, classification and identification components of a speech recognition system can generally be represented in three ways: first, through models based on conventional algorithms; second, through models based on neural networks; and third, through models based on human physiology. Models which are based on neural networks receive a signal that has been preprocessed using FFT or LPC algorithms and deliver it to a simulated neural network. Active elements which represent neurons and synapses within the network control the quality of a connection between a first neuron and a second neuron. Such models are widely used for pattern recognition.
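Purely as an illustration (not a description of the disclosed invention), the pattern-matching and decision-rule stage described above may be sketched as follows. The stored word patterns, feature vectors, and word labels are invented for the example:

```python
# Hypothetical sketch of a pattern-matching stage: a digitized
# utterance, reduced to a spectral feature vector, is compared with
# stored word patterns from memory, and a nearest-match decision rule
# produces the identification output. All vectors here are assumed.
import numpy as np

stored_patterns = {                      # assumed "memory" of word templates
    "yes": np.array([0.9, 0.1, 0.4]),
    "no":  np.array([0.2, 0.8, 0.3]),
}

def recognize(features):
    """Decision rule: identify the stored word whose pattern is closest."""
    return min(stored_patterns,
               key=lambda word: np.linalg.norm(features - stored_patterns[word]))

observed = np.array([0.85, 0.15, 0.35])  # preprocessed input signal (assumed)
print(recognize(observed))               # → yes
```

A practical system would of course use far richer features (e.g., FFT- or LPC-derived spectrograms) and a more elaborate classifier, but the compare-and-decide structure is the same.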
Models based on human physiology imitate the function of the human ear in order to perform speech recognition. In such models, preprocessing is carried out by components which represent the outer ear, the middle ear and the inner ear, including the cochlea. As noted above, the cochlea contains all the parts that are responsible for converting the acoustic signals into electrical signals which are sent to the nerve fibers of the ear. Models based on human physiology include software models of the physical structure of the cochlea for channeling the acoustic signals along the cochlea, models of the tectorial, Reissner's, and basilar membranes, and models of the inner and outer hair cells.
Since the mechanical resonance properties of the basilar membrane change along its length, when sound enters the ear, it excites the membrane and produces a travelling wave that decays sharply after the resonance point is reached. Thus, the basilar membrane and the cochlea provide a mechanical filter that divides the received frequency spectrum into bands of frequencies. The response of the basilar membrane is digitized by the inner hair cells attached along its length, these cells being mechanico-electric transducers that translate the mechanical vibrations of the basilar membrane into electrical signals. The inner hair cells are connected to the nerve fibers that are the electrical input for the physiological system that processes the sound.
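The band-dividing action of the basilar membrane described above may be illustrated in software as a bank of bandpass filters (a common modeling simplification, offered here only as a sketch; the channel count, filter order, and Butterworth design are assumptions, not the disclosed device):

```python
# Illustrative sketch: model the basilar membrane as a bank of
# bandpass filters whose center frequencies span roughly 50 Hz to
# 20 kHz, as in the text. The 24-channel log spacing, second-order
# Butterworth design, and half-octave bands are assumed.
import numpy as np
from scipy.signal import butter, sosfilt

fs = 48000                                     # sample rate, Hz (assumed)
centers = np.geomspace(50, 20000, num=24)      # log-spaced channel centers

def filter_bank(signal):
    """Split a signal into bands, one per modeled 'membrane' location."""
    bands = []
    for fc in centers:
        lo, hi = fc / 2 ** 0.25, fc * 2 ** 0.25          # half-octave band
        sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
        bands.append(sosfilt(sos, signal))
    return bands

t = np.arange(0, 0.1, 1.0 / fs)
tone = np.sin(2 * np.pi * 1000 * t)            # 1 kHz test tone
energies = [np.sum(b ** 2) for b in filter_bank(tone)]
# The channel whose center frequency lies nearest 1 kHz responds most.
print(centers[int(np.argmax(energies))])
```

As with the membrane itself, a single-frequency excitation produces a strong response only at the location (channel) tuned near that frequency, which is the filtering behavior the invention seeks to realize mechanically.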
Many aspects of the hearing process have been studied, and it is possible to find models which imitate various ear functions such as sound localization, spectral estimation, and pitch perception. In addition, auditory-based models performing speech processing have been developed. These models are based on speech coding in the auditory periphery through the average firing rate of the auditory fibers, their discharge synchrony and temporal activity, and the tonotopic organization of the basilar membrane.
When the performance of conventional signal processing techniques and physiological approaches as modeled on a computer were compared, it was discovered that the physiological spectrogram was better at identifying steady-state vowel formants in a noise background. Therefore, it appeared that the physiological modeling was more efficient than other modeling methods. In physiological modeling, however, the basilar membrane motion was replaced by a model of a filter bank, and the responses of the hair cells and of the nerve fibers were modeled to result in a simulated physiological preprocessing of the sound by operation of a computer program. This approach, like the other modeling approaches, was simply an attempt to use a transducer output from a conventional microphone for speech recognition, while trying to overcome the shortcomings of that transducer through the application of computational algorithms. However, none of these models benefits from the natural function of the physiological acoustical filter of the human ear.
A need exists, then, for an acoustic filter which can be made to function in the same manner as a cochlea. This filter would ideally be extremely small and would be capable of being fabricated on semiconductor chips for integration with signal processing circuitry.
Attempts have been made to produce extremely small acoustic transducers utilizing standard polysilicon micromachining methods, but these devices necessarily suffer from the shortcomings of that technology; i.e., certain transducer shapes are not possible because mechanical elements executed in polysilicon are limited to a maximum thickness of approximately 2 micrometers. This limitation makes polysilicon unsuitable for high aspect ratio structures, which are much greater in thickness than in width; in addition, such elements are usually relatively expensive to produce in polysilicon.
An unmet need exists for an extremely small acoustic sensor, and particularly for such a sensor which can function as a mechanical filter in the manner of a cochlea, and still more particularly for an inexpensive structure which can be fabricated on semiconductor chips and integrated with signal processing circuitry.