Automated recognition of human speech is now possible, and generally involves the use of a microphone to convert the acoustic energy of a spoken word into an electrical signal. The electrical signal is then analyzed by a processor that is capable of recognizing the spoken word that was converted into the electrical signal by the microphone. After the spoken word is recognized, that word can serve as an instruction for a computer or other electronic device to take an action, such as a command to adjust the temperature setting in a room. The spoken word can also be converted to a typed word, so a person can dictate a letter or other document which is then converted to a typed document without any further human interaction. Other uses of automatic speech recognition are also possible.
People use many different languages around the world, and some languages use sounds that are not heard in other languages. Some languages also use the tone or pitch of the spoken word to impart meaning, so proper understanding requires recognizing not only the sounds, but also how the sounds are pronounced. Many of the sounds and tones used in various languages are generally spoken within specific frequency ranges, and these ranges vary widely for different sounds and words. Thus, the ability to detect and interpret sounds within a wide range of frequencies is important for an effective speech recognition system.
All languages use intonation, or tone and pitch, to convey emphasis, contrast, and emotional information. However, tonal languages use tone or pitch to distinguish the meaning of words. For example, phonetically identical words can have entirely different meanings if spoken with different inflections, such as (1) a flat inflection, (2) an increasing tone from the beginning of the word to the end of the word, (3) a falling tone from the beginning of the word to the end of the word, or (4) a tone that falls from the beginning of the word, but then increases for the last part of the word. Different tonal languages will use different types of tone or tone contours.
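The four inflections listed above can be illustrated with a minimal sketch that labels a sequence of pitch samples by its overall contour. The function name `classify_tone_contour`, the 10 Hz tolerance, and the sample pitch values are assumptions chosen for illustration only; they are not taken from any particular language or recognition system, and a practical system would likely use a more robust pitch model.

```python
from typing import Sequence


def classify_tone_contour(pitch_hz: Sequence[float],
                          flat_tolerance_hz: float = 10.0) -> str:
    """Label a pitch track as 'flat', 'rising', 'falling', or 'falling-rising'.

    pitch_hz is a hypothetical series of pitch estimates (in Hz) taken from
    the beginning of a word to its end. flat_tolerance_hz is an assumed
    threshold below which a pitch change is treated as no change.
    """
    if len(pitch_hz) < 3:
        raise ValueError("need at least three pitch samples")

    start, end = pitch_hz[0], pitch_hz[-1]
    lowest = min(pitch_hz)

    # A dip well below both endpoints indicates a fall followed by a rise.
    if start - lowest > flat_tolerance_hz and end - lowest > flat_tolerance_hz:
        return "falling-rising"
    # Otherwise compare only the endpoints of the word.
    if end - start > flat_tolerance_hz:
        return "rising"
    if start - end > flat_tolerance_hz:
        return "falling"
    return "flat"


# Illustrative (made-up) pitch tracks, one per inflection type.
examples = {
    "flat": [200.0, 201.0, 199.0, 200.0],
    "rising": [180.0, 200.0, 220.0, 240.0],
    "falling": [240.0, 220.0, 200.0, 180.0],
    "falling-rising": [210.0, 180.0, 170.0, 200.0],
}
for expected, track in examples.items():
    print(expected, "->", classify_tone_contour(track))
```

Because the sketch compares only the endpoints and the lowest point of the pitch track, it cannot distinguish contours beyond the four described here, such as a tone that rises and then falls.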
Sounds are detected by microphones and converted into electrical signals. However, different microphones have different frequency responses, which means that some microphones are more sensitive and effective at converting sounds to electrical signals at certain sound frequencies, and other microphones are more sensitive and effective at other frequencies. Ideally, a microphone will be sensitive and effective at the frequency of the spoken word; however, there is a wide variety of frequencies used in human speech. As a result, some words are not perfectly recognized, and the resulting speech conversion may be inaccurate.
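The notion of frequency response can be made concrete with a short sketch that compares two microphones at a given sound frequency. The microphone names, the response curves, and the helper functions `sensitivity_db` and `most_sensitive` are hypothetical and the numbers are invented for illustration; they do not describe any real device or the system of the embodiments.

```python
from bisect import bisect_left

# Each curve is a sorted list of (frequency in Hz, sensitivity in dB) points.
# All values are made up for illustration.
MICROPHONES = {
    "mic_low": [(100, -30.0), (500, -32.0), (4000, -50.0)],
    "mic_high": [(100, -50.0), (500, -40.0), (4000, -30.0)],
}


def sensitivity_db(response, freq_hz):
    """Linearly interpolate a response curve at freq_hz.

    Frequencies outside the curve are clamped to the nearest endpoint.
    """
    freqs = [f for f, _ in response]
    if freq_hz <= freqs[0]:
        return response[0][1]
    if freq_hz >= freqs[-1]:
        return response[-1][1]
    i = bisect_left(freqs, freq_hz)
    (f0, s0), (f1, s1) = response[i - 1], response[i]
    return s0 + (s1 - s0) * (freq_hz - f0) / (f1 - f0)


def most_sensitive(microphones, freq_hz):
    """Return the name of the microphone with the highest sensitivity at freq_hz."""
    return max(microphones,
               key=lambda name: sensitivity_db(microphones[name], freq_hz))


print(most_sensitive(MICROPHONES, 200))   # low-frequency sound
print(most_sensitive(MICROPHONES, 3000))  # high-frequency sound
```

With these invented curves, the first microphone converts a 200 Hz sound more effectively and the second converts a 3000 Hz sound more effectively, which is the kind of mismatch between a single microphone and the range of speech frequencies described above.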
Accordingly, it is desirable to provide a speech recognition system having improved accuracy. Embodiments described herein contemplate the use of an array of microphones with a plurality of different frequency responses to improve the quality of the speech conversion. Furthermore, other desirable features and characteristics of the present invention will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and the foregoing technical field and background.