Many user devices, such as smartphones, tablet computers, laptop computers, desktop computers, and home automation devices, can be at least partially operated by voice commands and inquiries. Voice-controlled devices can monitor, record, process, and respond to speech within range of the device; typically, audio input is collected with a microphone or microphone array, and audio output is presented through one or more loudspeakers. Various input and output functions of the device benefit from locating the speech source with respect to the device. A microphone array can use beamforming techniques to focus the signal detection toward the source location. A loudspeaker that best directs the audio output toward the source location can be selected from multiple differently-oriented loudspeakers. A line array or other loudspeaker array can use beam steering techniques to direct the audio output toward the source location. Processing of spoken commands can also depend on the source location; for example, upon receiving a command to “turn on the lights,” the device may determine from the source location which room the person speaking is standing in, and turn on the lights for that room.
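As an illustration of selecting a loudspeaker from multiple differently-oriented loudspeakers, the following is a minimal sketch that picks the loudspeaker whose facing direction is angularly closest to the source azimuth. The function names, the four-loudspeaker layout, and the nearest-angle rule are assumptions made for this example and are not taken from any particular device.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two azimuths, in degrees (0 to 180)."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def select_loudspeaker(source_azimuth_deg, speaker_azimuths_deg):
    """Return the index of the loudspeaker facing closest to the source azimuth."""
    return min(
        range(len(speaker_azimuths_deg)),
        key=lambda i: angular_distance(source_azimuth_deg, speaker_azimuths_deg[i]),
    )


# Hypothetical layout: four loudspeakers facing 0, 90, 180, and 270 degrees.
SPEAKER_AZIMUTHS = [0.0, 90.0, 180.0, 270.0]
```

With this layout, a source located at an azimuth of 350 degrees would be served by the loudspeaker facing 0 degrees, because the angular distance wraps around the circle rather than being computed as a plain numeric difference.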
The speed and accuracy with which a voice-controlled device detects and responds to spoken commands and inquiries can be improved by optimizing the signal processing hardware and device logic to quickly and accurately determine the azimuth of the speech source with respect to the device.
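One common way to estimate the direction of a sound source, shown here only as a hedged sketch, uses the time difference of arrival between two microphones. Under a far-field assumption, the angle from the broadside direction of the microphone pair is the arcsine of the speed of sound times the delay, divided by the microphone spacing. The function name, the parameter names, and the speed-of-sound constant below are assumptions for this example; a single microphone pair resolves only angles between -90 and 90 degrees and cannot distinguish front from back, so a practical device would typically combine several pairs.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 degrees Celsius


def azimuth_from_tdoa(delay_s, mic_spacing_m, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Estimate the source angle in degrees from broadside for a two-microphone pair.

    delay_s is the arrival-time difference between the two microphones, and
    mic_spacing_m is the distance between them. Assumes a far-field source.
    """
    ratio = speed_of_sound * delay_s / mic_spacing_m
    # Clamp to the valid arcsine domain to tolerate measurement noise.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A delay of zero corresponds to a source directly broadside to the pair, and a delay equal to the spacing divided by the speed of sound corresponds to a source in line with the two microphones.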