Different sounds may be produced by different actions and activities. For example, the sound of footsteps may result from a person walking in an area, and a breaking-glass sound may be produced when a window breaks. Different devices and appliances may also produce different sounds. For example, a doorbell may produce a particular alert sound, a smoke detector may produce a particular alarm sound, an oven timer may produce a particular alert sound when the cooking time has elapsed, and a toaster may produce a particular, commonly recognizable sound when toasting is complete. Some devices may also receive voice commands that result in useful actions, such as controlling lights and appliances and providing answers to common questions. These devices are generally standalone devices that have to be located reasonably close to the person giving the command. Positioning a voice-based device for reliable operation may sometimes be challenging and/or inconvenient. Reliably recognizing a particular sound or a voice command in a noisy environment (e.g., during a party) may also be challenging for a standalone device. Thus, a solution that enables more reliable sound-based operations is desirable.