Field of the Embodiments of the Invention
The various embodiments relate generally to audio signal processing and, more specifically, to a crowdsourced database for sound identification.
Description of the Related Art
Recent technological advancements in the consumer electronics industry have increased the portability and affordability of various types of media players, such as computers, mobile phones, and MP3 players. As a result, a growing number of consumers are integrating these types of devices into their daily lives. For example, an individual may use a computer to listen to music at work or use a mobile phone to listen to music or watch a video program during the commute to and from work.
In order to avoid disturbing others, many users listen to media players using a listening device, such as a pair of headphones. However, using headphones may reduce a user's ability to hear sounds in the ambient environment or to communicate with people nearby. Moreover, many headphones provide noise-isolation and/or noise-cancellation functions designed to reduce the degree to which a user can hear ambient sounds. As such, a user may not be able to hear important sounds in the ambient environment, such as vehicle noise, sirens, or the voice of someone who is trying to get the user's attention.
As a result of these issues, various techniques have been developed for detecting sounds in the ambient environment and, in response to detecting a sound, performing a specific action via a pair of headphones or a computing device. For example, some techniques enable sounds within the ambient environment to be selectively blocked by the headphones (e.g., via noise cancellation) or passed to the user, depending on preferences selected by the user. Additionally, some techniques enable audio playback to be paused upon detecting a particular sound in the ambient environment.
Although systems that implement such techniques are able to detect generic sounds within the ambient environment with an acceptable degree of accuracy, these systems are typically less effective at detecting specific types of sounds. For example, although a conventional system may be preprogrammed to recognize generic traffic noise, the system may not accurately identify the sound characteristics of a specific vehicle encountered by a user. Further, such systems cannot reasonably be preprogrammed to detect all of the potential sound types a user may encounter. For instance, a user who is working on a construction site may wish to block the noise produced by a specific type and brand of power tool, but a conventional system cannot reasonably be preprogrammed to identify every type of power tool that the user could encounter.
As the foregoing illustrates, more effective techniques for enabling a user to interact with his or her surroundings while operating a listening device, such as a pair of headphones, would be useful.