Creative works, such as music or other recognizable audio recordings, are often employed in broadcast media, for example as background music in television shows or advertisements. Viewers often find such background audio interesting. Oftentimes, such background audio is presented in small snippets, and by the time a viewer determines that he wants to know more about the audio, it has stopped playing or is otherwise unidentifiable to the viewer. For example, music may be playing in the background of a television show, and the viewer becomes interested in purchasing it but does not know the name of the song. In another example, a user may hear a song in a commercial advertisement that he remembers listening to many years ago, but cannot place the name of the song. In yet another example, an actor may make a statement that the viewer vaguely recollects as being part of a famous quote, poem, speech, book, line from a show or movie, or another literary work, but the viewer does not know the name of such work.
Conventionally, to identify particular audio, a user would employ a music recognition application executable from a portable computing device (e.g., cell phone) to record a snippet of the audio of interest and communicate the snippet to a remote music database server for identification thereof. However, there are several disadvantages to this conventional approach. It may take too much time to access the mobile phone and/or the music application. Consequently, by the time the viewer is ready to record the snippet, the audio may have stopped playing. Also, many viewers (e.g., on the order of hundreds or even thousands) may be attempting to identify audio concurrently. For example, if a song is played during a Super Bowl halftime show, many viewers may attempt to identify the song at the same time. This can result in a very high volume of queries, each including a disparate snippet, that can consume considerable bandwidth and processing resources. Given such volume, delays in providing respective viewers with media identification information can result from the amount of information that must be captured, transmitted, and processed in order to determine an identification. Additionally, respective captured audio snippets may contain substantial noise at varying levels, which can compound processing overhead and/or lead to misidentification.