Digitization of music and other types of audio information has given rise to digital storage libraries that serve as searchable repositories for music files and other audio clips. In some scenarios, a user may wish to search such repositories to locate a high-quality or original version of an audio clip corresponding to a low-quality or second-hand recording. In an example scenario, upon hearing a song of interest being played in a public location (e.g., over a speaker system at a bar or shopping center), a user may record the song using a portable recording device, such as a mobile phone with recording capabilities. Since the resultant recording may include ambient noise as well as the desired song, the user may wish to locate a higher-quality original version of the song. In another example, a user may record a song clip from a radio broadcast and attempt to locate an official release version of the song by searching an online music repository.
Audio matching is a technique for locating a stored audio file corresponding to an audio clip (referred to herein as a probe audio clip) provided by a user. This technique for locating audio files can be particularly useful if the user has no searchable information about the audio file other than the audio clip itself (e.g., if the user is unfamiliar with a recorded song). To determine whether a stored audio file matches a probe audio clip provided by a user, an audio matching system can extract audio characteristics of the probe audio clip and match these extracted characteristics with corresponding characteristics of the stored audio file.
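By way of illustration only, the extract-and-compare approach described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any particular audio matching system: the descriptor used here (the dominant frequency bin of each fixed-length frame) and the names dominant_bin, descriptor, similarity, and tone are hypothetical and chosen only for clarity.

```python
import cmath
import math

def dominant_bin(frame):
    """Return the index of the strongest non-DC DFT bin of one frame."""
    n = len(frame)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2):
        # Naive DFT coefficient for bin k (fine for a short illustrative frame).
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin

def descriptor(samples, frame_size=64):
    """Hypothetical descriptor: the dominant bin of each non-overlapping frame."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    return [dominant_bin(samples[s:s + frame_size]) for s in starts]

def similarity(desc_a, desc_b):
    """Fraction of frame positions at which two descriptors agree."""
    n = min(len(desc_a), len(desc_b))
    if n == 0:
        return 0.0
    return sum(1 for a, b in zip(desc_a, desc_b) if a == b) / n

def tone(cycles_per_frame, num_frames, frame_size=64):
    """Synthetic test signal: a sine with a whole number of cycles per frame."""
    return [math.sin(2 * math.pi * cycles_per_frame * t / frame_size)
            for t in range(num_frames * frame_size)]

# Assumed toy data: a stored file, an untransformed probe, and an unrelated file.
stored = tone(4, 3) + tone(8, 3)
probe = list(stored)
other = tone(6, 6)

match_score = similarity(descriptor(probe), descriptor(stored))
mismatch_score = similarity(descriptor(probe), descriptor(other))
```

In this sketch, a probe whose characteristics are unchanged relative to the stored file yields an identical descriptor, while an unrelated file yields a descriptor that disagrees at every frame. A practical system would likely use a richer descriptor and would also need to align the probe against an arbitrary offset within the stored file.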
However, if the probe audio clip has been subjected to pitch shifting, time stretching, and/or other such transformations, audio matching between the probe audio clip and the corresponding stored audio file may not be reliable or accurate, since the audio characteristics of the transformed probe clip may no longer match those of the stored audio file. For example, a song recorded using a portable recording device in proximity to a speaker may undergo a global volume change depending on the distance of the recording device from the audio source at the time of the recording. Moreover, audio information broadcast over the radio is sometimes subjected to pitch shifting, time stretching, and/or other such audio transformations, and therefore possesses modified audio characteristics relative to the originally recorded information. Such common transformations can reduce the effectiveness of audio matching, since the modified characteristics of the probe audio clip may yield a descriptor that differs from that of the stored audio data.
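The effect of such transformations on extracted characteristics can be illustrated with a small sketch. The two per-frame characteristics used here (zero-crossing count and root-mean-square level), the frame length FRAME, and the crude speed-up by sample skipping (which both raises pitch and shortens duration) are assumptions made purely for illustration and are not drawn from any particular matching system.

```python
import math

FRAME = 64  # hypothetical frame length in samples

def frames(samples):
    """Split a signal into non-overlapping frames of FRAME samples."""
    return [samples[i:i + FRAME]
            for i in range(0, len(samples) - FRAME + 1, FRAME)]

def zero_crossings(samples):
    """Per-frame count of sign changes, a crude pitch-related characteristic."""
    return [sum(1 for a, b in zip(f, f[1:]) if (a < 0) != (b < 0))
            for f in frames(samples)]

def rms(samples):
    """Per-frame root-mean-square level, a crude loudness characteristic."""
    return [math.sqrt(sum(x * x for x in f) / FRAME) for f in frames(samples)]

# Assumed toy signal: a sine with four cycles per frame and a small phase
# offset so that no sample lands exactly on zero.
original = [math.sin(2 * math.pi * 4 * t / FRAME + 0.1)
            for t in range(4 * FRAME)]

# Global volume change: every sample scaled by the same factor.
quiet = [0.5 * x for x in original]

# Crude speed-up by a factor of 1.5 using nearest-sample skipping.
fast = [original[int(i * 1.5)] for i in range(int(len(original) / 1.5))]

volume_changes_rms = rms(quiet) != rms(original)
volume_keeps_crossings = zero_crossings(quiet) == zero_crossings(original)
speed_changes_crossings = zero_crossings(fast)[0] != zero_crossings(original)[0]
```

Under these assumptions, the volume change alters the level-based characteristic but leaves the zero-crossing counts intact, whereas the speed-up alters the zero-crossing counts and shortens the clip. A descriptor built on either characteristic would therefore differ between the transformed probe and the stored original.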
The above is merely intended to provide an overview of some of the challenges facing conventional systems. Other challenges with conventional systems and contrasting benefits of the various non-limiting embodiments described herein may become further apparent upon review of the following description.