A machine may be configured to interact with one or more users by identifying audio (e.g., audio content, such as a song, which may be a cover rendition of an original song or other reference song), for example, in response to a request for identification of the audio. One approach is to configure the machine to perform audio fingerprinting, with the aim of identifying an audio recording from a sample of it, by comparing a query fingerprint of the sample against a reference fingerprint stored in a database and attempting to find a match. Audio fingerprinting typically involves identifying exact renditions of reference audio (e.g., reference songs), often in complex audio scenes. Audio fingerprinting systems are usually designed to be robust to audio degradations (e.g., encoding artifacts, equalization variations, or noise). However, audio fingerprinting systems typically treat cover versions (e.g., a live performance by a different artist) as different songs.
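The query-against-reference comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a fingerprint is a set of integer hashes and that a match is scored by Jaccard similarity; the names `jaccard`, `best_match`, `REFERENCE_DB`, and the threshold value are hypothetical and are not taken from any particular fingerprinting system.

```python
from typing import Dict, Optional, Set, Tuple

# Assumption: a fingerprint is modeled as a set of integer hashes.
Fingerprint = Set[int]


def jaccard(a: Fingerprint, b: Fingerprint) -> float:
    """Return the Jaccard similarity |a & b| / |a | b| of two hash sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def best_match(
    query: Fingerprint,
    references: Dict[str, Fingerprint],
    threshold: float = 0.5,  # hypothetical acceptance threshold
) -> Tuple[Optional[str], float]:
    """Compare a query fingerprint against every reference fingerprint.

    Returns (reference_id, score) for the best-scoring reference, or
    (None, score) when the best score falls below the threshold.
    """
    best_id: Optional[str] = None
    best_score = 0.0
    for ref_id, ref_fp in references.items():
        score = jaccard(query, ref_fp)
        if score > best_score:
            best_id, best_score = ref_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Hypothetical reference database with two recordings.
REFERENCE_DB: Dict[str, Fingerprint] = {
    "song_a": {1, 2, 3, 4, 5, 6},
    "song_b": {10, 11, 12, 13},
}

# A degraded sample of song_a: one hash was corrupted (6 became 99).
noisy_query: Fingerprint = {1, 2, 3, 4, 5, 99}
match_id, match_score = best_match(noisy_query, REFERENCE_DB)
```

In this toy setup the degraded sample still shares five of seven distinct hashes with `song_a`, so it is accepted as a match. A cover version would be expected to share few or no hashes with the reference, which is consistent with fingerprinting systems treating covers as different songs.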
Cover identification systems aim to identify a song when given an alternate rendition of it (e.g., live, remaster, or remix). A cover version generally retains the same melody as an original rendition of the song, but differs from the original rendition in other musical aspects (e.g., instrumentation, key, or tempo).
Cover song identification systems may attempt to identify when two different musical recordings are derived from the same musical composition (e.g., Jimi Hendrix's “All Along the Watchtower” is a cover of the original by Bob Dylan). A cover of a song can differ drastically from the original recording, at least because the cover may change the key, tempo, instrumentation, or musical structure. Automatic identification of cover songs typically involves representing the audio in a manner that is robust to such transformations.
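One way to picture a representation that is robust to a change of key is a 12-bin pitch-class (chroma) profile compared under every circular shift. The following is a minimal sketch of that general idea and is not presented as the method of any particular system; the helper names `cosine`, `rotate`, and `best_transposition` are hypothetical, and the profiles are hand-made rather than extracted from audio.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rotate(profile: Sequence[float], semitones: int) -> List[float]:
    """Circularly shift a pitch-class profile upward by `semitones` bins."""
    n = len(profile)
    k = semitones % n
    if k == 0:
        return list(profile)
    return list(profile[-k:]) + list(profile[:-k])


def best_transposition(
    query: Sequence[float], reference: Sequence[float]
) -> Tuple[int, float]:
    """Try every circular shift of the query and keep the best-scoring one.

    Returns (shift, similarity), where `shift` is the number of bins the
    query was rotated upward to best align with the reference.
    """
    best_shift, best_score = 0, -1.0
    for shift in range(len(query)):
        score = cosine(rotate(query, shift), reference)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score


# Hand-made profile: energy on pitch classes C, E, and G (bins 0, 4, 7).
original = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
# The same pattern transposed up two semitones, as in a cover in another key.
cover = rotate(original, 2)

direct_score = cosine(cover, original)
shift, aligned_score = best_transposition(cover, original)
```

Here the direct comparison finds no overlap between the two profiles, while the shift-searching comparison recovers a near-perfect similarity after rotating the cover by ten bins (equivalently, down two semitones). Tempo and structure changes would need separate handling and are outside the scope of this sketch.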
Cover song identification may share some similarities with live song identification, since a live song may be a cover song. In live song identification, a typical task is to recognize a live song (e.g., at a concert) performed by a performer who may or may not be the original artist. In addition, the live rendition may exhibit key variation, slight tempo variation, musical structure changes (e.g., for an artist who is known to improvise), or any suitable combination thereof. The audio signal may also be degraded (e.g., due to crowd noise or a bad microphone).
One or more audio pieces (e.g., musical pieces or spoken word pieces) may be performed during a live performance. For example, one or more songs may be performed, and a song may be performed with or without visual accompaniment (e.g., a video, a laser show, or a dance routine). In some situations, the performer of an audio piece is an artist who recorded the audio piece (e.g., as a studio recording or as a live recording). For example, a performer may perform a song that she wrote and recorded herself. In other situations, the performer of an audio piece is different from the artist who recorded the audio piece (e.g., as a studio recording or as a live recording). For example, a performer may perform a cover rendition of a song that was written and recorded by someone else.