Field of the Invention
The present invention relates generally to audio signal processing, and more particularly to identification of audio items such as songs within an audio signal.
Description of Related Art
An audio identification system takes as input a short audio segment, typically a few seconds in length, and attempts to match it to a specific recording (e.g., a song or other audio item) in a database of audio items. Internally, the system extracts from the input audio certain feature sequences that are well suited to the audio matching task. These sequences are used to search a database of known audio items for a best match. The system either returns the item that best matches the audio input or determines that no good match exists.
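By way of illustration only, the general flow described above (search a database of feature sequences, then return the best match or report that none exists) may be sketched as follows. The function name `best_match`, the integer feature codes, the position-by-position scoring, and the `min_score` threshold are all hypothetical choices made for this sketch; they are not drawn from any particular identification system, and practical systems use far more robust features and indexing.

```python
from typing import Dict, List, Optional, Tuple


def best_match(
    query: List[int],
    database: Dict[str, List[int]],
    min_score: float = 0.8,  # hypothetical acceptance threshold
) -> Optional[Tuple[str, int, float]]:
    """Return (item_id, offset, score) for the best match, or None.

    Features are modeled as plain integer codes (an assumption of this
    sketch). The query is slid across every stored sequence and scored
    by the fraction of positions that agree exactly.
    """
    if not query:
        return None

    best: Optional[Tuple[str, int, float]] = None
    for item_id, features in database.items():
        # Try every alignment of the query within this item.
        for offset in range(len(features) - len(query) + 1):
            window = features[offset:offset + len(query)]
            hits = sum(1 for q, f in zip(query, window) if q == f)
            score = hits / len(query)
            if best is None or score > best[2]:
                best = (item_id, offset, score)

    # Report a match only if it is good enough; otherwise no match exists.
    if best is not None and best[2] >= min_score:
        return best
    return None


# Toy database of two items, each a short feature sequence.
database = {
    "song_a": [1, 2, 3, 4, 5, 6],
    "song_b": [9, 8, 7, 6, 5, 4],
}
found = best_match([3, 4, 5], database)
missing = best_match([7, 7, 7], database)
```

In this toy example, `found` identifies both the matching item and the position within it, while `missing` is `None` because no alignment reaches the threshold. The exhaustive scan also shows why lookups over a large database can be computationally costly.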
Popular systems, such as those available from SoundHound and Shazam, allow a user to push a button on a smartphone to start capturing an audio segment and have the system automatically identify a recording that matches the captured audio, as well as a position within that recording. The captured audio segment is transmitted over a network to a remote audio identification server. The server attempts to identify the audio item from the segment and transmits audio identification information back to the device.
Audio identification can be resource intensive for a battery-powered, portable device. Both the processing and the transmission performed by the device consume precious battery power. In addition, transmission of large amounts of data during the identification process can be expensive for the user. Finally, the computational load on the servers that perform database lookups is another significant cost factor.
It is therefore useful to provide improved systems and methods for identifying audio items.