The present invention relates to techniques for automatically identifying musical pieces by monitoring the content of an audio signal.
Several techniques have been devised in the past to achieve this goal. Many of the techniques rely on side information extracted, for example, from side-band modulation (in FM broadcast) or depend on inaudible signals (watermarks) having been inserted in the material being played.
A few patents describe techniques that seek to solve the problem by identifying songs without any side information by extracting "fingerprints" from the song itself, see e.g.:
Patent (U.S. Pat. No. 4,230,990) which describes a system that relies on a frequency domain analysis of the signal, but also requires the presence of a predetermined "signaling event", such as a short single-frequency tone, in the audio or video signal.
Patent (U.S. Pat. No. 3,919,479) which describes a system designed to identify commercials in TV broadcasts. The system extracts a low-frequency envelope signal and correlates it with signals in a database.
However, the systems described in these patents suffer significant drawbacks:
The fingerprint matching technique is usually based on a cross-correlation, which is typically a costly process and is impractical when large databases of fingerprints are to be used.
The fingerprint which is extracted from the signal is not very robust to signal alterations such as coding artifacts, distortion, spectral coloration, reverberation and other effects that might have been added to the material.
Accordingly, simple identification techniques that are robust to signal alterations are required.
According to one aspect of the invention, musical pieces (e.g., a given song by a given artist) can be automatically identified by monitoring the content of the audio signal. A typical example is a device that continuously listens to a radio broadcast, and is able to identify the music being played without using any side information or watermarking technique, i.e., the signal being listened to was not preprocessed in any manner (for example, to insert inaudible identifying sequences, as in watermarking).
According to another aspect of the invention, the method comprises the acts of extracting a fingerprint from the first few seconds of audio, and then comparing this fingerprint to those stored in a large database of songs. Because the songs to be identified are drawn from a very large set (several hundred thousand), the fingerprint matching process is extremely simple: it must compare the fingerprint against several hundred thousand stored fingerprints and still yield a reliable result in a short time (less than a second, for example).
According to another aspect of the invention, the identification process is fairly robust to alterations that might be present in the signal, such as audio coding/decoding artifacts, distortion, spectral coloration, reverberation and so on. These alterations might be undesirable, for example resulting from defects in the coding or transmission process, or might have been added on purpose (for example, reverberation or dynamic range compression). In either case, these alterations do not prevent the identification of the musical piece.
According to another aspect of the invention, subband energy signals, having a magnitude in dB, are extracted from overlapping frames of the signal. A difference signal is then generated for each subband. The frequency components of the difference signals of the subbands are used as a fingerprint.
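The extraction steps named in this aspect can be sketched as follows. This is only an illustrative sketch and not the claimed implementation: the frame length, hop size, number of subbands, number of retained coefficients, the equal-width band split, and all function names are assumptions chosen for the example, and a plain DFT stands in for whatever filter bank and transform an actual embodiment would use.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of the non-negative-frequency DFT bins of a real sequence."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def subband_energies_db(signal, frame_len=64, hop=32, n_bands=4):
    """Return one list per subband holding its energy in dB for each frame.

    Frames overlap because hop < frame_len (assumed values).
    """
    bands = [[] for _ in range(n_bands)]
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mags = dft_magnitudes(frame)[1:]          # drop the DC bin
        width = len(mags) // n_bands              # equal-width bands (assumption)
        for b in range(n_bands):
            energy = sum(m * m for m in mags[b * width:(b + 1) * width])
            bands[b].append(10.0 * math.log10(energy + 1e-12))  # floor avoids log(0)
    return bands

def fingerprint(signal, n_coeffs=4, **framing):
    """Concatenate the first n_coeffs DFT magnitudes of each subband's difference signal."""
    fp = []
    for band in subband_energies_db(signal, **framing):
        diff = [b - a for a, b in zip(band, band[1:])]  # frame-to-frame difference
        fp.extend(dft_magnitudes(diff)[:n_coeffs])
    return fp
```

In this sketch, a constant gain applied to the whole signal adds the same offset in dB to every frame of a subband, so the frame-to-frame difference cancels it and the resulting fingerprint is essentially unchanged.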
According to another aspect of the invention, the subband energy signals are smoothed so that the fingerprint will still be useful to identify a signal that has been subsequently altered. For example, the signal may have had reverb effects added.
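As a rough illustration of the smoothing step, the sketch below applies a centered moving average to one subband energy sequence. The choice of a moving average, the window length, and the function name are assumptions made for the example; the text above does not specify which smoothing filter is used.

```python
def smooth(values, window=3):
    """Centered moving average; near the edges only the available samples are averaged."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out
```

Averaging over neighboring frames attenuates rapid frame-to-frame fluctuations in the energy sequence, which is the kind of detail that an added effect is likely to disturb.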
According to another aspect of the invention, the fingerprint is compared to a fingerprint database to identify the audio signal.
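One simple way to picture the comparison is a nearest-neighbor search over stored fingerprints, sketched below. The Euclidean distance, the rejection threshold `max_distance`, the dictionary layout of the database, and the function name are all assumptions for illustration; the text above does not state which distance measure or search structure is used.

```python
import math

def identify(query, database, max_distance=1.0):
    """Return the title whose stored fingerprint is closest to query, or None if none is close enough.

    database maps a title to a fingerprint of the same length as query (assumed layout).
    """
    best_title, best_dist = None, math.inf
    for title, reference in database.items():
        dist = math.dist(query, reference)  # Euclidean distance (assumption)
        if dist < best_dist:
            best_title, best_dist = title, dist
    return best_title if best_dist <= max_distance else None
```

Each comparison here costs one pass over a short vector, in contrast to a cross-correlation, which evaluates the match at many time lags.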
According to another aspect of the invention, local maxima of a selected parameter of the audio signal are located, and a fingerprint monitoring period is positioned near a local maximum.
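The placement of monitoring periods can be pictured with the sketch below, which finds local maxima in a sequence of per-frame parameter values and places a fixed-length window at a fixed offset from each one. The parameter itself, the window length, the offset, and the function names are hypothetical; the text above says only that the period is located near a local maximum.

```python
def local_maxima(values):
    """Indices of interior samples strictly above the previous sample and not below the next."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] >= values[i + 1]]

def monitoring_windows(values, length, offset=0):
    """(start, end) index pairs of windows placed at offset from each local maximum.

    Windows that would fall outside the sequence are skipped.
    """
    windows = []
    for peak in local_maxima(values):
        start = peak + offset
        if start >= 0 and start + length <= len(values):
            windows.append((start, start + length))
    return windows
```

Anchoring the window to a feature of the signal itself means that the stored fingerprint and the fingerprint of a later broadcast are taken from approximately the same portion of the piece, even when the listener starts at an arbitrary time.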
Other features and advantages of the invention will be apparent from the following detailed description and appended drawings.