1. Field of the Invention
The present invention relates to signal recognition, and more specifically to a method for automatically identifying audio content such as a sound recording.
2. Description of Related Art
The development of efficient digital encoding methods for audio (e.g., the Moving Picture Experts Group (MPEG) Audio Layer 3 format, commonly known as MP3), in combination with the advent of the Internet, has opened up the possibility of entirely electronic sale and distribution of recorded music. This is a potential boon to the recording industry. On the downside, these technical advances also abet the illegal distribution of music, which threatens the proprietary interests of recording artists and music distributors. The ease of distributing high-fidelity digital copies that do not degrade over successive generations is a far greater problem for the music industry than the limited copying of music onto audio cassettes that occurred before the advent of digital audio. Presently, there are myriad Internet sites from which a person can obtain bootleg copies of copyrighted music. Thus, for music copyright enforcement, there is a need for a system and method for the automated identification of audio content.
The identification of music from a digital audio file, such as an MP3 file, is not a trivial problem. Different encoding schemes will yield different bit streams for the same song. Even if the same encoding scheme is used to encode the same song (i.e., sound recording) and create two digital audio files, the files will not necessarily match at the bit level. Various effects can cause the bit streams to differ even though the resulting differences in sound, as judged by human perception, are negligible. These effects include subtle differences in the overall frequency response of the recording system, digital-to-analog conversion effects, acoustic environmental effects such as reverb, and slight differences in the recording start time. Further, the bit stream that results from the application of a given encoding scheme will vary depending on the type of audio source. For example, an MP3 file of a song created by encoding the output of a Compact Disc (CD) will not match at the bit level with an MP3 file of the same song created by encoding the output of a stereo receiver.
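The bit-level mismatch described above can be illustrated with a minimal Python sketch. It is not part of the disclosed method. The 440 Hz test tone, the 8000 Hz sample rate, the 1% gain change, the one-sample start offset, and the helper names `tone` and `digest` are all assumptions chosen for illustration. The sketch shows that a cryptographic digest of the raw samples changes completely under alterations that a listener would be unlikely to notice.

```python
import hashlib
import math
import struct

SAMPLE_RATE = 8000  # Hz; illustrative value, not taken from the text


def tone(n_samples, gain=1.0, start_offset=0):
    """Return 16-bit little-endian PCM bytes of a 440 Hz sine tone (hypothetical helper)."""
    samples = []
    for i in range(n_samples):
        t = (i + start_offset) / SAMPLE_RATE
        value = gain * 0.5 * math.sin(2 * math.pi * 440.0 * t)
        samples.append(int(round(value * 32767)))
    return struct.pack("<%dh" % n_samples, *samples)


def digest(pcm_bytes):
    """Return the SHA-256 hex digest of a PCM byte string (hypothetical helper)."""
    return hashlib.sha256(pcm_bytes).hexdigest()


reference = tone(SAMPLE_RATE)                # one second of audio
quieter = tone(SAMPLE_RATE, gain=0.99)       # level lowered by under 0.1 dB
shifted = tone(SAMPLE_RATE, start_offset=1)  # recording starts one sample later

# Identical input reproduces the digest; either small change alters it.
print(digest(reference) == digest(tone(SAMPLE_RATE)))
print(digest(reference) == digest(quieter))
print(digest(reference) == digest(shifted))
```

Because exact comparison of this kind fails under even the mildest of the effects listed above, it cannot serve as a basis for identifying a sound recording.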
One solution that has been proposed is to tag copyrighted music by using digital watermarking. Unfortunately, numerous methods have been discovered for rendering digital watermarks illegible. In addition, there are forms of noise and distortion that are quite audible to humans, but that do not impede our ability to recognize music. FM broadcasts and audio cassettes both have a lower bandwidth than CD recordings, but are still copied and enjoyed by some listeners. Likewise, many of the MP3 files on the Internet are of relatively low quality, but still proliferate and thus pose a threat to the profitability of the music industry. Furthermore, some attempts to evade copyright protection schemes involve the deliberate alteration or distortion of the music. These distortions include time-stretching and time-compressing. In such cases, not only may the start and stop times be different, but the song durations may be different as well. All such differences may be barely noticeable to humans, but can foil many conventional copyright protection schemes.
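The effect of time-stretching on duration can be sketched as follows. This is a simplified illustration under stated assumptions, not the technique used by any particular tool: the hypothetical `stretch` helper resamples by linear interpolation, which also shifts pitch, whereas practical time-stretching typically preserves pitch. The 2% stretch factor and the 8000 Hz sample rate are likewise assumed values.

```python
import math

SAMPLE_RATE = 8000  # Hz; illustrative value, not taken from the text


def stretch(samples, factor):
    """Resample by linear interpolation so the output is `factor` times as long.

    Hypothetical helper: naive resampling, which changes pitch as well as duration.
    """
    n_out = int(round(len(samples) * factor))
    last = len(samples) - 1
    out = []
    for j in range(n_out):
        pos = j / factor              # fractional position in the input
        i = min(int(pos), last)
        frac = pos - int(pos)
        cur = samples[i]
        nxt = samples[min(i + 1, last)]
        out.append(cur + frac * (nxt - cur))
    return out


original = [math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(1000)]
longer = stretch(original, 1.02)   # time-stretched by 2%
shorter = stretch(original, 0.98)  # time-compressed by 2%

for name, signal in (("original", original), ("longer", longer), ("shorter", shorter)):
    print(name, len(signal), "samples,", len(signal) / SAMPLE_RATE, "seconds")
```

Since the altered copies contain a different number of samples, a scheme that compares recordings position by position or relies on total duration no longer lines up with the original.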
Another problem for the music industry and songwriters is the unauthorized use of samples. Samples are short sections of a song that have been clipped and placed into another song. Unless such a sample can be found and identified, the owner of the copyright on the original recording will not be fairly compensated for its use in the derivative work.
There is a need for a method that can identify audio content such as sound recordings despite subtle differences and alterations that arise during processes such as recording, broadcasting, encoding, decoding, transmission, and intentional alteration.