1. Field of the Invention
The present invention relates to the monitoring of audio content, and more specifically to systems and methods for monitoring the audio channel of a video broadcast and automatically generating audio content information.
2. Description of Related Art
Copyrighted audio content such as music is frequently used in video broadcasts. For example, the video broadcast of a typical television show has music being played before and after commercial breaks and in the background during the show. When a copyrighted song is used during such a video broadcast, the copyright holder is entitled to a royalty whose amount is based on the exact amount of time that the song is actually broadcast. However, this amount of time (and thus the royalty due) is often not known until the show is actually broadcast.
Currently, a manual review of each video broadcast is required to determine the songs that are played and the duration of each song so that the royalties due to the copyright holders can be calculated. This manual accounting process is labor-intensive and highly prone to error because it depends on human perception and accuracy. Thus, royalty calculations are performed inefficiently and royalty payments are frequently incorrect. For efficient and accurate royalty calculations, there is a need for a reliable system and method for monitoring the audio channel of a video broadcast and automatically generating audio content information that can be used to make royalty calculations.
One difficulty in developing a practical system for automatically monitoring the audio channel of a video broadcast is providing a mechanism for automatically identifying audio content. One solution that has been proposed is to tag copyrighted music by using digital watermarking technology. Another solution is to identify the audio content itself. However, the identification of music from any source is not a trivial problem. Different encoding schemes will yield a different bit stream for the same song. Even if the same encoding scheme is used to encode the same song (i.e., sound recording) and create two digital audio files, the files will not necessarily match at the bit level.
Various effects can lead to differences in the bit stream even though the resulting sound differences as judged by human perception are negligible. These effects include: subtle differences in the overall frequency response of the recording system, digital-to-analog conversion effects, acoustic environmental effects such as reverb, and slight differences in the recording start time. Further, the bit stream that results from a recording will vary depending on the type of audio source. For example, the bit stream for a song created by encoding the output of one stereo receiver will generally not match the bit stream for the same song created by encoding the output of another stereo receiver.
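The bit-level mismatch described above can be illustrated with a minimal sketch. The example is not part of any described system; it assumes a synthetic 440 Hz tone stored as uncompressed 16-bit PCM, and the names `tone_pcm`, `digest`, and `rms`, the sample rate, and the 1% gain figure are all hypothetical choices made for illustration. It compares a reference recording with one that starts a single sample later and one that is slightly quieter.

```python
import hashlib
import math
import struct

SAMPLE_RATE = 8000  # Hz; hypothetical value chosen to keep the example small


def tone_pcm(freq_hz, n_samples, offset=0, gain=1.0):
    """Return 16-bit little-endian PCM bytes for a sine tone.

    offset shifts the recording start by whole samples; gain scales the level.
    """
    samples = []
    for n in range(n_samples):
        t = (n + offset) / SAMPLE_RATE
        value = gain * 0.5 * math.sin(2 * math.pi * freq_hz * t)
        samples.append(int(round(value * 32767)))
    return struct.pack("<%dh" % n_samples, *samples)


def digest(pcm):
    """Bit-exact identity of a recording: SHA-256 of the raw bytes."""
    return hashlib.sha256(pcm).hexdigest()


def rms(pcm):
    """Root-mean-square level, a crude stand-in for perceived loudness."""
    count = len(pcm) // 2
    samples = struct.unpack("<%dh" % count, pcm)
    return math.sqrt(sum(s * s for s in samples) / count)


reference = tone_pcm(440.0, 8000)
late_start = tone_pcm(440.0, 8000, offset=1)  # starts one sample later
quieter = tone_pcm(440.0, 8000, gain=0.99)    # assumed 1% level difference

for name, pcm in [("late_start", late_start), ("quieter", quieter)]:
    same_bits = digest(pcm) == digest(reference)
    level_change = abs(rms(pcm) - rms(reference)) / rms(reference)
    print(name, "bit-identical:", same_bits, "relative level change:", round(level_change, 4))
```

In this sketch both variants fail a bit-exact comparison even though their overall level is nearly unchanged, which is why a byte-for-byte match is a poor basis for identifying a song. The RMS level is only a rough proxy for human perception and is used here solely to show that the signals remain close.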
In addition, there are forms of noise and distortion that are quite audible to humans, but that do not impede our ability to recognize music. FM broadcasts and audio cassettes both have a lower bandwidth than CD recordings, but are still copied and enjoyed by some listeners. Likewise, many of the MP3 files on the Internet are of relatively low quality, but still proliferate and thus pose a threat to the profitability of the music industry. Furthermore, some intentional evasions of copyright protection schemes involve the intentional alteration or distortion of the music. These distortions include time-stretching and time-compressing. In such cases, not only may the start and stop times be different, but the song durations may be different as well. All such differences may be barely noticeable to humans, but can foil many conventional copyright protection schemes.
There is a need for systems and methods for automatically calculating royalties for audio content such as sound recordings that are part of a video broadcast, such as by automatically and effectively identifying copyrighted audio content and the amount of time that it is actually broadcast.