The use of audio “fingerprints” or “signatures” has been known in the art, and was partly pioneered by such companies as Arbitron for audience measurement research. Audio signatures are typically formed by sampling and converting audio from a time domain to a frequency domain, and then using predetermined features from the frequency domain to form the signature. The frequency-domain audio may then be used to extract a signature therefrom, i.e., data expressing information inherent to an audio signal, for use in identifying the audio signal or obtaining other information concerning the audio signal (such as a source or distribution path thereof). Suitable techniques for extracting signatures include those disclosed in U.S. Pat. No. 5,612,729 to Ellis, et al. and in U.S. Pat. No. 4,739,398 to Thomas, et al., both of which are incorporated herein by reference in their entireties. Still other suitable techniques are the subject of U.S. Pat. No. 2,662,168 to Scherbatskoy, U.S. Pat. No. 3,919,479 to Moon, et al., U.S. Pat. No. 4,697,209 to Kiewit, et al., U.S. Pat. No. 4,677,466 to Lert, et al., U.S. Pat. No. 5,512,933 to Wheatley, et al., U.S. Pat. No. 4,955,070 to Welsh, et al., U.S. Pat. No. 4,918,730 to Schulze, U.S. Pat. No. 4,843,562 to Kenyon, et al., U.S. Pat. No. 4,450,551 to Kenyon, et al., U.S. Pat. No. 4,230,990 to Lert, et al., U.S. Pat. No. 5,594,934 to Lu, et al., European Published Patent Application EP 0887958 to Bichsel, PCT Publication WO/2002/11123 to Wang, et al. and PCT publication WO/2003/091990 to Wang, et al., all of which are incorporated herein by reference in their entireties. The signature extraction may serve to identify and determine media exposure for the user of a device.
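By way of illustration only, the general pipeline described above (sampling, conversion from the time domain to the frequency domain, and forming a signature from predetermined frequency-domain features) may be sketched as follows. This sketch is not the method of any of the patents cited above. The choice of feature (the sign of the energy difference between adjacent frequency bands), the function names, and the parameters `frame_size`, `hop`, and `num_bands` are all assumptions made for the example.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum of one frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def band_energies(mags, num_bands):
    """Sum squared magnitudes into equal-width bands, skipping the DC bin."""
    bins = mags[1:]
    width = len(bins) // num_bands
    return [sum(m * m for m in bins[b * width:(b + 1) * width])
            for b in range(num_bands)]


def extract_signature(samples, frame_size=64, hop=32, num_bands=8):
    """Return one integer per frame.

    Bit b of a frame's value is set when band b holds more energy than
    band b + 1 (an assumed, illustrative "predetermined feature").
    """
    signature = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window to reduce spectral leakage at the frame edges.
        windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1)))
                    for i, s in enumerate(frame)]
        energies = band_energies(dft_magnitudes(windowed), num_bands)
        bits = 0
        for b in range(num_bands - 1):
            if energies[b] > energies[b + 1]:
                bits |= 1 << b
        signature.append(bits)
    return signature


def tone(freq_hz, sample_rate=8000, count=256):
    """Hypothetical test input: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(count)]


sig_low = extract_signature(tone(500))
sig_high = extract_signature(tone(3000))
```

In this sketch, two signatures could be compared frame by frame (for example, by counting differing bits) to decide whether they derive from the same audio. A practical system would use a fast Fourier transform and features chosen for robustness to noise and distortion.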
While audio signatures have proven to be effective at determining exposures to specific media, audio signature systems provide little to no semantic information regarding the media. As used herein below, the terms “semantic,” “semantic information,” “semantic audio signatures,” and “semantic characteristics” refer to information processed from time, frequency and/or amplitude components of media audio, where these components may serve to provide generalized information regarding characteristics of the media, such as genre, instruments used, style, etc., as well as emotionally-related information that may be defined by a customizable vocabulary relating to audio component features (e.g., happy, melancholy, aggressive). This may be distinguished from “audio signatures” that are used to provide specific information that is used for media content identification, media content distributor identification and media content broadcaster identification (e.g., name of program, song, artist, performer, broadcaster, content provider, etc.).
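As a non-limiting illustration of the distinction drawn above, the following sketch derives generalized descriptors from amplitude, time, and frequency components of audio and maps them onto a customizable vocabulary. The particular descriptors (root-mean-square level, zero-crossing rate, spectral centroid), the labels "bright" and "dark", and the threshold values are hypothetical and are chosen only for the example.

```python
import cmath
import math


def rms(samples):
    """Amplitude descriptor: root-mean-square level."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples):
    """Time-domain descriptor: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)


def spectral_centroid(samples, sample_rate):
    """Frequency descriptor: magnitude-weighted mean frequency in Hz (naive DFT)."""
    n = len(samples)
    total = 0.0
    weighted = 0.0
    for k in range(1, n // 2 + 1):
        mag = abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                      for t in range(n)))
        total += mag
        weighted += mag * k * sample_rate / n
    return weighted / total if total else 0.0


def describe(samples, sample_rate, vocabulary):
    """Return (labels, features); vocabulary maps each label to a rule on features."""
    features = {
        "rms": rms(samples),
        "zcr": zero_crossing_rate(samples),
        "centroid_hz": spectral_centroid(samples, sample_rate),
    }
    labels = [label for label, rule in vocabulary.items() if rule(features)]
    return labels, features


# Hypothetical vocabulary; the thresholds are arbitrary example values.
VOCABULARY = {
    "bright": lambda f: f["centroid_hz"] > 2000,
    "dark": lambda f: f["centroid_hz"] < 1000,
}

RATE = 8000
high_tone = [math.sin(2 * math.pi * 3000 * t / RATE) for t in range(128)]
low_tone = [math.sin(2 * math.pi * 250 * t / RATE) for t in range(128)]

high_labels, high_features = describe(high_tone, RATE, VOCABULARY)
low_labels, low_features = describe(low_tone, RATE, VOCABULARY)
```

Unlike a signature, the output here does not identify a particular song or broadcaster. It only places the audio into general categories, and the vocabulary may be replaced without changing the feature extraction.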
Some efforts have been made to semantically classify, characterize, and match music genres, such as those described in U.S. Pat. No. 7,003,515, titled “Consumer Item Matching Method and System,” issued Feb. 21, 2006, which is incorporated by reference herein. However, these efforts often rely on humans to manually characterize music. Importantly, such techniques do not take full advantage of audio signature information together with semantic information when analyzing audio content. Other efforts have been made to automatically label audio content for Music Information Retrieval Systems (MIR), such as those described in U.S. patent application Ser. No. 12/892,843, titled “Automatic labeling and Control of Audio Algorithms by Audio Recognition,” filed Sep. 28, 2010, which is incorporated by reference in its entirety herein. However, such systems can be unduly complex and also do not take full advantage of audio signature technology and semantic processing. As such, there is a need in the art to provide semantic information based on generic templates that may be used to identify semantic characteristics of audio, and to use the semantic characteristics in conjunction with audio signature technology. Additionally, there is a need to identify such characteristics for the purposes of audience measurement. Currently, advertisers target listeners by using radio ratings. These ratings are gathered by using encoding or audio matching systems. As listening and radio move toward a one-to-one experience (e.g., Pandora, Spotify, Songza, etc.), there is a need for advertisers to be able to target listeners by the style of music they listen to, along with other related information. Semantic analysis can identify this information and provide useful tools for targeted advertisement. Furthermore, semantic information may be used to provide supplemental data to matched audio signature data.