There is considerable interest in identifying and/or measuring the receipt of, and/or exposure to, audio data by an audience in order to provide market information to advertisers, media distributors, and the like, to verify airing, to calculate royalties, to detect piracy, and for any other purposes for which an estimation of audience receipt or exposure is desired. Additionally, there is considerable interest in providing content and/or performing actions on devices based on media exposure detection. The emergence of multiple, overlapping media distribution pathways, as well as the wide variety of available user systems (e.g., PCs, PDAs, portable CD players, Internet, appliances, TV, radio, etc.) for receiving audio data and other types of data, has greatly complicated the task of measuring audience receipt of, and exposure to, individual program segments. The development of commercially viable techniques for encoding audio data with program identification data provides a crucial tool for measuring audio data receipt and exposure across multiple media distribution pathways and user systems.
One such technique involves adding an ancillary code to the audio data that uniquely identifies the program signal. Most notable among these techniques is the CBET methodology developed by Arbitron Inc., which is already providing useful audience estimates to numerous media distributors and advertisers. An alternative technique for identifying program signals is the extraction and subsequent pattern matching of “signatures” of the program signals. Such techniques typically involve the use of a reference signature database, which contains a reference signature for each program signal whose receipt and exposure are to be measured. Before the program signal is broadcast, each reference signature is created by measuring the values of certain features of the program signal and forming a feature set, or “signature”, from these values, a process commonly termed “signature extraction”. The resulting signature is then stored in the database. Later, when the program signal is broadcast, signature extraction is again performed, and the signature obtained is compared to the reference signatures in the database until a match is found and the program signal is thereby identified.
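The extract-then-match flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any particular system: the feature set (one bit per frame boundary, set when frame energy rises), the Hamming-distance comparison, and the names `extract_signature`, `identify`, `frame_len`, and `max_distance` are all hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

Signature = Tuple[int, ...]


def extract_signature(samples: Sequence[float], frame_len: int = 4) -> Signature:
    """Hypothetical feature set: one bit per frame boundary,
    1 if the energy of a frame exceeds that of the preceding frame."""
    energies = [
        sum(s * s for s in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    return tuple(int(later > earlier) for earlier, later in zip(energies, energies[1:]))


def hamming(a: Signature, b: Signature) -> int:
    """Number of differing bits; a length mismatch counts as extra differences."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def identify(signature: Signature,
             reference_db: Dict[str, Signature],
             max_distance: int = 0) -> Optional[str]:
    """Compare against every reference signature and return the closest
    program identifier within max_distance, or None if nothing matches."""
    best_id, best_distance = None, max_distance + 1
    for program_id, reference in reference_db.items():
        distance = hamming(signature, reference)
        if distance < best_distance:
            best_id, best_distance = program_id, distance
    return best_id


# Before broadcast: build the reference database (assumed sample data).
program_a = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2]
program_b = [3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0]
reference_db = {
    "program_a": extract_signature(program_a),
    "program_b": extract_signature(program_b),
}

# At broadcast time: extract again and look the signature up.
received_id = identify(extract_signature(program_a), reference_db)
```

In this sketch a received signature is compared against every entry in the database, so the lookup cost grows with the number of reference signatures.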
However, one disadvantage of using such pattern matching techniques is that, because there is no predetermined point in the program signal from which signature extraction is designated to begin, each program signal must continually undergo signature extraction, and each of the many successive signatures extracted from a single program signal must be compared to each of the reference signatures in the database. This, of course, requires a tremendous amount of data processing, which, due to the ever-increasing number of methods and volume of audio data transmission, is becoming more and more economically impractical.
In order to address the problems accompanying continuous extraction and comparison of signatures, which consume excessive computer processing and storage resources, it has been proposed to use a “start code” to trigger a signature extraction.
One such technique, which is disclosed in U.S. Pat. No. 4,230,990 to Lert, et al., proposes the introduction of a brief “cue” or “trigger” code into the audio data. According to Lert, et al., upon detection of this code, a signature is extracted from a portion of the signal preceding or subsequent to the code. This technique entails the use of a code having a short duration to avoid audibility but which contains sufficient information to indicate that the program signal is a signal of the type from which a signature should be extracted. The presence of this code indicates the precise point in the signal at which the signature is to be extracted, which is the same point in the signal from which a corresponding reference signature was extracted prior to broadcast, and thus, a signature need be extracted from the program signal only once. Therefore, only one signature for each program signal must be compared against the reference signatures in the database, thereby greatly reducing the amount of data processing and storage required.
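The saving described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the cited patent: the signal is modeled as a sequence of frames, the cue detector `is_cue` and the feature extractor `extract` are hypothetical callables supplied by the caller, and `window` is an assumed segment length.

```python
from typing import Callable, Optional, Sequence, Tuple


def triggered_extraction(frames: Sequence[object],
                         is_cue: Callable[[object], bool],
                         extract: Callable[[Sequence[object]], tuple],
                         window: int,
                         after: bool = True) -> Optional[Tuple[int, tuple]]:
    """Scan for the first cue frame and extract one signature from the
    `window` frames following it (or preceding it when after=False).
    Returns (cue_index, signature), or None if no usable cue is found."""
    for i, frame in enumerate(frames):
        if is_cue(frame):
            if after:
                segment = frames[i + 1:i + 1 + window]
            else:
                segment = frames[max(0, i - window):i]
            if len(segment) == window:
                return i, extract(segment)
            return None  # cue found, but not enough signal around it
    return None


def comparison_counts(num_frames: int, window: int, db_size: int) -> Tuple[int, int]:
    """Database comparisons needed for continuous (sliding-window)
    extraction versus a single cue-triggered extraction."""
    continuous = max(0, num_frames - window + 1) * db_size
    triggered = db_size
    return continuous, triggered


# Assumed toy stream: the string "CUE" stands in for a detected trigger code.
stream = [5, 7, "CUE", 1, 2, 3, 9]
result = triggered_extraction(stream, lambda f: f == "CUE", tuple, window=3)
counts = comparison_counts(num_frames=1000, window=10, db_size=500)
```

With these assumed figures, continuous extraction requires 495,500 comparisons, while the triggered approach requires only 500, one per reference signature.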
One disadvantage of this technique, however, is that a code that triggers the extraction of a signature from a portion of the signal before or after the encoded portion necessarily limits the amount of information that can be obtained for producing the signature. The encoded portion itself may contain information useful for producing the signature and, moreover, may contain information required to measure the values of certain features, such as changes of certain properties or ratios over time, which might not be accurately measured when a temporal segment of the signal (i.e., the encoded portion) cannot be used.
Another disadvantage of this technique is that, because the trigger code is of short duration, the likelihood of its detection is reduced. One specific disadvantage of such short codes is the diminished probability of detection that may result when a signal is distorted or obscured, as is the case when program signals are broadcast in acoustic environments. In such environments, which often contain significant amounts of noise, the trigger code will often be overwhelmed by noise and thus not be detected. Yet another specific disadvantage of such short codes is the diminished probability of detection that may result when certain portions of a signal are unrecoverable, such as when burst errors occur during transmission or reproduction of encoded audio signals. Burst errors may appear as temporally contiguous segments of signal error. Such errors generally are unpredictable and substantially affect the content of an encoded audio signal. Burst errors typically arise from failure in a transmission channel or reproduction device due to external interferences, such as overlapping of signals from different transmission channels, an occurrence of system power spikes, an interruption in normal operations, an introduction of noise contamination (intentionally or otherwise), and the like. In a transmission system, such circumstances may cause a portion of the transmitted encoded audio signals to be entirely unreceivable or significantly altered. Absent retransmission of the encoded audio signal, the affected portion of the encoded audio may be wholly unrecoverable, while in other instances, alterations to the encoded audio signal may render the embedded information signal undetectable.
In systems for acoustically reproducing audio signals recorded on media, a variety of factors may cause burst errors in the reproduced acoustic signal. Commonly, an irregularity in the recording media, caused by damage, obstruction, or wear, results in certain portions of recorded audio signals being irreproducible or significantly altered upon reproduction. Also, misalignment of, or interference with, the recording or reproducing mechanism relative to the recording medium can cause burst-type errors during an acoustic reproduction of recorded audio signals. Further, the acoustic limitations of a speaker as well as the acoustic characteristics of the listening environment may result in spatial irregularities in the distribution of acoustic energy. Such irregularities may cause burst errors to occur in received acoustic signals, interfering with recovery of the trigger code.
A further disadvantage of this technique is that reproduction of a single, short-lived code that triggers signature extraction does not reflect the receipt of a signal by any audience member who was exposed to part, or even most, of the signal if the audience member was not present at the precise point at which the portion of the signal containing the trigger code was broadcast. Regardless of what point in a signal such a code is placed, it would always be possible for audience members to be exposed to the signal for nearly half of the signal's duration without being exposed to the trigger code.
Yet another disadvantage of this technique is that a single code of short duration that triggers signature extraction does not provide any data reflecting the amount of time for which an audience member was exposed to the audio data. Such data may be desirable for many reasons, such as, for example, to determine the percentage of audience members who listen to the entirety of a particular commercial or to determine the level of exposure to certain portions of commercials broadcast at particular times of interest, such as, for example, the first half of the first commercial broadcast, or the last half of the last commercial broadcast, during a commercial break of a feature program. Still another disadvantage of this technique is that a single code that triggers signature extraction cannot mark “beginning” and “end” portions of a program segment, which may be desired, for example, to determine the time boundaries of the segment.
Accordingly, it is desired to:
(1) provide techniques for gathering data reflecting receipt of and/or exposure to audio data that require minimal processing and storage resources;
(2) provide techniques for gathering data reflecting receipt of and/or exposure to audio data wherein the maximum possible amount of information in the audio data is available for use in creating a signature;
(3) provide techniques for gathering data reflecting receipt of and/or exposure to audio data wherein a start code for triggering the extraction of a signature is easily detected;
(4) provide techniques for gathering data reflecting receipt of and/or exposure to audio data wherein a start code for triggering the extraction of a signature can be detected in noisy environments;
(5) provide techniques for gathering data reflecting receipt of and/or exposure to audio data wherein a start code for triggering the extraction of a signature can be detected when burst errors occur during the broadcast of the audio data;
(6) provide techniques for gathering data reflecting receipt of and/or exposure to audio data wherein a start code for triggering the extraction of a signature can be detected even when an audience member is only present for part of the audio data's broadcast;
(7) provide techniques for gathering data reflecting receipt of and/or exposure to audio data wherein the duration of an audience member's exposure to a program signal can be measured;
(8) provide techniques for gathering data reflecting receipt of and/or exposure to audio data wherein the beginning and end of a program signal can be determined;
(9) provide techniques for using codes and/or signatures to trigger actions on a processing device, such as activating a web link, presenting a digital picture, executing or activating an application (“app”), and so on; and
(10) provide data gathering techniques which are likely to be adaptable to future media distribution paths and user systems which are presently unknown.