The video and/or audio received by video and/or audio receivers are monitored for a variety of reasons. For example, such monitoring has been used to detect when copyrighted video and/or audio has been transmitted so that appropriate royalty calculations can be made. Other examples of the use of such monitoring include determining whether a receiver is authorized to receive the video and/or audio, and determining the sources or identities of video and/or audio.
One approach to monitoring video and/or audio is to add ancillary codes to the video and/or audio at the time of transmission or recording and to detect and decode the ancillary codes at the time of receipt by a receiver or at the time of performance by a player. There are many arrangements for adding an ancillary code to video and/or audio in such a way that the added ancillary code is not noticed when the video is viewed on a monitor and/or when the audio is supplied to speakers. For example, it is well known in television broadcasting to hide such ancillary codes in non-viewable portions of video by inserting them into either the video's vertical blanking interval or horizontal retrace interval. An exemplary system which hides ancillary codes in non-viewable portions of video is referred to as “AMOL” and is taught in U.S. Pat. No. 4,025,851.
Other known video encoding systems have sought to bury the ancillary code in a portion of a video signal's transmission bandwidth that otherwise carries little signal energy. An example of such a system is disclosed by Dougherty in U.S. Pat. No. 5,629,739.
An advantage of adding an ancillary code to audio is that the ancillary code can be detected in connection with radio transmissions and with pre-recorded music as well as in connection with television transmissions. Moreover, ancillary codes that are added to audio signals are reproduced in the audio signal output of a speaker and, therefore, offer the possibility of non-intrusive interception and decoding with equipment that has a microphone as an input. Thus, the reception and/or playing of audio can be monitored by the use of portable metering equipment.
One known audio encoding system is disclosed by Crosby, in U.S. Pat. No. 3,845,391. In this system, an ancillary code is inserted in a narrow frequency “notch” from which the original audio signal is deleted. The notch is made at a fixed predetermined frequency (e.g., 40 Hz). This approach led to ancillary codes that were audible when the original audio signal containing the ancillary code was of low intensity.
A series of improvements followed the Crosby patent. Thus, Howard, in U.S. Pat. No. 4,703,476, teaches the use of two separate notch frequencies for the mark and the space portions of a code signal. Kramer, in U.S. Pat. No. 4,931,871 and in U.S. Pat. No. 4,945,412, teaches, inter alia, using a code signal having an amplitude that tracks the amplitude of the audio signal to which the ancillary code is added.
Microphone-equipped audio monitoring devices that can pick up and store inaudible ancillary codes transmitted in an audio signal are also known. For example, Aijalla et al., in WO 94/11989 and in U.S. Pat. No. 5,579,124, describe an arrangement in which spread spectrum techniques are used to add an ancillary code to an audio signal so that the ancillary code is either not perceptible, or can be heard only as low level “static” noise. Also, Jensen et al., in U.S. Pat. No. 5,450,490, teach an arrangement for adding an ancillary code at a fixed set of frequencies and using one of two masking signals, where the choice of masking signal is made on the basis of a frequency analysis of the audio signal to which the ancillary code is to be added.
Moreover, Preuss et al., in U.S. Pat. No. 5,319,735, teach a multi-band audio encoding arrangement in which a spread spectrum ancillary code is inserted in recorded music at a fixed ratio to the input signal intensity (code-to-music ratio) that is preferably 19 dB. Lee et al., in U.S. Pat. No. 5,687,191, teach an audio coding arrangement suitable for use with digitized audio signals in which the code intensity is made to match the input signal by calculating a signal-to-mask ratio in each of several frequency bands and by then inserting the code at an intensity that is a predetermined ratio of the audio input in that band. As reported in this patent, Lee et al. have also described a method of embedding digital information in a digital waveform in pending U.S. application Ser. No. 08/524,132.
It will be recognized that, because ancillary codes are preferably inserted at low intensities in order to prevent the ancillary code from distracting a listener of program audio, such ancillary codes may be vulnerable to various signal processing operations. For example, although Lee et al. discuss digitized audio signals, it may be noted that many of the earlier known approaches to encoding an audio signal are not compatible with current and proposed digital audio standards, particularly those employing signal compression methods that may reduce the signal's dynamic range (and thereby delete a low level ancillary code) or that otherwise may damage an ancillary code. In many applications, it is particularly important for an ancillary code to survive compression and subsequent de-compression by such algorithms as the AC-3 algorithm or the algorithms recommended in the ISO/IEC 11172 MPEG standard, which is expected to be widely used in future digital television transmission and reception systems.
It must also be recognized that the widespread availability of devices to store and transmit copyright protected digital music and images has forced owners of such copyrighted materials to seek methods to prevent unauthorized copying, transmission, and storage of their material. Unlike the analog domain, where repeated copying of music and video stored on media, such as tapes, results in a degradation of quality, digital representations can be copied without any loss of quality. The main constraints preventing illegal reproduction of copyrighted digital material have been the large storage capacity and transmission bandwidth required for performing these operations. However, data compression algorithms have reduced these requirements to the point where such reproduction of digital material is practical.
A popular compression technology known as MP3 can compress original audio stored as digital files by a factor of ten. When decompressed, the resulting digital audio is virtually indistinguishable from the original. From a single compressed MP3 file, any number of identical digital audio files can be created. Portable devices that can store audio in the form of MP3 files and play these files after decompression are currently available.
In order to protect copyrighted material, digital code inserting techniques have been developed where ancillary codes can be inserted into audio as well as video digital data streams. The ancillary codes are used as digital signatures to uniquely identify a piece of music or an image. As discussed above, many methods for embedding such imperceptible ancillary codes in both audio and video data are currently available. While such ancillary codes provide proof of ownership, there exists a need for the prevention of distribution of illegally reproduced versions of digital music and video.
In an effort to satisfy this need, it has been proposed to use two separate ancillary codes that are periodically embedded in an audio stream. For example, it is suggested that the ancillary codes be embedded in the audio stream at least once every 15 seconds. The first ancillary code is a “robust” ancillary code that is present in the audio even after it has been subjected to fairly severe compression and decompression. A two-channel or stereo digital audio stream in its original form may carry data at a rate of 1.5 megabits/second. A compressed version of this stream may have a data rate of 96 kilobits/second. This reduction in data rate is achieved by means of “lossy compression” algorithms. In this approach, the inability of the human ear to detect the presence of a low power frequency component when there is a neighboring high power frequency component is exploited to modify the number of bits used to represent each spectral value. Yet the audio recovered by decompressing the compressed stream will still carry the robust ancillary code.
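By way of illustration only, the two data rates quoted above imply a compression ratio of roughly 16:1. The following Python sketch shows that arithmetic. The function name is hypothetical, and the sketch assumes that "megabit" and "kilobit" denote decimal units (10^6 and 10^3 bits), which the text does not state.

```python
def compression_ratio(original_bps: float, compressed_bps: float) -> float:
    """Return how many times smaller the compressed data rate is."""
    if compressed_bps <= 0:
        raise ValueError("compressed rate must be positive")
    return original_bps / compressed_bps


# Rates taken from the text; decimal units are an assumption of this sketch.
ORIGINAL_BPS = 1.5e6    # 1.5 megabits/second, original stereo stream
COMPRESSED_BPS = 96e3   # 96 kilobits/second, compressed stream

ratio = compression_ratio(ORIGINAL_BPS, COMPRESSED_BPS)
print(f"compression ratio: {ratio:.3f}:1")
```

Under these assumptions, the compressed stream carries about one sixteenth of the original data rate, which is the degree of compression the robust ancillary code is expected to survive.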
The second ancillary code is a “fragile” ancillary code that is also embedded in the original audio. This second ancillary code is erased during the compression/decompression operation. The robust ancillary code contains a specific bit that, if set, instructs the software in a compliant player to perform a search for the “fragile” ancillary code and, if not set, to allow the music to be played without such a search. If the compliant player is instructed to search for the presence of the fragile ancillary code, and if the fragile ancillary code cannot be detected by the compliant player, the compliant player will not play the music.
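The playback decision described above can be sketched in Python as follows. This is a minimal illustration only: the names `DetectionResult` and `may_play` are hypothetical, and the sketch assumes that the robust ancillary code has already been decoded successfully, since the text does not say how a player should behave when no robust ancillary code is found.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionResult:
    """Outcome of code detection in a compliant player (hypothetical type)."""
    search_bit_set: bool       # bit in the robust code that requests a search
    fragile_code_found: bool   # whether the fragile code was detected


def may_play(result: DetectionResult) -> bool:
    """Apply the two-code rule: search only if instructed, refuse if not found."""
    if not result.search_bit_set:
        # No search requested, so the music is played without further checks.
        return True
    # Search requested: play only if the fragile code survived.
    return result.fragile_code_found


# Original audio: search requested and fragile code still present.
print(may_play(DetectionResult(search_bit_set=True, fragile_code_found=True)))
# Compressed and decompressed audio: fragile code erased.
print(may_play(DetectionResult(search_bit_set=True, fragile_code_found=False)))
```

Because the fragile ancillary code is expected to be erased by compression/decompression, the second case models audio that has passed through such an operation.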
Additional bits in the robust ancillary code also determine whether copies of the music can be made. In all, twelve bits of data constitute an exemplary robust ancillary code and are arranged in a specified bit structure.
A problem with the “fragile” ancillary code is that, precisely because it is fragile, it may be difficult to detect even when there has been no unauthorized compression/decompression. Accordingly, one embodiment of the present invention is directed to a pair of robust ancillary codes useful in detecting unauthorized compression. The first ancillary code consists of twelve bits conforming to the specified bit structure discussed above, and the second ancillary code consists of thirteen bits forming a descriptor that characterizes a part of the audio signal in which the ancillary codes are embedded. In a player designed to detect compression, both of the ancillary codes are extracted irrespective of whether the audio material has been subjected to a compression/decompression operation. The detector in the player independently computes a thirteen-bit descriptor for the received audio and compares this computed thirteen-bit descriptor to the embedded thirteen-bit descriptor. Any difference that exceeds a threshold will generate a screening trigger indicating unauthorized compression. The descriptor used in the proposed method is based on entropy calculations and shows a significant change when any modifications to the original audio are made.
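The comparison of the embedded descriptor with an independently computed descriptor can be sketched in Python as follows. This is an illustration under stated assumptions and not the descriptor of the proposed method, which is not specified in this section: the sketch assumes samples normalized to the range -1.0 to 1.0, uses the Shannon entropy of a 256-bin amplitude histogram as a stand-in entropy measure, and scales the result to thirteen bits. The names `entropy_descriptor`, `screening_trigger`, and `NUM_BINS`, and the threshold value in the usage lines, are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence

DESCRIPTOR_BITS = 13                          # descriptor width, from the text
DESCRIPTOR_MAX = (1 << DESCRIPTOR_BITS) - 1   # largest 13-bit value (8191)
NUM_BINS = 256                                # histogram size (assumption)


def entropy_descriptor(samples: Sequence[float]) -> int:
    """Scale the Shannon entropy of an amplitude histogram to 13 bits.

    Assumes samples lie in [-1.0, 1.0]; this is a stand-in measure only.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Map each sample to a histogram bin index in 0..NUM_BINS-1.
    bins = Counter(
        min(NUM_BINS - 1, max(0, int((s + 1.0) / 2.0 * NUM_BINS)))
        for s in samples
    )
    total = len(samples)
    entropy = -sum((n / total) * math.log2(n / total) for n in bins.values())
    # Normalize by the maximum possible entropy, log2(NUM_BINS).
    return round(entropy / math.log2(NUM_BINS) * DESCRIPTOR_MAX)


def screening_trigger(embedded: int, computed: int, threshold: int) -> bool:
    """Return True when the descriptors differ by more than the threshold."""
    return abs(embedded - computed) > threshold


# Usage: compare a descriptor carried in the code with one computed by the player.
received = [math.sin(0.01 * i) for i in range(4096)]
computed = entropy_descriptor(received)
print(screening_trigger(embedded=computed, computed=computed, threshold=32))
```

A threshold greater than zero allows for small differences caused by ordinary signal handling, so that only a significant change in the descriptor, such as one caused by compression/decompression, generates the screening trigger.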