U.S. Pat. Nos. 7,006,555, 6,968,564, and 6,272,176 and U.S. Patent Publication 2005-0177361, which are hereby incorporated by reference, disclose methods of encoding and decoding inaudible auxiliary data in audio signals. These techniques have been used to encode data in the audio portion of TV programs for broadcast monitoring and audience measurement. In these applications, the inaudible codes must be recoverable from the audio signal despite distortions of the audio signal incurred during the broadcast of the programs. These distortions may include digital-to-analog (D/A) and analog-to-digital (A/D) conversions (and associated sampling operations) as well as lossy compression. While the methods have been developed to enable reasonable recovery of the encoded auxiliary data, they are not sufficiently robust for applications in which the audio signal is subjected to greater distortions, such as repeated sampling operations (e.g., re-sampling occurring in a series of D/A and A/D conversions), time scale changes, speed changes, and successive compression/decompression operations (e.g., transcoding into different compression formats). These additional distortions occur when the program content is captured at a receiver, re-formatted and uploaded to the Internet, as is the case when TV programs are uploaded to web sites. For example, the audio portion of the TV program is captured from an analog output, converted to digital (which includes re-sampling), compressed in a format compatible with the content hosting web site, uploaded, and then transcoded into a format for storage on the content distribution servers of the web site and suitable for streaming in response to requests from the web site visitors.
Such distortions tend to weaken the embedded inaudible code signal, preventing its recovery. Further, they make it more difficult for the decoder to synchronize the reading of the inaudible code. The start codes included with the code signal are often insufficient, or not processed effectively, to enable the decoder to ascertain the location and time scale of the inaudible code signal in the received audio signal.
This document describes methods for making spectral encoding methods more robust. They include decoding methods that address weak signal and/or synchronization issues caused by distortions to the encoded audio signal. They also include improvements to the encoding method and corresponding decoding methods that improve the robustness of the embedded data to distortion.
One aspect of the invention is a device for decoding data embedded in an audio signal, in which the data is embedded by adjusting signal values at frequencies. The device comprises a memory in which blocks of the audio signal are stored. A processor is in communication with the memory to obtain blocks of the audio signal. The processor executes instructions to:
perform an initial synchronization by converting blocks of the audio to Fourier magnitude data, pre-filtering the Fourier magnitude data to produce first pre-filtered blocks, summing the first pre-filtered blocks to produce a first accumulated block, and correlating the first accumulated block with a frequency domain pattern to detect a shift of an embedded code signal; and
perform decoding of variable code data at the detected shift by converting blocks of the audio to Fourier magnitude data at the detected shift, pre-filtering the Fourier magnitude data to produce second pre-filtered blocks, summing the second pre-filtered blocks to produce a second accumulated block, and correlating the second accumulated block with code signals to detect one of the code signals.
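The synchronization stage recited above (Fourier magnitude, pre-filter, accumulate, correlate) can be illustrated with a minimal sketch. It is not the disclosed implementation: the function names, the neighbor-subtraction pre-filter, the representation of the frequency domain pattern as a list of bin indices, and the interpretation of the shift as a frequency-bin offset are all assumptions made for illustration.

```python
import cmath
import math

def dft_magnitude(block):
    """Fourier magnitude of one block (naive O(n^2) DFT, lower half of bins)."""
    n = len(block)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(block)))
        for k in range(n // 2)
    ]

def prefilter(mag):
    """Assumed pre-filter: each bin minus the mean of its two (circular) neighbors."""
    n = len(mag)
    return [mag[k] - 0.5 * (mag[k - 1] + mag[(k + 1) % n]) for k in range(n)]

def accumulate(blocks):
    """Sum the pre-filtered magnitude blocks so a weak code signal builds up."""
    acc = None
    for block in blocks:
        filtered = prefilter(dft_magnitude(block))
        acc = filtered if acc is None else [a + f for a, f in zip(acc, filtered)]
    return acc

def detect_shift(acc, pattern, max_shift):
    """Return the bin shift at which the pattern correlates best with acc."""
    def score(shift):
        return sum(acc[k + shift] for k in pattern if 0 <= k + shift < len(acc))
    return max(range(-max_shift, max_shift + 1), key=score)
```

Under the same assumptions, the decoding stage would reuse `accumulate` on blocks taken at the detected shift and score each candidate code signal in place of the single synchronization pattern.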
One method of embedding data in an audio signal adjusts signal values at frequencies selected from among a set of frequency locations in predetermined frequency bands. This method uses signal characteristics of the audio signal to select a pattern of frequencies from among the set of frequency locations that satisfy desired performance criteria for embedding data. It then embeds the data at the selected pattern of frequencies by adjusting the signal values at the frequencies. The selected pattern of frequencies varies according to the signal characteristics and the desired performance criteria.
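As a rough illustration of signal-dependent frequency selection, the sketch below picks at most one frequency bin per band. The criterion used here (energy in the two neighboring bins, as a crude stand-in for a masking estimate), the threshold `min_energy`, and the function name `select_pattern` are hypothetical; the text does not specify the signal characteristics or performance criteria in this summary.

```python
def select_pattern(magnitudes, bands, min_energy):
    """Pick at most one bin per band using a hypothetical neighbor-energy criterion.

    magnitudes: Fourier magnitudes of the current audio block.
    bands: list of candidate bin lists, one list per predetermined band.
           Candidates must be interior bins (1 <= k <= len(magnitudes) - 2).
    min_energy: assumed threshold below which a band is left unmodified.
    """
    def neighbor_energy(k):
        # Energy next to bin k; more neighbor energy suggests an
        # adjustment at k is less likely to be audible (assumption).
        return magnitudes[k - 1] + magnitudes[k + 1]

    pattern = []
    for candidates in bands:
        best = max(candidates, key=neighbor_energy)
        if neighbor_energy(best) >= min_energy:
            pattern.append(best)
    return pattern
```

Because the choice depends on `magnitudes`, the selected pattern changes from block to block as the audio changes, which is the behavior the paragraph above describes.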
A method of decoding data embedded in a media signal performs an initial approximation of time scale changes of the media signal using at least a portion of the embedded data in a first domain, such as the Fourier magnitude domain. It performs synchronization of the embedded data in a second domain, different from the first domain (e.g., phase), and decodes embedded data after synchronization.
Another decoding method employs a least squares method to detect embedded data at the frequencies, and uses the results of the least squares method to decode the embedded data from the media signal.
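One way a least squares fit can be used to measure a signal component at a given frequency is sketched below. It fits a cosine and a sine at a single frequency to a block of samples by solving the 2x2 normal equations and reports the fitted amplitude. This is a generic least squares tone fit offered as an illustration, not the specific method of the disclosure; the function names and the rule of choosing the candidate with the largest amplitude are assumptions.

```python
import math

def ls_tone_amplitude(block, freq_bin):
    """Least-squares fit of a*cos + b*sin at one frequency; returns the amplitude.

    freq_bin is in cycles per block and may be fractional. It must not be 0
    or len(block)/2, where the sine term vanishes and the system is singular.
    """
    n = len(block)
    c = [math.cos(2 * math.pi * freq_bin * t / n) for t in range(n)]
    s = [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]
    # Entries of the normal equations [[cc, cs], [cs, ss]] [a, b]^T = [xc, xs]^T
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    xc = sum(x * v for x, v in zip(block, c))
    xs = sum(x * v for x, v in zip(block, s))
    det = cc * ss - cs * cs
    a = (xc * ss - xs * cs) / det
    b = (xs * cc - xc * cs) / det
    return math.hypot(a, b)

def strongest_candidate(block, candidate_bins):
    """Assumed decision rule: the candidate frequency with the largest fitted amplitude."""
    return max(candidate_bins, key=lambda k: ls_tone_amplitude(block, k))
```

Unlike a fixed-size DFT, the fit accepts fractional values of `freq_bin`, which may be useful when a time scale change has moved the embedded frequencies off the original bin grid.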
Further features will become apparent with reference to the following detailed description and accompanying drawings.