The present invention relates to a method and apparatus for carrying auxiliary data in a digital signal, such as an audio or video signal, without affecting the perceived quality of the signal. For example, the invention is suitable for use with digital broadcast streams and digital storage media, such as compact discs (CDs) and digital video discs (DVDs).
Schemes for communicating and storing digital data have become increasingly popular, particularly in the mass consumer market for digital audio, video, and other data. Consumers may now send, receive, store, and manipulate digital television, audio and other data content, such as computer games and other software, stock ticker data, weather data and the like. This trend is expected to continue with the integration of telephone, television and computer network resources.
However, in many cases it is desirable to control or monitor the use of such digital data. In particular, copyright holders and other proprietary interests have the right to control the distribution and use of their works, including audio, video and literary works.
Additionally, in many cases it is desirable to provide auxiliary data that provides information on a related digital signal. For example, for a musical audio track, it would be useful to provide data that indicates the name of the artist, title of the track, and so forth. As a further example, it would be useful to provide data for enforcing a rating system for audio/video content.
Other times, the auxiliary data need not be related to the primary data signal in which it is carried.
Furthermore, it would be desirable if the auxiliary data could be embedded into (e.g., carried with) the digital audio, video or other content (termed a "primary data signal") without noticeably degrading the quality of the primary data signal.
Commonly-assigned U.S. Pat. No. 5,822,360, entitled "Method and Apparatus for Transporting Auxiliary Data in Audio Signals", incorporated herein by reference, discloses a scheme for creating a hidden or auxiliary channel in a primary audio, video or other digital signal by exploiting the limits of human auditory or visual perception. With this scheme, a pseudorandom noise carrier is modulated by the auxiliary information to provide a spread spectrum signal carrying the auxiliary information. A carrier portion of the spread spectrum signal is then spectrally shaped to simulate the spectral shape of a primary (e.g., audio) signal. The spread spectrum signal is then combined with the audio signal to produce an output signal carrying the auxiliary information as random noise in the audio signal.
However, it would be desirable to provide auxiliary data in a primary data signal by using the primary data signal itself rather than carrying additional bits in a separate auxiliary data signal.
In particular, it would be desirable to provide a system for embedding a plurality of auxiliary digital information bits into an existing primary digitally encoded signal to form an unobjectionable composite digital signal. The signal should be unobjectionable in that the auxiliary data is imperceptible to the casual listener, viewer, or user, or otherwise provided at a desired threshold level, whether imperceptible or not, in the primary data signal.
The system should alter some of the primary signal's lower order bits to insert the auxiliary, hidden digital data. It would further be desirable for the data to be hidden to be any conceivable digital data, and for the primary signal to be any digitally sampled process.
It would be desirable if the auxiliary digital information bits could be embedded into an existing primary signal at any time, including, for example, when the primary data signal is created (e.g., during a recording session for an audio track), when the primary data signal is being distributed (e.g., during a broadcast, or during manufacture of multiple storage media such as compact discs), and when the primary data signal is being played (e.g., on a player in a consumer's home).
It would also be desirable to manipulate a minimal number of bits in a primary data signal in order to carry the auxiliary data.
It would be desirable to provide approximate spectral shaping of the embedded data.
It would be desirable to provide dynamic and perceptual-based schemes for embedding data.
It would be desirable to provide the capability to embed the data in the compressed or uncompressed domain.
The present invention provides a system having the above and other advantages.
The system, termed "Digital Hidden Data Transport (DHDT)", employs a noise-like information bearing signal, termed an auxiliary data sequence, that comprises auxiliary, hidden digital data. The auxiliary digital data to be combined with the primary signal is a low-level digital signal. Due to its low level, this signal is usually imperceptible to the casual listener, viewer, or user, assuming that the primary signal has a large enough dynamic range. For example, for CD audio, the dynamic range of the primary signal is typically sixteen bits.
However, for high definition applications (such as DVD audio), the noise introduced by indiscriminate manipulation of the least perceptually significant bits (LPSBs) may be objectionable (e.g., perceptible or otherwise above a desired level). Therefore, it may be desirable to minimize the manipulation of the LPSBs. The present invention provides mechanisms for minimizing the manipulation of lower order bits while reliably transporting the hidden data.
The invention is able to exploit human perception by manipulating lower order bits of digital samples of a primary data signal. Manipulation of the lower order bits generally has little or no impact on the perceptual quality of the primary data signal (e.g., audio or video).
A primary signal comprising digital audio is usually formed from successive samples, each having sixteen to twenty-four bits, for example. Assuming the bits are arranged in two's complement notation, the highest order significant bit affects the sound of the samples the most. The next lower bit has less of an effect, and so on. The lowest order bits are less audible (or visible for video and still imagery) and can therefore be manipulated to hide digital information without noticeably degrading the overall quality of the primary data signal.
These low order bits that have negligible impact when they are perturbed are termed least perceptually significant bits (LPSBs). The LPSBs are essentially the least significant bits (LSBs). None, some or all of the bits in each sample of the primary signal can be used as LPSBs. However, in most applications, the number of LPSBs is much less than the number of bits (K) in each sample. For example, for a typical, digitally sampled audio signal with sixteen bits of dynamic range, one or two LPSBs may be used in each sample. The optimum number of LPSBs to use can be determined by experimentation to attain a desired perceptibility level.
Moreover, the number of manipulated LPSBs can vary for each sample.
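By way of illustration only, the replacement of LPSBs in a single two's complement sample might be sketched as follows. The function names `replace_lpsbs` and `extract_lpsbs`, the default sample width of sixteen bits, and the choice of Python are assumptions made for this sketch and are not taken from the disclosure.

```python
def replace_lpsbs(sample: int, payload: int, num_lpsbs: int,
                  bits_per_sample: int = 16) -> int:
    """Overwrite the num_lpsbs lowest-order bits of a signed sample.

    Hypothetical helper: 'sample' is a signed integer in two's complement
    range, 'payload' supplies the hidden bits (only its low num_lpsbs
    bits are used).
    """
    if not 0 <= num_lpsbs <= bits_per_sample:
        raise ValueError("num_lpsbs must be between 0 and bits_per_sample")
    lpsb_mask = (1 << num_lpsbs) - 1
    # View the signed sample as an unsigned K-bit word.
    word = sample & ((1 << bits_per_sample) - 1)
    # Clear the LPSBs, then insert the hidden bits.
    word = (word & ~lpsb_mask) | (payload & lpsb_mask)
    # Convert back to a signed two's complement value.
    if word >= 1 << (bits_per_sample - 1):
        word -= 1 << bits_per_sample
    return word


def extract_lpsbs(sample: int, num_lpsbs: int) -> int:
    """Read back the num_lpsbs lowest-order bits of a sample."""
    return sample & ((1 << num_lpsbs) - 1)
```

With `num_lpsbs` LPSBs replaced, a sample changes by at most 2**num_lpsbs - 1 quantization steps, which is why the number of LPSBs used, possibly different for each sample, governs the perceptibility of the hidden data.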
To securely embed auxiliary data into a primary signal (e.g., in a carrier wave), the least perceptually significant bits are pseudo-randomly modulated. For example, a pseudo-random sequence may be modulated by an auxiliary data bit to provide an auxiliary data sequence that is less likely to be extracted by an unauthorized person (e.g., attacker). Generally, if the attacker does not know the sequence used at the encoder, the attacker will not be able to demodulate the hidden data or restore the primary signal.
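The modulation of a pseudo-random sequence by an auxiliary data bit can be sketched as below. This is a minimal illustration under stated assumptions: Python's seeded `random.Random` stands in for whatever pseudo-random (PN) generator an actual system would use, chips take the values +1 and -1, and a data bit of 1 is mapped to +1 and a data bit of 0 to -1. None of these choices is specified in the text above.

```python
import random


def pn_sequence(seed: int, length: int) -> list[int]:
    """Generate a +/-1 chip sequence from a shared seed (stand-in PN generator)."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(length)]


def modulate(bit: int, pn: list[int]) -> list[int]:
    """Multiply every chip by +1 (bit 1) or -1 (bit 0) to form the auxiliary data sequence."""
    sign = 1 if bit else -1
    return [sign * chip for chip in pn]


def demodulate(sequence: list[int], pn: list[int]) -> int:
    """Correlate against the same PN sequence; the sign of the sum recovers the bit."""
    correlation = sum(s * c for s, c in zip(sequence, pn))
    return 1 if correlation > 0 else 0
```

A decoder that holds the same seed reproduces the same chips and recovers the bit from the sign of the correlation. A party without the seed does not know which chips were inverted.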
A decoder end of the system may have support for self-synchronization. Generally, the decoder's version of the PN sequence will not be correctly aligned in time with the encoder's PN sequence. The correct time alignment is necessary for the decoder to demodulate the data properly. This is analogous to the problem of coherent demodulation in a receiver. Self-synchronization is therefore an important element of the system.
A decoder may be able to synchronize with the received data in some cases, for example, if the decoder knows the frame boundaries. This may occur, e.g., when recovering frames from a DVD or other storage media, where the data is recovered starting at the beginning of a frame. Or, the decoder may be provided with the necessary synchronization information via a separate channel, or by other means. In these cases, a self-synchronization capability in the decoder is not required.
To meet the requirement of self-synchronization, the system embeds a check code, such as a Cyclic Redundancy Check (CRC) code, that allows a decoder to synchronize itself to the modulating sequence. CRC codes are frequently used in communications systems for error control. However, in most systems, CRC codes are used to check if the data was received error-free, not for the purpose of synchronization.
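The use of a check code for synchronization rather than error detection alone can be sketched as follows. The sketch is deliberately simplified and rests on several assumptions not found in the text above: it operates on already-recovered bytes rather than on candidate alignments of the PN sequence, it uses the CRC-32 provided by Python's `zlib` module, and it assumes a hypothetical frame layout of an 8-byte payload followed by a 4-byte CRC.

```python
import zlib

PAYLOAD_LEN = 8   # hypothetical payload size in bytes
CRC_LEN = 4       # CRC-32 occupies four bytes


def make_frame(payload: bytes) -> bytes:
    """Append a CRC-32 to a fixed-size payload."""
    if len(payload) != PAYLOAD_LEN:
        raise ValueError("payload must be exactly PAYLOAD_LEN bytes")
    return payload + zlib.crc32(payload).to_bytes(CRC_LEN, "big")


def find_sync(stream: bytes):
    """Try each alignment in turn and return the first one whose CRC checks.

    Returns None when no alignment yields a valid frame.
    """
    frame_len = PAYLOAD_LEN + CRC_LEN
    for offset in range(len(stream) - frame_len + 1):
        payload = stream[offset:offset + PAYLOAD_LEN]
        received_crc = stream[offset + PAYLOAD_LEN:offset + frame_len]
        if zlib.crc32(payload).to_bytes(CRC_LEN, "big") == received_crc:
            return offset
    return None
```

A misaligned candidate matches its check code only by chance, so a passing check is taken as evidence of correct alignment. A practical decoder would likely require several consecutive frames to pass before declaring synchronization.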
In one embodiment, a method for embedding an auxiliary data bit in a plurality of digital samples includes the steps of: (a) modulating a pseudo-random sequence by the auxiliary data bit to provide a pseudo-randomly modulated auxiliary data sequence, and (b) embedding the auxiliary data sequence in the plurality of samples by modifying at least one least perceptually significant bit (LPSB) of each of the plurality of samples according to the auxiliary data sequence to provide a composite signal with the auxiliary data bit embedded therein. Each sample has a plurality of bits, and a number of the LPSBs to replace in each of the samples is determined according to a desired perceptibility level of the auxiliary data sequence in the composite signal.
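Steps (a) and (b) of this embodiment can be sketched for the simplest case of one LPSB per sample. The sketch assumes that a seeded `random.Random` stands in for the PN generator, that a data bit of 1 leaves each chip unchanged while a data bit of 0 inverts it, and that the decoder is already aligned with the encoder. The function names are hypothetical.

```python
import random


def embed_bit(samples: list[int], bit: int, seed: int) -> list[int]:
    """Replace the lowest-order bit of each sample with the modulated PN chip."""
    rng = random.Random(seed)
    composite = []
    for sample in samples:
        chip = rng.getrandbits(1)             # PN chip as 0 or 1
        lpsb = chip if bit else chip ^ 1      # step (a): modulate by the data bit
        composite.append((sample & ~1) | lpsb)  # step (b): overwrite the LPSB
    return composite


def extract_bit(samples: list[int], seed: int) -> int:
    """Regenerate the PN chips and decide by majority agreement with the LPSBs."""
    rng = random.Random(seed)
    agreements = sum((sample & 1) == rng.getrandbits(1) for sample in samples)
    return 1 if 2 * agreements > len(samples) else 0
```

For example, `extract_bit(embed_bit(samples, 1, seed), seed)` returns 1 for any non-empty list of integer samples, and no sample is altered by more than one quantization step. Using more than one LPSB per sample would raise capacity or robustness at the cost of a higher perceptibility level.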
A corresponding decoding method, and encoding and decoding apparatuses are presented.
In a second embodiment, a method for embedding an auxiliary data bit in a plurality of samples of a digital composite signal includes the steps of: (a.1) multiplying a least perceptually significant bit (LPSB) in each of the plurality of samples by a pseudo-random sequence to provide a corresponding plurality of multiplication values, and (a.2) accumulating the plurality of multiplication values to obtain a correlation value. The correlation value is the correlation of the PN sequence and the LPSBs.
The method includes the further step of (b) comparing the correlation value to a value of the auxiliary data bit to determine a correspondence therebetween. If the comparing step (b) indicates an undesired correspondence, at least one of the LPSBs is toggled to provide the desired correspondence, and the plurality of samples with the at least one toggled LPSB is used to provide a composite signal where the LPSBs, including the at least one toggled LPSB, identify the auxiliary data bit.
If the comparing step (b) indicates a desired correspondence, the plurality of samples is passed through with the associated LPSBs unchanged to provide a composite signal where the unchanged LPSBs identify the auxiliary data bit.
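The correlate, compare and toggle procedure of this embodiment can be sketched as follows. Several details are assumptions made for illustration: each LPSB is mapped to +1 or -1 before multiplication, a positive correlation is taken to represent a data bit of 1 and a negative correlation a data bit of 0, LPSBs are toggled in sample order until the sign is correct, and the PN sequence is no longer than the block of samples. The function names are hypothetical.

```python
import random


def pn_chips(seed: int, length: int) -> list[int]:
    """Stand-in PN generator producing +/-1 chips from a shared seed."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(length)]


def correlate(samples: list[int], pn: list[int]) -> int:
    """Steps (a.1) and (a.2): multiply each LPSB (as +/-1) by its chip and accumulate."""
    return sum((1 if s & 1 else -1) * c for s, c in zip(samples, pn))


def embed_bit(samples: list[int], bit: int, pn: list[int]) -> list[int]:
    """Step (b): toggle only as many LPSBs as needed for the correlation sign to match."""
    target = 1 if bit else -1
    composite = list(samples)
    corr = correlate(composite, pn)
    for i, chip in enumerate(pn):
        if corr * target > 0:
            break                      # desired correspondence: pass through
        contribution = (1 if composite[i] & 1 else -1) * chip
        if contribution != target:
            composite[i] ^= 1          # toggle this LPSB
            corr += 2 * target         # its contribution flips sign
    return composite


def extract_bit(samples: list[int], pn: list[int]) -> int:
    """Decoder: the sign of the correlation identifies the auxiliary data bit."""
    return 1 if correlate(samples, pn) > 0 else 0
```

When the existing LPSBs already correlate with the PN sequence in the desired sense, the samples pass through unchanged. Otherwise only the minimum number of LPSBs needed to reverse the sign is toggled under these assumptions, which keeps the manipulation of the primary signal small.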
A corresponding decoding method, and encoding and decoding apparatuses are presented.
A data signal embodied in a carrier wave is also presented. The data signal includes a primary data signal portion comprising a plurality of samples, and an auxiliary data sequence portion.
In one embodiment, an auxiliary data bit modulates a pseudo-random sequence to provide the auxiliary data sequence portion. The auxiliary data sequence modifies at least one LPSB of each of the plurality of samples. Moreover, a number of LPSBs that is modified in each of the samples is determined according to a desired perceptibility level of the auxiliary data bit in the composite signal.
In another embodiment of the data signal, an LPSB in each of the plurality of samples is multiplied by a pseudo-random sequence to provide a corresponding plurality of multiplication values. The values are accumulated to obtain a correlation value, and the correlation value is compared to a value of the auxiliary data bit to determine a correspondence therebetween. At least one of the LPSBs is toggled to provide the desired correspondence.