This invention relates to protecting audio content by using watermarks. More particularly, this invention relates to improved techniques for detecting watermarks in an audio signal.
Since the earliest days of human civilization, music has existed at the crossroads of creativity and technology. The urge to organize sound has been a constant part of human nature while the tools to make and capture the resulting music have evolved in parallel with human mastery of science.
The ability to store and transmit audio (such as music) has evolved quickly since the earliest recordings, just 130 years ago. From Edison's foil cylinders to contemporary technologies (such as DVD-Audio, MP3, and the Internet), the constant evolution of prerecorded audio delivery has presented both opportunity and challenge.
Music is the world's universal form of communication, touching every person of every culture on the globe. Behind the music is a growing multi-billion dollar per year industry. This industry, however, is constantly plagued by lost revenues due to music piracy.
Protecting Rights
Piracy is not a new problem. However, as technologies change and improve, there are new challenges to protecting music content from illicit copying and theft. For instance, more producers are beginning to use the Internet to distribute music content. In this form of distribution, the content merely exists as a bit stream which, if left unprotected, can be easily copied and reproduced.
At the end of 1997, the International Federation of the Phonographic Industry (IFPI), the British Phonographic Industry, and the Recording Industry Association of America (RIAA) engaged in a project to survey the extent of unauthorized use of music on the Internet. The initial search indicated that at any one time there could be up to 80,000 infringing MP3 files on the Internet. The actual number of servers on the Internet hosting infringing files was estimated at 2,000, with locations in over 30 countries around the world. Since that survey, the availability of and interest in digital music on the Internet has increased many times over.
Each day, the wall impeding the reproduction and distribution of infringing digital audio clips (e.g., music files) gets shorter and weaker. "Napster" is an example of an application that is weakening the wall of protection. It gives individuals access to one another's MP3 files by creating a unique file-sharing system via the Internet. Thus, it encourages illegal distribution of copies of copyrighted material.
As a result, these modern digital pirates effectively rob artists and authors of their lawful compensation. Unless technology provides a way for those who create music to be compensated for it, both the creative community and the musical culture at large will be impoverished.
Identifying a Copyrighted Work
Unlike tape cassettes and CDs, a digital music file has no jewel case, label, sticker, or the like on which to place the copyright notification and the identification of the author. A digital music file is a set of binary data without a detectable and unmodifiable label.
Thus, musical artists and authors are unable to inform the public that a work is protected by affixing a copyright notice to the digital music file. Furthermore, such artists and authors are unable to inform the public of any additional information, such as the identity of the copyright holder or the terms of a limited license.
Digital Tags
The music industry and trade groups are especially concerned by digital recording because there is no generation loss in digital transfers: a copy sounds the same as the original. Without limits on unauthorized copying, a digital audio recording format could easily encourage the pirating of master-quality recordings.
One solution is to append to each audio file an associated digital "tag" that identifies the copyright holder. To implement such a plan, all devices capable of such digital reproduction must faithfully reproduce the appended, associated tag.
With the passage of the Audio Home Recording Act of 1992, the inclusion of serial copy management technology, such as SCMS (Serial Copy Management System), in consumer digital recorders became law in the United States. SCMS recognizes a "copyright flag" encoded on a prerecorded original (such as a CD), and writes that flag into the subcode of digital copies (such as a transfer from a CD to a DAT tape). The presence of the flag prevents an SCMS-equipped recorder from digitally copying the copy, thus breaking the chain of perfect digital cloning.
However, subsequent developments, both technical and legal, have demonstrated the limited benefits of this legislation. While digital-secure-music-delivery systems (such as SCMS) are designed to support the rights of content owners in the digital domain, the problem of analog copying requires a different approach. In the digital domain, information about the copy status of a given piece of music may be carried in the subcode, which is separate information that travels along with the audio data. In the analog domain, there is no subcode; the only place to put the extra information is to hide it within the audio signal itself.
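The copy rule described above can be sketched in a few lines. This is a deliberately simplified model, not the actual SCMS bit layout: the `Recording` fields, the `CopyRefused` exception, and the `scms_copy` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


class CopyRefused(Exception):
    """Raised when the simplified recorder declines to make a digital copy."""


@dataclass(frozen=True)
class Recording:
    audio: bytes
    copyright_flag: bool   # hypothetical field: the "copyright flag" in the subcode
    is_copy: bool = False  # hypothetical field: set once the recording is a digital copy


def scms_copy(source: Recording) -> Recording:
    """Digitally copy `source` the way a simplified SCMS-equipped recorder might."""
    if source.copyright_flag and source.is_copy:
        # A flagged copy may not be cloned again: the chain stops here.
        raise CopyRefused("source is already a copy of flagged material")
    # The audio is cloned bit for bit; the flag is written into the copy's subcode.
    return Recording(source.audio, source.copyright_flag, is_copy=True)
```

In this model a flagged CD can be transferred once to DAT, but calling `scms_copy` on the resulting DAT recording raises `CopyRefused`. Unflagged material can be copied any number of times.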
Digital Watermarks
Techniques for identifying copyright information of digital audio content that address both analog and digital copying instances have received a great deal of attention in both the industrial community and the academic environment. One of the most promising "digital labeling" techniques is the embedding of a digital watermark into the audio signal itself by altering the signal's frequency spectrum such that the perceptual characteristics of the original recording are preserved. In other words, a watermark is clandestinely integrated with an audio clip so that when copied, the watermark will be reproduced along with the clip itself.
In general, a "digital watermark" is a pattern of bits inserted into a digital representation (i.e., signal or file) of content (e.g., an image, audio, video, or the like) that identifies the content's copyright information (e.g., author, rights, etc.). The name comes from the faintly visible watermarks imprinted on stationery that identify the manufacturer of the stationery. The purpose of digital watermarks is to provide copyright protection for intellectual property that is in digital format.
Unlike printed watermarks, which are intended to be somewhat visible, digital watermarks are designed to be completely invisible, or in the case of audio clips, inaudible. That is, they are imperceptible to all except a specifically designed watermark detector. Moreover, the actual bits representing the watermark are typically scattered throughout the file in such a way that they cannot be identified and manipulated. Finally, the digital watermark should be robust enough so that it can withstand normal changes to the file, such as reductions from lossy compression algorithms.
Satisfying all these requirements is no easy feat, but there are several competing technologies. All of them work by making the watermark appear as noise, that is, random data that exists in most digital files anyway. To view a watermark, you need a special program or device (i.e., a "detector") that knows how to extract the watermark data.
Herein, such a digital watermark may be simply called a "watermark." Generically, it may be called an "information pattern of discrete values" or a "data pattern of discrete values." The audio signal (or clip) in which a watermark is encoded is effectively "noise" in relation to the watermark.
Watermarking
Watermarking gives content owners a way to self-identify each track of music, thus providing proof of ownership and a way to track public performances of music for purposes of royalty distribution. It may also convey instructions, which can be used by a recording or playback device, to determine whether and how the music may be distributed. Because that data can be read even after the music has been converted from a digital to an analog signal, watermarking can be a powerful tool to defeat analog circumvention of copy protection.
The general concept of watermarking has been around for at least 30 years. It was used by companies (such as Muzak™) to audibly identify music delivered through their systems. Today, however, the emphasis in watermarking is on inaudible approaches. By varying signals embedded in analog audio programs, it is possible to create patterns that may be recognized by consumer electronics devices or audio circuitry in computers.
For general use in the record industry today, watermarking must be completely inaudible under all conditions. This guarantees the artistic integrity of the music. Moreover, it must be robust enough to survive all forms of attacks. To be effective, watermarks must endure processing, format conversion, and encode/detect cycles that today's music may encounter in a distribution environment that includes radio, the Web, music cassettes, and other non-linear media. In addition, it must endure malevolent attacks by digital pirates.
Watermark Encoding
Typically, existing techniques for encoding a watermark within discrete audio signals exploit the insensitivity of the human auditory system (HAS) to certain audio phenomena. It has been demonstrated that, in the temporal domain, the HAS is insensitive to small signal level changes and to peaks in the pre-echo and the decaying echo spectrum.
The techniques developed to exploit the first phenomenon are typically not resilient to de-synch attacks. Due to the difficulty of the echo cancellation problem, techniques that employ multiple decaying echoes to place a peak in the signal's cepstrum can hardly be attacked in real-time, but can be attacked fairly easily using an off-line exhaustive search. (The term "cepstrum" is the accepted terminology for the Fourier transform of the logarithm of the power spectrum of a signal.)
Watermarking techniques that embed secret data in the frequency domain of a signal exploit the insensitivity of the HAS to small magnitude and phase changes. In both cases, a publisher's secret key is encoded as a pseudo-random sequence that is used to guide the modification of each magnitude or phase component of the frequency domain. The modifications are performed either directly or shaped according to the signal's envelope.
In addition, watermarking schemes have been developed that exploit the advantages, but also suffer from the disadvantages, of hiding data in both the time and frequency domains. It has not been demonstrated whether spread-spectrum watermarking schemes would survive combinations of common attacks: de-synchronization in both the temporal and frequency domain and mosaic-like attacks.
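To make the parenthetical definition concrete, the following minimal sketch computes a cepstrum exactly as worded above (a Fourier transform of the log power spectrum) and shows that a single echo produces a peak at the echo delay. It assumes a circular echo and a naive DFT to stay dependency-free; the function names `dft`, `cepstrum`, `add_echo`, and `echo_delay` are illustrative and not taken from any described implementation.

```python
import cmath
import math


def dft(values):
    """Naive O(N^2) discrete Fourier transform (adequate for short examples)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def cepstrum(signal):
    """Magnitude of the Fourier transform of the logarithm of the power spectrum."""
    power = [abs(c) ** 2 for c in dft(signal)]
    log_power = [math.log(p + 1e-12) for p in power]  # small offset guards against log(0)
    return [abs(c) for c in dft(log_power)]


def add_echo(signal, delay, gain):
    """Add one circularly delayed, attenuated copy of the signal to itself."""
    n = len(signal)
    return [signal[i] + gain * signal[(i - delay) % n] for i in range(n)]


def echo_delay(signal):
    """Index of the largest cepstral peak, skipping the zero-delay bin."""
    c = cepstrum(signal)
    return max(range(1, len(c) // 2 + 1), key=lambda i: c[i])
```

For a 64-sample click (`[1.0] + [0.0] * 63`) given an echo with `delay=8` and `gain=0.5`, `echo_delay` returns 8. With real music as the host, the peak is less clean, which is why practical schemes use multiple decaying echoes.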
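The direct (unshaped) variant of this embedding can be sketched as follows. It assumes magnitudes are expressed in decibels and that each component is nudged by a fixed amount; the 1 dB default, the key string, and the names `chips_from_key` and `embed` are hypothetical choices made for illustration only.

```python
import random


def chips_from_key(secret_key, length):
    """Expand a secret key into a pseudo-random sequence of +1/-1 values."""
    rng = random.Random(secret_key)  # illustration only: not a cryptographic generator
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(magnitudes_db, chips, delta_db=1.0):
    """Raise or lower each magnitude component (in dB) by delta_db, as the chips direct."""
    return [m + delta_db * c for m, c in zip(magnitudes_db, chips)]
```

The same key always yields the same sequence, so a detector holding the key can regenerate the pattern it needs to look for. An envelope-shaped variant would scale `delta_db` per component instead of using one fixed value.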
Watermark Detection
The watermark detection process is performed by synchronously correlating the suspected audio clip with the watermark of the content publisher. A common pitfall for all watermarking systems that employ this type of data hiding is intolerance to desynchronization attacks (e.g., sample cropping, insertion, repetition, variable pitch-scale and time-scale modifications, audio restoration, and arbitrary combinations of these attacks) and the lack of adequate techniques to address this problem during the detection process.
Furthermore, it is desirable to have a highly accurate, quick, and efficient watermark detection system. When detecting a watermark, the content of the clip (e.g., music) is merely noise in relation to the watermark. Therefore, this "noise" hinders such accurate, quick, and efficient watermark detection. However, of course, the watermark's purpose is to protect this "noise."
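A minimal sketch of synchronous correlation detection, and of its intolerance to de-synchronization, is given below. It works on time-domain samples with Gaussian noise standing in for the music, and the embedding strength is exaggerated so that a short clip suffices; all names, the seed, and the half-strength decision threshold are assumptions made for illustration.

```python
import random


def correlate(clip, watermark):
    """Average of the sample-by-sample product of a clip and a watermark."""
    return sum(c * w for c, w in zip(clip, watermark)) / len(watermark)


def detect(clip, watermark, strength):
    """Declare the watermark present when the correlation exceeds half the strength."""
    return correlate(clip, watermark) > strength / 2


rng = random.Random(2024)  # fixed seed so the demonstration is repeatable
N, STRENGTH = 4096, 0.5    # strength is exaggerated to keep the example short

host = [rng.gauss(0.0, 1.0) for _ in range(N)]           # stand-in for the music
watermark = [rng.choice((-1.0, 1.0)) for _ in range(N)]  # publisher's pseudo-random sequence
marked = [h + STRENGTH * w for h, w in zip(host, watermark)]
shifted = marked[1:] + marked[:1]                         # one-sample de-synchronization
```

Here `detect(marked, watermark, STRENGTH)` is true, while the unmarked `host` and the `shifted` clip are both rejected: a shift of a single sample is enough to destroy the alignment the correlation relies on.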
Moreover, the mere act of accurately detecting a watermark in a signal may aid a digital pirate in empirically ascertaining the watermark. Conventionally, this risk is considered small and too difficult to address; therefore, the industry lives with this risk.
Desiderata of Watermarking Technology
Watermarking technology has several highly desirable goals (i.e., desiderata) to facilitate protection of copyrights of audio content publishers. Several such goals are listed below.
Perceptual Invisibility. The embedded information should not induce audible changes in the audio quality of the resulting watermarked signal. The test of perceptual invisibility is often called the "golden ears" test.
Statistical Invisibility. The embedded information should be quantitatively imperceptible to any exhaustive, heuristic, or probabilistic attempt to detect or remove the watermark. The complexity of successfully launching such attacks should be well beyond the computation power of publicly available computer systems.
Tamperproofness. An attempt to remove the watermark should damage the value of the music well above the hearing threshold.
Cost. The system should be inexpensive to license and implement on both programmable and application-specific platforms.
Non-disclosure of the Original. The watermarking and detection protocols should be such that the process of proving audio content copyright, both in-situ and in-court, does not involve usage of the original recording.
Enforceability and Flexibility. The watermarking technique should provide strong and undeniable copyright proof. Similarly, it should enable a spectrum of protection levels, which correspond to variable audio presentation and compression standards.
Resilience to Common Attacks. The public availability of powerful digital sound editing tools requires that the watermarking and detection process be resilient to attacks spawned from such tools. The standard set of plausible attacks is itemized in the Request for Proposals (RFP) of IFPI (International Federation of the Phonographic Industry) and RIAA (Recording Industry Association of America). The RFP encapsulates the following security requirements:
two successive D/A and A/D conversions,
data reduction coding techniques such as MP3,
adaptive transform coding (ATRAC),
adaptive subband coding,
Digital Audio Broadcasting (DAB),
Dolby AC2 and AC3 systems,
applying additive or multiplicative noise,
applying a second Embedded Signal, using the same system, to a single program fragment,
frequency response distortion corresponding to normal analogue frequency response controls such as bass, mid and treble controls, with maximum variation of 15 dB with respect to the original signal, and
applying frequency notches with possible frequency hopping.
Watermark Circumvention
If the encoding of a watermark can thwart a malicious attack, then it can also withstand the unintentional introduction of noise. Therefore, any advancement in watermark technology that makes it more difficult for a malevolent attacker to assail the watermark also makes it more difficult for a watermark to be altered unintentionally.
In general, there are two common classes of malevolent attacks:
1. De-synchronization of watermark in digital audio signals. These attacks alter audio signals in such a way as to make it difficult for the detector to identify the location of the encoded watermark codes.
2. Removing or altering the watermark. The attacker discovers the location of the watermark and intentionally alters the audio clip to remove or degrade part or all of the watermark.
Framework to Thwart Attacks
Accordingly, there is a need for a framework of protocols for hiding watermarks in digital audio signals that is effective against malevolent attacks. The framework should also be flexible enough to enable a spectrum of protection levels, which correspond to variable audio presentation and compression standards, and yet resilient to common attacks spawned by powerful digital sound editing tools.
At the same time, such a framework should support quick, efficient, and accurate detection of watermarks by a specifically designed watermark detector. Moreover, it is desirable for such a framework to minimize false indications of a watermark's presence or absence. Furthermore, it is best if the act of detection does not provide decipherable clues to a digital pirate as to the value or location of the embedded watermark.
Described herein is an audio watermarking technology for detecting watermarks in audio signals, such as a music clip. The watermark identifies the content producer, providing a signature that is embedded in the audio signal and cannot be removed. The watermark is designed to survive all typical kinds of processing and all types of malicious attacks that attempt to remove or modify the watermark from the signal. The implementations of the watermark detecting system, described herein, support quick, efficient, and accurate detection of watermarks by the specifically designed watermark detecting system.
In one described implementation, a watermark detecting system employs a cardinality-scaled correlation (CSC) test to determine the presence of a watermark using less expensive materials (hardware), quicker calculations, and a more accurate test (than the original correlation test).
In other described implementations, a watermark detecting system employs a cepstrum filter and dynamic processing to minimize the effect of the "noise" in the watermarked signal. The "noise" is the original content of the signal before such signal was watermarked.
In still another described implementation, a watermark detecting system employs a randomized detection threshold so that the act of watermark detection does not provide decipherable clues to a digital pirate as to the value or location of the embedded watermark.
This summary itself is not intended to limit the scope of this patent. Moreover, the title of this patent is not intended to limit the scope of this patent. For a better understanding of the present invention, please see the following detailed description and appended claims, taken in conjunction with the accompanying drawings. The scope of the present invention is pointed out in the appended claims.
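One way to picture a random detection threshold is sketched below. This is only a generic illustration of the idea, assuming the threshold is drawn uniformly from an interval on every detection; the interval bounds, the score scale, and the name `detect_randomized` are hypothetical and are not taken from the described implementations.

```python
import random


def detect_randomized(score, low, high, rng=random):
    """Compare a detection score against a threshold drawn afresh from [low, high]."""
    threshold = rng.uniform(low, high)
    return score > threshold
```

Scores well above `high` are always reported as detections and scores well below `low` never are, but a score inside the interval gives inconsistent answers from one run to the next. An attacker who probes the detector with slightly altered clips therefore cannot locate a sharp decision boundary. A deployed system would presumably draw the threshold from a non-predictable source such as `random.SystemRandom`.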