The present invention relates to the field of digital audio signal processing, and in particular to techniques for watermarking a digital audio signal.
The recent growth of networked multimedia systems has significantly increased the need for the protection of digital media. This is particularly important for the protection and enhancement of intellectual property rights. Digital media includes text, software, and digital audio, video and images. The ubiquity of digital media available via the Internet and digital library applications has increased the need for new techniques of digital copyright protection and new measures in data security. Digital watermarking is a developing technology that attempts to address these growing concerns. It has become an area of active research in multimedia technology.
A digital watermark is an invisible structure that is embedded in a host media signal. Watermarking, or data hiding, therefore refers to techniques for embedding such a structure in digital data. Among data hiding applications, it embeds the least amount of data, yet it requires the greatest robustness. To be effective, a watermark should be inaudible or invisible within its host signal. Further, it should be difficult or impossible for an unauthorised party to remove, yet be easily extracted by the owner or an authorised person. Finally, it should be robust to incidental and/or intentional distortions, including various types of signal processing and geometric transformation operations.
Many watermarking techniques have been proposed for text, images and video. They mainly focus on the invisibility of the watermark and its robustness against various signal manipulations and hostile attacks. These techniques can be grouped into two categories: spatial domain methods and frequency domain methods.
In relation to text, image and video data, there is a current trend towards approaches that make use of information about the human visual system (HVS) in an attempt to produce a more robust watermark. Such techniques use explicit information about the HVS to exploit the limited dynamic range of the human eye.
Compared with the development of digital video and image watermarking techniques, watermarking digital audio presents special challenges. The human auditory system (HAS) is significantly more sensitive than the HVS. In particular, the HAS is sensitive to a dynamic range for amplitude of one billion to one and for frequency of one thousand to one. Sensitivity to additive random noise is also acute. Perturbations in a sound file can be detected at levels as low as one part in ten million (80 dB below ambient level).
Generally, the limit of perceptible noise increases as the noise content of a host audio signal increases. Even so, the typical allowable noise level remains very low.
Therefore, there is clearly a need for a system of watermarking digital audio data that is inaudible and robust at the same time.
In accordance with a first aspect of the invention, there is disclosed a method of embedding a watermark in a digital audio signal. The method includes the step of: embedding at least one echo dependent upon the watermark in a portion of the digital audio signal, predefined characteristics of the at least one echo being dependent upon time and/or frequency domain characteristics of the portion of the digital audio signal to provide a substantially inaudible and robust embedded watermark in the digital audio signal.
Preferably, the method includes the step of digesting the digital audio signal to provide a watermark key, the watermark being dependent upon the watermark key. It may also include the step of encrypting predetermined information using the watermark key to form the watermark.
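The digesting and encrypting steps can be illustrated with a minimal sketch. The text does not name a digest or a cipher, so the choices here are assumptions made purely for illustration: SHA-256 over 16-bit PCM samples stands in for the audio digest, and a hash-derived XOR keystream stands in for the encryption. The function names `audio_digest`, `keystream`, `encrypt_watermark` and `to_bits` are hypothetical, and the XOR construction is not offered as a secure cipher.

```python
import hashlib
import struct


def audio_digest(samples):
    """Hash 16-bit PCM samples into a 32-byte watermark key.

    SHA-256 is an assumption; the text only says the signal is digested.
    """
    packed = struct.pack("<%dh" % len(samples), *samples)
    return hashlib.sha256(packed).digest()


def keystream(key, length):
    """Expand the key into `length` bytes by hashing key + counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + struct.pack("<I", counter)).digest())
        counter += 1
    return bytes(out[:length])


def encrypt_watermark(info, key):
    """XOR the information with the keystream (illustrative stand-in cipher).

    Applying the function twice with the same key restores the input.
    """
    return bytes(b ^ k for b, k in zip(info, keystream(key, len(info))))


def to_bits(data):
    """Unpack bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
```

Because the key is derived from the audio itself, the same information yields a different watermark for each host signal, which is one plausible reading of "the watermark being dependent upon the watermark key".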
Preferably, the method includes the step of generating the at least one echo to have a delay and an amplitude relative to the digital audio signal that is substantially inaudible. The values of the delay and the amplitude are programmable.
Two or more echoes having different delays and/or amplitudes can be programmably sequenced. Two portions of the digital audio signal can be embedded with different echoes dependent upon the time and/or frequency characteristics of the digital audio signal.
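By way of illustration only, embedding an echo with a programmable delay and amplitude can be sketched as adding a scaled, delayed copy of a portion of the signal to itself. The helper names `add_echo` and `embed_bits`, the fixed frame length, the one-bit-per-frame layout and the default delay and amplitude values are all assumptions of this sketch rather than values taken from the text.

```python
def add_echo(frame, delay, amplitude):
    """Return frame[n] + amplitude * frame[n - delay].

    Samples before the start of the frame are treated as zero.
    """
    out = []
    for n, sample in enumerate(frame):
        delayed = frame[n - delay] if n >= delay else 0.0
        out.append(sample + amplitude * delayed)
    return out


def embed_bits(signal, bits, frame_len, delays=(50, 100), amplitude=0.2):
    """Embed one bit per frame: bit 0 uses delays[0], bit 1 uses delays[1].

    The delay values (in samples) and the amplitude are hypothetical defaults.
    """
    out = list(signal)
    for i, bit in enumerate(bits):
        start = i * frame_len
        frame = signal[start:start + frame_len]
        if len(frame) < frame_len:
            break  # not enough audio left to carry this bit
        out[start:start + frame_len] = add_echo(frame, delays[bit], amplitude)
    return out
```

For example, `add_echo([1.0, 0.0, 0.0, 0.0], 2, 0.5)` places a half-amplitude copy of the first sample two positions later. Different portions of the signal can receive different echoes simply by passing different `delays` or `amplitude` values per frame.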
In accordance with a second aspect of the invention, there is disclosed an apparatus for embedding a watermark in a digital audio signal. The apparatus includes: a device for determining time and/or frequency domain characteristics of the digital audio signal; and a device for embedding at least one echo dependent upon the watermark in a portion of the digital audio signal, predefined characteristics of the at least one echo being dependent upon the time and/or frequency domain characteristics of the portion of the digital audio signal to provide a substantially inaudible and robust embedded watermark in the digital audio signal.
In accordance with a third aspect of the invention, there is disclosed a computer program product having a computer readable medium having a computer program recorded therein for embedding a watermark in a digital audio signal. The computer program product includes: a module for determining time and/or frequency domain characteristics of the digital audio signal; and a module for embedding at least one echo dependent upon the watermark in a portion of the digital audio signal, predefined characteristics of the at least one echo being dependent upon the time and/or frequency domain characteristics of the portion of the digital audio signal to provide a substantially inaudible and robust embedded watermark in the digital audio signal.
In accordance with a fourth aspect of the invention, there is disclosed a method of embedding a watermark in a digital audio signal. The method includes the steps of: generating a digital watermark; adaptively segmenting the digital audio signal dependent upon at least one frequency and/or time domain characteristic into two or more frames containing respective portions of the digital audio signal; classifying each frame dependent upon at least one frequency and/or time domain characteristic of the portion of the digital audio signal in the frame; and embedding at least one echo in at least one of the frames, the echo being dependent upon the watermark and upon a classification of each frame determined by the classifying step, whereby a watermarked digital audio signal is produced.
Preferably, the watermark is dependent upon the digital audio signal. The method may also include the steps of: audio digesting the digital audio signal to provide an audio digest; and encrypting watermark information dependent upon the audio digest.
Preferably, the method further includes the step of extracting one or more features from each frame of the digital audio signal. It may also include the step of selecting an embedding scheme for each frame dependent upon the classification of each frame, the embedding scheme being adapted dependent upon at least one time and/or frequency domain characteristic of the classification for the corresponding portion of the digital audio signal. Still further, the method may include the step of embedding the at least one echo in at least one of the frames dependent upon the selected embedding scheme. The amplitude and the delay of the echo relative to the corresponding portion of the digital audio signal in the frame are defined dependent upon the embedding scheme so as to be inaudible. Optionally, at least two echoes are embedded in the frame.
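The feature extraction, classification and scheme selection steps might be sketched as follows. The text does not state which features, classes or parameter values are used, so everything specific here is an assumption: mean energy and zero-crossing rate serve as the time domain features, the class labels `silent`, `tonal` and `noisy` and their thresholds are hypothetical, and the `SCHEMES` table holds placeholder delays and amplitudes. The one design choice drawn from the text is that noisier frames are given a stronger echo, since the limit of perceptible noise rises with the noise content of the host signal.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    if len(frame) < 2:
        raise ValueError("frame must contain at least two samples")
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return energy, crossings / (len(frame) - 1)


def classify(frame, silence_energy=1e-4, noisy_zcr=0.3):
    """Assign a hypothetical class label using assumed thresholds."""
    energy, zcr = frame_features(frame)
    if energy < silence_energy:
        return "silent"
    return "noisy" if zcr > noisy_zcr else "tonal"


# Placeholder embedding schemes: delays are in samples, one per bit value.
SCHEMES = {
    "silent": None,  # assumed: an echo of silence carries nothing, so skip
    "tonal": {"delays": (50, 100), "amplitude": 0.1},
    "noisy": {"delays": (50, 100), "amplitude": 0.3},
}


def select_scheme(frame):
    """Pick the embedding scheme for a frame from its classification."""
    return SCHEMES[classify(frame)]
```

A frame for which `select_scheme` returns `None` would simply be left unmarked in this sketch; a fuller implementation would also have to tell the extractor which frames were skipped.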
Preferably, two or more echoes embedded in the digital audio signal are dependent upon a bit of the watermark.
In accordance with a fifth aspect of the invention, there is disclosed an apparatus for embedding a watermark in a digital audio signal. The apparatus includes: a device for generating a digital watermark; a device for adaptively segmenting the digital audio signal dependent upon at least one frequency and/or time domain characteristic into two or more frames containing respective portions of the digital audio signal; a device for classifying each frame dependent upon at least one frequency and/or time domain characteristic of the portion of the digital audio signal in the frame; and a device for embedding at least one echo in at least one of the frames, the echo being dependent upon the watermark and upon a classification of each frame determined by the classifying device, whereby a watermarked digital audio signal is produced.
In accordance with a sixth aspect of the invention, there is disclosed a computer program product having a computer readable medium having a computer program recorded therein for embedding a watermark in a digital audio signal. The computer program product includes: a module for generating a digital watermark; a module for adaptively segmenting the digital audio signal dependent upon at least one frequency and/or time domain characteristic into two or more frames containing respective portions of the digital audio signal; a module for classifying each frame dependent upon at least one frequency and/or time domain characteristic of the portion of the digital audio signal in the frame; and a module for embedding at least one echo in at least one of the frames, the echo being dependent upon the watermark and upon a classification of each frame determined by the classifying module, whereby a watermarked digital audio signal is produced.
In accordance with a seventh aspect of the invention, there is disclosed a method of extracting a watermark from a watermarked digital audio signal. The method includes the steps of: adaptively segmenting the watermarked digital audio signal into two or more frames containing corresponding portions of the watermarked digital audio signal; detecting at least one echo present in the frames; and code mapping the at least one detected echo to extract an embedded watermark, the mapping being dependent upon one or more embedding schemes used to embed the at least one echo in the watermarked digital audio signal.
Preferably, the method further includes the step of audio registering the watermarked digital audio signal with the original digital audio signal to determine any unauthorised modifications of the watermarked digital audio signal.
Preferably, the method further includes the step of decrypting the embedded watermark dependent upon an audio digest signal to derive watermark information, the audio digest signal being dependent upon an original digital audio signal.
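The echo detection and code mapping steps of the extraction method can be illustrated with a simplified sketch. This passage does not specify the detector, so the sketch substitutes a plain autocorrelation comparison: each frame is scored at every candidate delay, and the delay with the largest score is mapped to a bit. The fixed frame length, the candidate delays, the `CODE_MAP` table and the function names are hypothetical, and the approach assumes the extractor knows the embedding scheme that was used.

```python
# Hypothetical code map: detected echo delay (in samples) -> watermark bit.
CODE_MAP = {50: 0, 100: 1}


def autocorr_at(frame, lag):
    """Unnormalised autocorrelation of the frame at a single lag."""
    return sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))


def detect_delay(frame, candidates):
    """Return the candidate delay with the strongest autocorrelation."""
    return max(candidates, key=lambda lag: autocorr_at(frame, lag))


def extract_bits(signal, n_bits, frame_len, code_map=CODE_MAP):
    """Detect one echo per frame and map its delay to a bit."""
    candidates = sorted(code_map)
    bits = []
    for i in range(n_bits):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        bits.append(code_map[detect_delay(frame, candidates)])
    return bits
```

An echo at delay `d` adds a term proportional to the frame energy to the autocorrelation at lag `d`, which is why the largest score points to the embedded delay when the host signal is not itself strongly periodic at the candidate lags. The bits recovered this way would then be decrypted using the audio digest.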
In accordance with an eighth aspect of the invention, there is disclosed an apparatus for extracting a watermark from a watermarked digital audio signal. The apparatus includes: a device for adaptively segmenting the watermarked digital audio signal into two or more frames containing corresponding portions of the watermarked digital audio signal; a device for detecting at least one echo present in the frames; and a device for code mapping the at least one detected echo to extract an embedded watermark, the mapping being dependent upon one or more embedding schemes used to embed the at least one echo in the watermarked digital audio signal.
In accordance with a ninth aspect of the invention, there is disclosed a computer program product having a computer readable medium having a computer program recorded therein for extracting a watermark from a watermarked digital audio signal. The computer program product includes: a module for adaptively segmenting the watermarked digital audio signal into two or more frames containing corresponding portions of the watermarked digital audio signal; a module for detecting at least one echo present in the frames; and a module for code mapping the at least one detected echo to extract an embedded watermark, the mapping being dependent upon one or more embedding schemes used to embed the at least one echo in the watermarked digital audio signal.