Advances in information systems and networked databases continue to spur rapid growth in digital media, e.g., audio, image and video. This is due, in part, to the highly efficient manipulation, reproduction, and access afforded by digital media. Data hiding is the process of encoding extra information in digital data, such as video, images or sounds, by making small modifications to the data. Hidden information may be used to supplement an image or sound with additional information, or to verify the integrity of the image or sound. The hidden information itself may be text, audio or image data or hyperlinks. For example, text captions may be used to label faces and buildings in an image. A short audio clip may associate a train whistle with an image of a locomotive. A hyperlink may join an image region to another document or data source.
The embedded data typically remains with the image when it is stored or transmitted. The embedded data may be meant to be extracted by an end user, or may be hidden from the end user. In the former instance, for example, a consumer may extract the embedded data and use it to satisfy an information need. In the latter instance, the embedded data may be a watermark. Watermarking is a technique used to label digital media by hiding copyright or other information in the underlying data. Unlike encryption, for example, which is used to restrict access to data, watermarking is employed to provide solid proof of authorship. Like data hiding generally, the watermark remains with the media. However, unlike data hiding generally, with watermarking the user cannot access the embedded information (i.e., the watermark).
Data hiding in general, and watermarking in particular, typically must satisfy the following requirements to be useful: they must be invisible, and they must be robust. Although other criteria may be important (such as statistical invisibility, the support for multiple data embeddings and self-clocking), the invisibility and the robustness of the resulting data are most important. The first requirement is that the hidden data remain invisible in the case where the host data is image data. Otherwise, the quality of the image may degrade.
The second requirement, robustness, relates to the survivability of the hidden data in light of the manipulation of the media in which it is embedded. Typically, image data is subject to signal processing operations such as filtering, resampling, compression, noise, cropping, analog-to-digital and subsequent digital-to-analog conversion, etc. For example, a small section of an image may be cropped so only that section is used. An image may also be compressed by a technique such as JPEG so that its transmission is completed in a shorter period of time. Because the host data will invariably be subject to such manipulation, the embedded data must be robust. That is, the embedded data must be able to survive after the host data has been subjected to signal processing operations.
Several data hiding techniques are found in the prior art. The most common approaches modify the least significant bits (LSB) of an image based on the assumption that the LSB data are insignificant. In one particular technique, the LSB of the data is replaced with a pseudo-noise (PN) sequence, while in another technique, a PN sequence is added to the LSB of the data. A data hiding method called "Patchwork" for image data chooses n pairs (a_i, b_i) of points within an image and increases the brightness of a_i by one unit while simultaneously decreasing the brightness of b_i by one unit. However, any approach which only modifies the LSB data is highly sensitive to noise, and the hidden data is easily destroyed. Furthermore, image quality may be degraded by the hidden data.
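The prior-art techniques described above can be illustrated with a minimal sketch. It is not taken from any cited implementation: the function names, the integer seeds standing in for a key, the flat list of 8-bit pixel values, and the plus-or-minus-one noise model are all assumptions chosen for illustration. The sketch shows LSB replacement with a PN sequence, a Patchwork-style brightness adjustment on n pairs of points, and how readily low-level noise corrupts the LSB data.

```python
import random

def pn_sequence(length, seed):
    """Pseudo-noise bit sequence; the seed is a hypothetical stand-in for a key."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(length)]

def embed_lsb(pixels, bits):
    """Replace the least significant bit of each 8-bit pixel with a hidden bit."""
    return [(p & 0xFE) | b for p, b in zip(pixels, bits)]

def extract_lsb(pixels):
    """Read the hidden bits back from the least significant bit plane."""
    return [p & 1 for p in pixels]

def _pairs(num_pixels, n_pairs, seed):
    """Choose n disjoint pairs of pixel indices from the seed."""
    idx = random.Random(seed).sample(range(num_pixels), 2 * n_pairs)
    return list(zip(idx[:n_pairs], idx[n_pairs:]))

def patchwork_embed(pixels, n_pairs, seed):
    """Raise the first point of each pair by one unit and lower the second by one."""
    out = list(pixels)
    for a, b in _pairs(len(pixels), n_pairs, seed):
        out[a] = min(255, out[a] + 1)
        out[b] = max(0, out[b] - 1)
    return out

def patchwork_statistic(pixels, n_pairs, seed):
    """Sum of brightness differences over the same seed-selected pairs."""
    return sum(pixels[a] - pixels[b]
               for a, b in _pairs(len(pixels), n_pairs, seed))

def add_noise(pixels, seed):
    """Assumed noise model: each pixel changes by -1, 0, or +1, clipped to 8 bits."""
    rng = random.Random(seed)
    return [min(255, max(0, p + rng.choice((-1, 0, 1)))) for p in pixels]

# Synthetic host "image": 1024 mid-range pixels, so no clipping occurs.
host_rng = random.Random(0)
host = [host_rng.randint(10, 245) for _ in range(1024)]

# LSB replacement: the hidden bits are recovered exactly from a clean copy.
bits = pn_sequence(len(host), seed=42)
marked = embed_lsb(host, bits)
clean_errors = sum(x != y for x, y in zip(extract_lsb(marked), bits))

# After unit-amplitude noise, a large share of the hidden bits is wrong.
noisy = add_noise(marked, seed=7)
noisy_errors = sum(x != y for x, y in zip(extract_lsb(noisy), bits))

# Patchwork: embedding shifts the pair statistic by two units per pair.
n = 200
shift = (patchwork_statistic(patchwork_embed(host, n, seed=99), n, seed=99)
         - patchwork_statistic(host, n, seed=99))

print("bit errors without noise:", clean_errors)
print("bit errors after noise:", noisy_errors, "of", len(bits))
print("patchwork statistic shift:", shift)
```

In this sketch no pixel changes by more than one brightness unit, which is why such embeddings are hard to see, yet noise of the same one-unit amplitude is enough to flip many of the LSB-encoded bits.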
Thus, there is a need for a data hiding and watermarking technique that is invisible in the case of image data and has the maximum robustness to ensure that the embedded data survives both legitimate and illegitimate data manipulation.