This invention relates to embedding data into an object that comprises a collection of samples. Such objects include digital representations of images; audio, video, and other binary files, such as synthetic aperture radar (SAR) images; three-dimensional representations of spatial structures; etc. The original object before embedding is called the cover object; the object with embedded data is called the stego-object.
Applications that embed data can be divided into two groups, depending on the relationship between the embedded message and the cover object. The first group is steganographic applications, where the message has no relationship to the cover object. The only role of the cover object is to mask the very presence of communication; its content has no value to either the sender or the receiver. Because the receiver in a covert communication has no interest in the original object, such applications do not need lossless techniques for embedding data.
There is, however, a second group of applications in which the cover object is itself of interest. In these applications, permanently distorting the original object by embedding data into it is unacceptable. Either the distortion must be eliminated, or a technique must be found that restores the original object after the data is embedded.
This second group includes digital watermarking, watermarking for authentication and tamper detection, watermarking for distribution and access control, watermarking for broadcast monitoring, fingerprinting, and image augmentation. In a typical watermarking application, the hidden message has a close relation to the cover object. The hidden message may supply additional information about the cover object, e.g., its caption, ancillary data about its origin, author, sender, or recipient, a digital signature, an authentication code, etc.
Though hiding a message in the object increases its practical value, the act of embedding inevitably introduces some distortion. This distortion should be as small as possible consistent with meeting other requirements, such as minimal robustness and sufficient payload. Employing models of the human visual or auditory system helps make the distortion from embedding less detectable to a human.
There are, however, some applications for which any distortion of the object is unacceptable, no matter how minimal. A good example is a medical image, where even the smallest modification cannot be allowed, both for legal reasons and to eliminate a potential risk that a physician will misinterpret an image. Other examples come from law enforcement and the military, where analysts inspect images and videos under special conditions. Under these conditions, which include extreme zoom, iterative filtering, and enhancement, common assumptions about the effects of distortion on visibility do not apply. Only a complete absence of distortion can satisfy the requirements placed on such an image.
Techniques for embedding data, especially high-capacity data, generally introduce some distortion into the original object. Such distortion is permanent; it cannot be reversed. As an example, take simple Least Significant Bit (LSB) embedding, which irreversibly replaces the LSB plane with the message bits.
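The irreversibility of LSB embedding can be illustrated with a minimal Python sketch. The function names and sample values below are hypothetical and chosen only for illustration; the sketch shows that two different cover objects can yield the same stego-object, so the original LSB plane cannot be recovered from the stego-object alone.

```python
def lsb_embed(samples, bits):
    """Replace the least significant bit of each sample with a message bit."""
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover object")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the LSB, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | (bit & 1)
    return stego


def lsb_extract(stego, n_bits):
    """Read the message back from the LSBs of the first n_bits samples."""
    return [s & 1 for s in stego[:n_bits]]


# Two different (hypothetical) cover objects and one message.
cover_a = [100, 101, 102, 103]
cover_b = [101, 100, 103, 102]
message = [1, 0, 1, 1]

stego_a = lsb_embed(cover_a, message)
stego_b = lsb_embed(cover_b, message)

# The message is recoverable, but the two covers are indistinguishable
# after embedding: the original LSBs have been overwritten.
recovered = lsb_extract(stego_a, len(message))
same_stego = stego_a == stego_b
```

Because the mapping from cover object to stego-object is many-to-one, no decoder can invert it without side information about the original LSB plane.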
The concept of embedding data losslessly appears in a patent assigned to The Eastman Kodak Company (Honsinger et al., Lossless Recovery of an Original Image Containing Embedded Data, U.S. Pat. No. 6,278,791, issued Aug. 21, 2001). The inventors describe a fragile invertible method of authentication based on a robust watermark in the spatial domain. Their technique for watermarking is spatial, additive, and non-adaptive; lossless embedding is achieved by replacing regular addition with addition modulo 256. This type of addition will, however, introduce some disturbing artifacts that resemble correlated salt-and-pepper noise when pixels with grayscales close to zero are modified to values close to 255 and vice versa. Another drawback of this technique is that its payload must be very small. Thus this technique is not suitable for general data embedding. Finally, the technique is not easily extended to other image formats and different data types (audio, for example). A more detailed analysis and further generalization of this technique can be found in J. Fridrich et al., “Invertible Authentication,” Proc. SPIE, Security and Watermarking of Multimedia Contents (San Jose, Calif., January 2001).
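The effect of addition modulo 256 can be shown with a short Python sketch. It is not the method of the cited patent: the watermark pattern here is a hypothetical fixed list of small signed values, whereas a real system would derive it from a key and the payload. The sketch shows only that modular addition is exactly invertible when the pattern is known, and that pixels near 0 or 255 wrap around to the opposite end of the grayscale range.

```python
def embed_mod256(pixels, pattern):
    """Add a signed watermark pattern to 8-bit pixels, wrapping modulo 256."""
    return [(p + w) % 256 for p, w in zip(pixels, pattern)]


def restore_mod256(stego, pattern):
    """Subtract the same pattern modulo 256 to recover the original pixels."""
    return [(s - w) % 256 for s, w in zip(stego, pattern)]


# Hypothetical grayscale values and watermark pattern (assumptions).
cover = [3, 128, 254, 0]
pattern = [-5, 4, 4, -2]

stego = embed_mod256(cover, pattern)
restored = restore_mod256(stego, pattern)

# Pixels whose change exceeds the largest pattern amplitude have wrapped
# around; these are the source of the salt-and-pepper artifacts.
max_amp = max(abs(w) for w in pattern)
wrapped = [i for i, (c, s) in enumerate(zip(cover, stego)) if abs(s - c) > max_amp]
```

In this example the mid-gray pixel changes by only 4 levels, while the three pixels near the ends of the range jump by roughly 250 levels, which is why the artifacts are visible even though the embedding is lossless.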
A different technique for lossless authentication and lossless embedding of data is based on lossless compression of bit-planes. It starts with the lowest bit-plane and calculates its redundancy, defined as the difference between the number of pixels and the size, in bits, of the same bit-plane compressed with the JBIG lossless compression method (see K. Sayood, Introduction to Data Compression (San Francisco, 1996), 87–94) or some other method. The embedding method then proceeds to higher bit-planes until the redundancy becomes greater than or equal to the payload that needs to be embedded. If this technique is used for authentication, only 128 bits (the length of an MD5 hash; see Bruce Schneier, Applied Cryptography, 2nd ed. (NY, 1996)) need to be embedded. Most high-quality images can be authenticated in the lowest three bit-planes. Noisy images may require the 4th or the 5th bit-plane.
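The bit-plane search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: zlib from the Python standard library stands in for JBIG (the "some other method" of the text), so the absolute redundancy values differ from those a JBIG coder would give, and all function names and the synthetic test image are hypothetical.

```python
import random
import zlib


def bit_plane(pixels, k):
    """Extract bit-plane k (0 = least significant) as a list of 0/1 values."""
    return [(p >> k) & 1 for p in pixels]


def pack_bits(bits):
    """Pack a list of bits into bytes, zero-padding the final byte."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte << (8 - len(chunk)))
    return bytes(out)


def redundancy(pixels, k):
    """Number of pixels minus the compressed size of plane k, in bits."""
    plane = bit_plane(pixels, k)
    compressed_bits = 8 * len(zlib.compress(pack_bits(plane), 9))
    return len(plane) - compressed_bits


def lowest_usable_plane(pixels, payload_bits, max_planes=8):
    """Return the lowest plane whose redundancy covers the payload, or None."""
    for k in range(max_planes):
        if redundancy(pixels, k) >= payload_bits:
            return k
    return None


# Synthetic 64x64 "image": the two lowest planes are noise, higher ones are flat.
rng = random.Random(0)
pixels = [128 + rng.getrandbits(2) for _ in range(64 * 64)]

# 128 bits is the payload needed for an MD5-based authentication code.
plane = lowest_usable_plane(pixels, 128)
```

For this synthetic input the two noisy planes offer no usable redundancy, so the search has to move up to the first flat plane, mirroring the observation that noisier images require higher bit-planes and therefore more visible distortion.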
The capacity of this technique can be traded for distortion by choosing different bit-planes, but the artifacts can quickly become visible depending on the length of the message and the noisiness of the original image. Overall, the method provides only small payloads and is not suitable for general data embedding.
Macq described a modification to the patchwork algorithm to achieve lossless embedding of a watermark. He also used addition modulo 256 and essentially embedded a one-bit watermark. It is unclear whether this technique could be used for authentication or general data embedding with practical payloads. (B. Macq, “Lossless Multiresolution Transform for Image Authenticating Watermarking,” Proc. EUSIPCO (Tampere, Finland, September 2000)).
Thus there is a need for simple, high-capacity techniques that do not introduce visible artifacts and, at the same time, remove the distortion inherent in the embedding of a hidden message in a cover object, where the cover object itself is the object of interest. It is also important that the techniques be general enough to apply to all object types, including images, video, audio, and other binary files comprising digital samples. In the case of digital images, the technique should be applicable to all image formats, including uncompressed formats, such as BMP, PGM, PCX, etc.; palette formats, such as GIF and PNG; and lossy formats, such as JPEG, JPEG2000, wavelet formats, fractal formats, etc.