With the growth of multimedia systems in distributed environments, issues such as copy control, illegal distribution, copyright protection, covert communications, etc., have become important. Digital data hiding schemes have been proposed in recent years as a viable way of addressing some of these security concerns. Digital data hiding is one aspect of the larger field of steganography, which is concerned with the hiding of one media type within another, such as text, voice, or another image into an image or video. The potential customers having the greatest interest in steganography, particularly digital data hiding, include the entertainment industry, where digital data hiding would be used for copyright protection and electronic fingerprinting, and defense agencies, where digital data hiding would be used for covert communications and authentication of documents.
In conventional watermarking of paper documents, an ink-based image is embedded in the larger document. When authenticating the document, holding the document up to a light source reveals the faint traces of the watermark, which may or may not be visible under normal lighting conditions. For the entertainment or defense industries, it becomes increasingly important not to allow digital hidden data (e.g., a digital watermark) to be easily visible in a stego-image (a digital image with hidden data). Thus, one important property of a good digital data hiding technique is to keep the distortion of a host image (the original image) to a minimum. The digital hidden data must also be relatively immune to both intentional and unintentional attacks by those for whom the hidden data was not intended, such as counterfeiters and computer hackers. Some examples of attacks include digital-to-analog conversion, analog-to-digital conversion, requantization, dithering, rotation, scaling, and cropping. One of these types of attacks, scaling, is particularly insidious because of its ability to cause minimal perceptible differences between the original and the attacked overall image yet cause severe loss of the hidden data.
Scaling attacks generally refer to the reduction or expansion in the size of a stego-image, and come in two categories as is known in the art:
Category 1—The scaled image is the same size as the stego-image. This means a down-scaling operation is followed by an up-scaling operation or vice versa.
Category 2—The scaled image is not the same size as the stego-image (either up-scaled or down-scaled).
Probably the most common category of data hiding techniques used against both Category 1 and Category 2 attacks is based on the discrete cosine transform (DCT), as is known in the art. The 2D-DCT (2-dimensional DCT) data hiding technique used most often, and virtually a standard, is that described in J. R. Hernandez et al., “DCT-Domain Watermarking Techniques for Still Images: Detector Performance Analysis and a New Structure,” IEEE Transactions on Image Processing, January 2000, Vol. 9, pp. 55-68 (the generic data hiding and hidden data extraction methods). In this scheme, 8×8 pixel blocks of the host image are first transformed using the 2D-DCT, and the hidden data is embedded in the mid-frequency regions of the 2D-DCT coefficient blocks. Hiding data in the mid-frequency regions distorts the stego-image less than hiding it in the low-frequency regions, where most of the host image information is stored. At the same time, the hidden data is not removed by compression schemes such as JPEG, which throw away the high-frequency regions of the 2D-DCT coefficients.
At the encoder end, the real values of the hidden data representing the grey scale or color pixel amplitudes (in the range of 0-255) are converted to binary form. If the current bit from the hidden data is a ‘1’, the ‘1’ bit is replaced by a real-valued pseudo-random noise (PN) sequence. A second PN sequence represents the ‘0’ bit. The two PN sequences are chosen to have minimal correlation with each other, and both are known at the encoding end and the extracting end of the communications channel over which the stego-image is sent. The use of one PN sequence representing a ‘0’ and another representing a ‘1’ minimizes the risk of mistaking a received ‘1’ bit for a ‘0’ bit, and vice versa, when the communications channel is noisy.
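The encoder-side preparation described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the PN sequences are generated, so the Gaussian generator, the shared `seed`, the sequence `length`, and the orthogonalization step used here to obtain minimal (zero) mutual correlation are all hypothetical choices, as are the function names.

```python
import random

def pixel_to_bits(value):
    """Return the 8 bits of a 0-255 amplitude, most significant bit first."""
    if not 0 <= value <= 255:
        raise ValueError("amplitude must be in the range 0..255")
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]

def _center(seq):
    """Subtract the mean so that the sequence has zero mean."""
    mean = sum(seq) / len(seq)
    return [x - mean for x in seq]

def _dot(x, y):
    return sum(p * q for p, q in zip(x, y))

def make_pn_pair(length, seed):
    """Build two real-valued PN sequences (W0, W1) from a shared seed.

    Assumption: Gaussian samples, centered, with W1 orthogonalized against
    W0 so that the two sequences are uncorrelated with each other.
    """
    rng = random.Random(seed)
    w0 = _center([rng.gauss(0.0, 1.0) for _ in range(length)])
    w1 = _center([rng.gauss(0.0, 1.0) for _ in range(length)])
    scale = _dot(w1, w0) / _dot(w0, w0)
    w1 = [b - scale * a for a, b in zip(w0, w1)]  # remove the W0 component
    return w0, w1

def bits_to_pn(bits, w0, w1):
    """Replace each hidden-data bit by the PN sequence that represents it."""
    return [w1 if bit else w0 for bit in bits]

w0, w1 = make_pn_pair(length=16, seed=7)
sequences = bits_to_pn(pixel_to_bits(200), w0, w1)
```

Because both ends derive the pair from the same seed, the extractor can regenerate W0 and W1 without their being transmitted; that key-sharing arrangement is an assumption of this sketch, not something the source specifies.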
The mid-band 2D-DCT coefficients, represented as real-valued data, are modulated with one of the PN sequences according to the following equations:

$$I_W(u,v) = I(u,v)\,\bigl(1 + k_s\,W_b(u,v)\bigr), \quad (u,v) \in F_M$$

$$I_W(u,v) = I(u,v), \quad (u,v) \notin F_M$$

where Wb is either the PN sequence for ‘0’, W0, or the PN sequence for ‘1’, W1; FM is the set of the coefficients of the 2D-DCT matrix block corresponding to mid-band frequencies; I(u,v) is an 8×8 2D-DCT block; ks is a gain factor used to specify the strength of the hidden data, and is adjusted according to the size of the particular 2D-DCT coefficient used (e.g., larger values of ks can be used for coefficients of higher magnitude and vice versa); and IW(u,v) represents the corresponding 2D-DCT block with hidden data. After all blocks of the host image have been processed, each block of the stego-image in the frequency domain is then inverse transformed to give the stego-image IW*(x,y), where x is the distance from the upper left-hand corner of the image along the x-axis, and y is the distance from the upper left-hand corner of the image along the y-axis.
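A minimal sketch of this embedding step for a single 8×8 block is given below. It is not the cited authors' implementation. The choice of mid-band set `MID_BAND` (coefficients with 3 ≤ u+v ≤ 5), the constant gain `ks`, and all function names are assumptions made for illustration; the source only states that FM covers the mid-band frequencies and that ks may vary with coefficient magnitude.

```python
import math

N = 8  # block size used by the scheme

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that the 2D-DCT of X is C * X * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

C = dct_matrix()
CT = [list(col) for col in zip(*C)]  # transpose of C

def dct2(block):
    return matmul(matmul(C, block), CT)

def idct2(coeffs):
    # C is orthonormal, so its transpose is its inverse.
    return matmul(matmul(CT, coeffs), C)

# Assumed mid-band set F_M: an anti-diagonal band of the 8x8 coefficient block.
MID_BAND = [(u, v) for u in range(N) for v in range(N) if 3 <= u + v <= 5]

def embed_bit(block, pn, ks):
    """Apply I_W = I * (1 + ks * W_b) on F_M and leave other coefficients as-is."""
    if len(pn) != len(MID_BAND):
        raise ValueError("PN sequence must have one value per mid-band coefficient")
    coeffs = dct2(block)
    for w, (u, v) in zip(pn, MID_BAND):
        coeffs[u][v] *= 1.0 + ks * w
    return idct2(coeffs)  # spatial-domain block of the stego-image
```

The returned block is real-valued; rounding it back to integer pixel amplitudes is a separate step that this sketch leaves out.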
In the generic hidden data extraction method, to extract the hidden data at the other end of the communications channel, the received stego-image (which may or may not have been attacked) is broken down into 8×8 blocks, and a 2D-DCT transformation is performed on each block. Then the correlations between the mid-band 2D-DCT coefficients, IW, and each of the two PN sequences, Wb, are calculated, where Wb is normalized to zero mean. Hi, the ith reconstructed hidden data bit, is then chosen according to which PN sequence correlates more strongly with the coefficients:

$$H_i = \begin{cases} 1, & \operatorname{corr}(I_W, W_1) > \operatorname{corr}(I_W, W_0) \\ 0, & \text{otherwise} \end{cases}$$

where corr( ) is the discrete correlation function. One way the discrete correlation function is implemented, as is known in the art, is to use the Matlab™ function corr2(a, b). Given the call C=corr2(a, b), Matlab™ takes in two sequences a and b and returns a real value C, which is the correlation coefficient of a and b. This value lies between −1 and +1, with +1 being 100% correlation.
The formula and example below illustrate how the coefficient is calculated for a specific pair of sequences.

Formula: C=sum[sum(a′.*b′)]/sqrt[sum{sum(a′.*a′)}*sum{sum(b′.*b′)}]

where
    a′=a−mean(a)
    b′=b−mean(b)
    x.*y is the element-by-element product of the vectors x and y

The usage of this formula can be illustrated with the following steps:
    a=[1, 0, 0, 0, 1, 1, 1, 0]
    b=[0, 0, 1, 0, 1, 1, 1, 1]

Step 1:
    mean(a)=(1+0+0+0+1+1+1+0)/8=0.5
    mean(b)=(0+0+1+0+1+1+1+1)/8=0.625

Step 2:
    a′=a−mean(a)=[0.5, −0.5, −0.5, −0.5, 0.5, 0.5, 0.5, −0.5]
    b′=b−mean(b)=[−0.625, −0.625, 0.375, −0.625, 0.375, 0.375, 0.375, 0.375]

Step 3:
    a′.*b′=[−0.3125, 0.3125, −0.1875, 0.3125, 0.1875, 0.1875, 0.1875, −0.1875]
    a′.*a′=[0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25]
    b′.*b′=[0.390625, 0.390625, 0.140625, 0.390625, 0.140625, 0.140625, 0.140625, 0.140625]

Step 4:
    sum[sum(a′.*b′)]=0.5
    sum[sum(a′.*a′)]=2.0
    sum[sum(b′.*b′)]=1.875

Step 5:
    sum[sum(a′.*a′)]*sum[sum(b′.*b′)]=(2.0*1.875)=3.75

Step 6:
    sqrt(sum[sum(a′.*a′)]*sum[sum(b′.*b′)])=sqrt(3.75)=1.936492

Step 7:
    sum[sum(a′.*b′)]/sqrt[sum{sum(a′.*a′)}*sum{sum(b′.*b′)}]=0.5/1.936492=0.2582
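The same calculation, together with the bit decision rule of the generic extraction method, can be sketched in Python. The function below follows the correlation-coefficient formula for flat sequences only, whereas the Matlab™ function accepts matrices; the name `decide_bit` and the short sequences in the usage lines are hypothetical and chosen only for illustration.

```python
import math

def corr2(a, b):
    """Correlation coefficient of two equal-length sequences.

    Mirrors the formula C = sum(a'.*b') / sqrt(sum(a'.*a') * sum(b'.*b')).
    A constant sequence gives a zero denominator and is rejected here.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denominator == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(x * y for x, y in zip(da, db)) / denominator

def decide_bit(mid_band_coeffs, w0, w1):
    """Return 1 if the coefficients correlate more with W1 than with W0, else 0."""
    return 1 if corr2(mid_band_coeffs, w1) > corr2(mid_band_coeffs, w0) else 0

a = [1, 0, 0, 0, 1, 1, 1, 0]
b = [0, 0, 1, 0, 1, 1, 1, 1]
print(round(corr2(a, b), 4))  # 0.2582, matching Step 7

# Hypothetical 4-sample PN sequences and coefficients modulated with W1.
w0 = [1.0, -1.0, 1.0, -1.0]
w1 = [1.0, 1.0, -1.0, -1.0]
coeffs = [10.0 * (1.0 + 0.1 * w) for w in w1]
print(decide_bit(coeffs, w0, w1))  # 1
```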
The main problem with the generic extraction method is that it performs poorly against Category 2 attacks. Down-scaling the stego-image by a mere 4%, or up-scaling it by 5%, causes the generic extraction method to lose 50% of the hidden data. A 50% loss is no better than simply assuming that every extracted bit is a ‘1’, and thus the extracted hidden data is unrecognizable.
The main cause of the loss of hidden data is that when the stego-image is scaled (i.e., resized), there are fewer or more 8×8 blocks of pixels to be examined in the attacked image than in the un-scaled stego-image. If the scaled stego-image is partitioned into 8×8 blocks, then the spatial information in each block of the scaled stego-image is not the same as that in the corresponding block of the un-scaled stego-image. If the image size is reduced, then each 8×8 pixel block in the scaled image contains more hidden data information than the corresponding block of the un-scaled stego-image. As such, scaling destroys the one-to-one mapping between the blocks of the attacked image and the corresponding blocks in the un-scaled stego-image.
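The loss of the one-to-one block mapping can be illustrated with simple arithmetic along one image axis. The sketch below assumes an idealized uniform resampling, a hypothetical 512-pixel image dimension, and rounding of the scaled size to the nearest pixel; none of these specifics comes from the source.

```python
def scaled_size(size, scale):
    """Pixel dimension after scaling, rounded to the nearest pixel (assumption)."""
    return round(size * scale)

def block_count(size, block=8):
    """Number of complete 8-pixel blocks along one axis."""
    return size // block

def source_span(block_index, scale, block=8):
    """Original-image coordinates covered by one block of the scaled image.

    Under uniform resampling, scaled coordinate p maps back to p / scale.
    """
    start = block_index * block / scale
    end = (block_index + 1) * block / scale
    return start, end

original = 512
print(block_count(original))                     # 64 blocks per axis
print(block_count(scaled_size(original, 0.96)))  # 61 after 4% down-scaling
print(block_count(scaled_size(original, 1.05)))  # 67 after 5% up-scaling

# Block 10 of the down-scaled image covers about 83.3 to 91.7 in original
# coordinates, straddling original blocks 10 (80-88) and 11 (88-96).
print(source_span(10, 0.96))
```

Each block of the down-scaled image spans 8/0.96, or roughly 8.33, original pixels, so it mixes content from neighbouring blocks of the un-scaled stego-image, and the misalignment grows with the block index.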