1. Field of the Invention
The invention relates to image watermarking, particularly a technique, both apparatus and an accompanying method, for generating a highly secure cryptographic identifier, i.e., a watermark, for a non-marked image and embedding that watermark within the non-marked image itself in order to generate a "watermarked" image; for subsequently detecting that watermark in a test image; and the watermarked image so generated. By detecting whether an appropriate watermark is present or not in the test image, an image owner can readily, accurately and automatically determine whether the test image is a duplicate of the non-marked image.
2. Description of the Prior Art
Images have always seen widespread use as a form of human communication, whether for education, entertainment, art or otherwise. Information can be conveyed in a single image far more efficiently and with significantly greater impact to its viewer than if the same information were to be described in textual form.
Historically, and even as recently as a decade or so ago, equipment for electronically scanning, storing and manipulating images was rather expensive, which, in situations where cost was a prime concern, tended to limit the use of images to printed media. However, this is no longer the case. The widespread use and adoption of personal computers (PCs) and associated peripheral equipment, coupled with their continually decreasing price and increasing sophistication, has led to a revolution in electronic communication, particularly including imagery. Image processing equipment, such as sufficiently high resolution color scanners (e.g., 1200 dpi (dots/inch)), software for accurately manipulating and processing image data, and color printing devices that are capable of rendering satisfactory output images, that was once cost-prohibitive for all but professional advertisers, graphic artists and publishers, is now affordable for a significant number of PC users. As such, individuals and businesses alike are now purchasing such equipment, with the result being that images, now being cost-effective and rather easy to electronically handle, process and manipulate, are seeing explosive use in electronic communication.
Nowhere is this effect more apparent than in the World Wide Web. Web site owners are increasingly incorporating image data into their web pages for dissemination to their visitors.
However, not unexpectedly, with the widespread use of images comes a growing threat of image piracy and image counterfeiting. An electronic image file, being digital in nature, can be duplicated just as easily as any other digital file can. Hence, image owners are increasingly noticing that their images are being illicitly duplicated and disseminated. This is particularly prevalent with web site imagery where image files, once having been downloaded by a web server to a third party client browser for local display, can themselves be readily extracted from a web page, saved and copied. Frequently, an image obtained in this fashion by a third party from one web site is incorporated by that party into a web site that he or she maintains, or is otherwise disseminated by that party. Such copying, where the image is not in the public domain, effectively frustrates the owner of the image in seeking rightful compensation for use of that image.
Various techniques are widely known in the art to reduce the incidence of illicit copying on the web. However, all of them are deficient to some extent.
A first technique utilizes an automatic approach. Here, an image owner utilizes a web crawler to successively visit one web site after another. For each site being visited, the crawler downloads corresponding image files for all images available at that site and compares each such file against stored data for each image owned by that person to detect whether any of the former images is a copy of any of the latter images, and if such a copy is found, provides appropriate notification to the image owner. Unfortunately, as a result of various comparison algorithms that could be used, a relatively slight change to an image can defeat a finding of similarity between it and another image, even though to a human observer the two images are, visually speaking, very similar. Hence, this technique, being rather easy to frustrate, has proven to be inadequate.
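The fragility described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the crawler compares images by an exact digest of their pixel data; the text does not say which comparison algorithm is used, and the pixel values and the `digest` helper are hypothetical.

```python
import hashlib

def digest(pixels):
    """Return a SHA-256 hex digest of a flat list of 8-bit pixel values."""
    return hashlib.sha256(bytes(pixels)).hexdigest()

# Hypothetical image data: a flat list of 8-bit intensities.
original = [10, 20, 30, 40, 50, 60, 70, 80]
copy_exact = list(original)
copy_tweaked = list(original)
copy_tweaked[3] += 1  # one pixel brightened by a single intensity level

# An exact comparison finds the untouched copy but misses the tweaked one,
# although the two would look the same to a human observer.
exact_match = digest(original) == digest(copy_exact)
tweaked_match = digest(original) == digest(copy_tweaked)
```

Other comparison algorithms tolerate more change than an exact digest does, but the same weakness applies whenever the alteration exceeds what the chosen algorithm tolerates.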
A second technique relies on a manual approach. Simply stated, a human observer could visit a web site and examine each image provided by that site against a set of images to determine any matches between the two. A human observer could provide necessary interpretative skills to find image similarity where a comparison algorithm would not. However, at present, the number of sites accessible on the web is not only huge but also continues to exhibit exponential growth with no apparent decrease in its growth rate in sight. Hence, the sheer magnitude of the manual task of just visiting each and every web site, let alone comparing images accessible through each such site, renders this approach quite infeasible.
Another conventional technique that could be used relies on incorporating a watermark into an image and then detecting its presence in a suspected image copy. Here, to create the watermark, pixel values that collectively form an image are transformed into another domain, i.e., a spatial frequency domain, to yield a set of transform coefficient values. The watermark constitutes a set of pseudo-random perturbation values (γ) (generated through use of a secret key "k"), wherein each of these values is heuristically selected and lies within a predefined range. Each perturbation value is then added to its corresponding transform coefficient value to yield a "watermarked" image. To detect the watermark, the pixel values in a test image (i.e., a purported image copy) are transformed to yield transform coefficients which are themselves then tested, using the perturbation values, to detect the presence of the watermark.
Specifically, pixel values for an input image, I, to be watermarked are first transformed via, e.g., a DCT (discrete cosine transform), Fourier or wavelet transform, into the spatial frequency domain to yield transform coefficients. The top N coefficients containing approximately 90% of the image power, i.e., M1, M2, . . . , MN, are selected. Thereafter, given a secret "seed" value k, a sequence of N pseudo-random perturbation values γ1, γ2, . . . , γN is generated.
To detect the watermark in a suspected image copy, that image is first transformed into a corresponding set of transform coefficients. Certain coefficients are then selected in the same manner set forth above to yield selected coefficients. The selected coefficients are then tested through a single mathematical test, in conjunction with the perturbation values that might have been used to mark the image, so as to detect the presence of these perturbation values, γ, in the suspected copy.
Unfortunately, this watermark-based approach produces a rather insecure watermark which is likely to be quite susceptible to third party jamming.
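The coefficient selection and seeded perturbation generation just described might be sketched as follows. This is an illustration under stated assumptions, not the method of any particular prior-art system: "power" is taken to mean the sum of squared coefficients, the perturbations are drawn as ±1 from Python's `random.Random`, and the function names and sample coefficients are hypothetical.

```python
import random

def select_top_coefficients(coeffs, power_fraction=0.9):
    """Return indices of the largest-magnitude coefficients whose combined
    power (sum of squares) first reaches power_fraction of the total.
    Assumption: this is one reading of 'the top N coefficients
    containing approximately 90% of the image power'."""
    total = sum(c * c for c in coeffs)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    selected, accumulated = [], 0.0
    for i in order:
        if accumulated >= power_fraction * total:
            break
        selected.append(i)
        accumulated += coeffs[i] ** 2
    return selected

def perturbation_sequence(seed, n):
    """Generate n pseudo-random perturbations from a secret seed.
    Assumption: values are +1 or -1; the text only requires that they
    lie within a predefined range."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]

# Hypothetical transform coefficients for a tiny "image".
coeffs = [9.0, -4.0, 2.0, 1.0, 0.5]
top = select_top_coefficients(coeffs)  # indices 0 and 1 carry 97 of 102.25
gammas = perturbation_sequence(seed=1234, n=len(top))
```

Because the generator is seeded, anyone holding the same seed can regenerate the same sequence at detection time, which is why the seed must be kept secret.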
Therefore, a need exists in the art for a technique for effectively detecting whether an image, such as one stored digitally, is a copy of another. Preferably, such a technique could be used for testing images stored on web sites against stored image files and should be extremely difficult, if not essentially impossible, for a third party to circumvent.
Ideally, such a technique should provide a highly secure identifier, such as a watermark, for an image where the identifier could be embedded within the image itself and would be extremely difficult, if not effectively impossible, to remove or frustrate. Through such a technique, detecting whether a supposed replica is a copy of an image could occur by merely detecting whether the replica contains the particular watermark associated with the image or not. If the replica were to contain that watermark, then upon consulting a database of authorized users, an image owner could conclusively determine whether the replica is an authorized or illicit (i.e., "pirated") copy. Advantageously, such a technique would be particularly amenable to being automatically implemented, such as in a PC or workstation, thereby obviating a need for laborious manual image comparisons.
Our present invention advantageously satisfies this need by creating a highly secure watermark for an original (in the sense of being an "input") image, by transforming data, i.e., pixel values and specifically pixel intensity values, for that image into a series of transform coefficients; adding corresponding, though relatively small, but specifically determined pseudo-random perturbations to these coefficients, wherein all the perturbations collectively satisfy a plurality of mathematical constraints; and then creating a "watermarked" version of this image by applying a reverse transformation on the perturbed coefficients to yield resulting image data. The resulting image data, rather than original image data, is then used whenever that image is to be publicly disseminated, whether by distribution through a web server, by diskette or by any other insecure distributional vehicle. These perturbations collectively define the watermark.
Advantageously, the watermark, while being basically imperceptible to a viewer, is essentially, if not totally, impossible to remove from the image (i.e., for all intents and purposes, is "indelible") and hence is highly secure against tampering by a third party. Hence, any subsequent replica of the watermarked image will itself also contain the watermark. As such, these replicas can be automatically detected, without a need for human intervention, by simply analyzing whether a given ("test") image contains the watermark (i.e., it is an image replica) or not.
In accordance with our specific inventive teachings, to provide a watermark that is sufficiently secure against third-party tampering, and by so doing generate an "enhanced" watermark, the perturbation values are specifically chosen to collectively satisfy not just one mathematical constraint, but rather a plurality of different such constraints. In particular, K random subsets (S1, S2, . . . , SK) of the N integer values (Si ⊂R {1, 2, . . . , N}) are selected such that, e.g., the products of the perturbation values designated by each subset of integers and their corresponding transform coefficients sum to a predefined value, e.g., zero,
$$\sum_{j \in S_i} \gamma_j M_j = 0,$$
or a non-zero value θi. The individual perturbation values, γi, are preferably kept relatively small, e.g., ±1 for processing simplicity, to render essentially imperceptible any visually apparent artifacts in the watermarked image that might otherwise arise from use of larger perturbation values.
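To make the constraints concrete, the sketch below searches for ±1 perturbation values whose products with the coefficients sum to a target value over each subset. The exhaustive search is an assumption chosen only for clarity on a toy input; the text does not specify how the perturbations are determined, and a search over all 2^N assignments is not practical for realistic N. The coefficients, the subsets (fixed here rather than randomly drawn, and written with 0-based indices) and the function names are hypothetical.

```python
from itertools import product

def constraint_sums(gammas, coeffs, subsets):
    """Compute the sum of gamma_j * M_j over each subset S_i."""
    return [sum(gammas[j] * coeffs[j] for j in s) for s in subsets]

def find_perturbations(coeffs, subsets, targets):
    """Try every +1/-1 assignment and return the first one that meets all
    subset constraints, or None. Illustrative only: exponential in N.
    Integer coefficients are assumed so that equality can be exact;
    real-valued coefficients would need a tolerance."""
    for gammas in product((1, -1), repeat=len(coeffs)):
        if constraint_sums(gammas, coeffs, subsets) == list(targets):
            return list(gammas)
    return None

coeffs = [3, 1, 2, 2, 1]          # hypothetical M_1 .. M_5
subsets = [(0, 1, 2), (0, 3, 4)]  # hypothetical S_1, S_2 (K = 2)
gammas = find_perturbations(coeffs, subsets, targets=[0, 0])
# gammas == [1, -1, -1, -1, -1]: 3 - 1 - 2 = 0 and 3 - 2 - 1 = 0
```

Note that the two subsets overlap at index 0, so a single perturbation value has to satisfy both constraints at once; each further subset narrows the set of acceptable assignments.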
Thereafter, a corresponding perturbation value γi is added to each transform coefficient, Mi (where i=1, 2, . . . , N), such that M1←M1+γ1, M2←M2+γ2 and so on. Hence, each transform coefficient Mi in the image to be watermarked is perturbed by a specific predefined amount to define a gross perturbation which constitutes the watermark.
To detect the enhanced watermark in a suspected image copy (i.e., test image, I′), that image is first transformed into a corresponding set of transform coefficients. Selected coefficients are then collectively tested by utilizing a series of different mathematical tests, on different corresponding subsets of these selected coefficients, to detect whether at least a majority of the perturbation values are present in the test image.
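The transform, perturb and inverse-transform sequence can be sketched on a single row of pixel intensities. Several simplifications are assumed here: a one-dimensional orthonormal DCT stands in for whichever transform is actually used, every coefficient is perturbed rather than only the top N, the ±1 values are arbitrary rather than constraint-satisfying, and the result is left as floating point instead of being rounded to 8-bit pixels. All names and sample values are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def embed(pixels, gammas):
    """Apply M_i <- M_i + gamma_i in the transform domain, then invert."""
    coeffs = dct(pixels)
    perturbed = [m + g for m, g in zip(coeffs, gammas)]
    return idct(perturbed)

pixels = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]  # hypothetical row
gammas = [1, -1, 1, 1, -1, -1, 1, -1]                      # hypothetical +/-1 values
watermarked = embed(pixels, gammas)

# Re-transforming the watermarked row recovers the perturbations.
recovered = [a - b for a, b in zip(dct(watermarked), dct(pixels))]
```

Because this transform is orthonormal, the total squared change in the pixels equals the total squared perturbation (8 in this example), which gives a feel for why small perturbation values keep the visible change small.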
In particular, the perturbation values, for each of the K subsets of the random integer values for the test image and associated with the watermark, are separately tested to determine whether a majority of the subsets of the pseudo-random perturbation values collectively exist in the test image. Specifically, for each such subset, an expression
$$Q_i = \sum_{j \in S_i} \gamma_j r_j$$
is computed, where rj is the jth received transform coefficient for the test image and Si is the ith subset of randomly selected integer values. This expression is separately computed for each of the K subsets. If a majority of Qi values exceeds a predefined threshold, then the test image contains the enhanced watermark, γ, and hence is a copy; otherwise, it is not. Alternatively, an "amplified" approach may be used. Specifically, once each Qi value is determined, an expression Li=[C/Normalize(Qi)] is calculated, where C is an empirically determined constant. Then, given all the Li values, a conventional majority type rule can be used to determine if the enhanced watermark, γ, is present or not in the test image.
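The majority test on the Qi values might look like the following sketch. The threshold value, the toy coefficients, the fixed subsets (0-based indices) and the function names are assumptions for illustration; the "amplified" variant is not shown because Normalize is not defined in the text.

```python
def subset_scores(gammas, received, subsets):
    """Compute Q_i = sum over j in S_i of gamma_j * r_j for each subset."""
    return [sum(gammas[j] * received[j] for j in s) for s in subsets]

def contains_watermark(gammas, received, subsets, threshold):
    """Declare the watermark present if more than half of the Q_i values
    exceed the threshold (a simple majority rule)."""
    scores = subset_scores(gammas, received, subsets)
    votes = sum(1 for q in scores if q > threshold)
    return votes > len(subsets) / 2

# Hypothetical data: the +/-1 values satisfy sum(gamma_j * M_j) = 0 on each subset.
coeffs = [3, 1, 2, 2, 1]
gammas = [1, -1, -1, -1, -1]
subsets = [(0, 1, 2), (0, 3, 4)]
marked = [m + g for m, g in zip(coeffs, gammas)]  # M_i + gamma_i

marked_scores = subset_scores(gammas, marked, subsets)    # [3, 3]
unmarked_scores = subset_scores(gammas, coeffs, subsets)  # [0, 0]
is_copy = contains_watermark(gammas, marked, subsets, threshold=1.5)
is_original_flagged = contains_watermark(gammas, coeffs, subsets, threshold=1.5)
```

In this toy case the scores separate cleanly. For an unaltered watermarked image rj = Mj + γj, so Qi equals the constraint value θi plus the sum of γj² over Si, which for ±1 perturbations and θi = 0 is simply the size of the subset. For the unmarked coefficients Qi reduces to θi alone.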
As a feature of our invention, the difficulty of effectively jamming the watermark (i.e., successfully altering it), and hence the security provided by the watermark, increases significantly as the number (K) of subsets increases. This number can be set to a value that imparts a desired level of security to the watermark consistent with computational resources then available to process each test image in order to detect the watermark.