This invention relates to steganography. More particularly, this invention relates to techniques for embedding a mark in a still image so that the mark is detectable by a printing system.
The appearance of commercial color photocopiers in the 1970's presented counterfeiters around the world with a powerful, widely accessible tool for creating passable reproductions of currency and other security documents such as treasury bills and airline tickets.
In the United States, this problem has been addressed with respect to currency counterfeiting through laws and import restrictions with the result that most color photocopiers have a circuit that applies simple feature recognition techniques to the image being photocopied to detect when a bill is being reproduced, and refuses to complete the task. The complexity of this task, its specificity to the features of a single type of bill, and the variations among different denominations all make this circuit easy to circumvent.
The proliferation of inexpensive color scanning and printing technology for personal computers in recent years has presented treasury departments with a new challenge. For example, an inexpensive system including a 720×720 DPI color ink-jet printer with a 300 DPI flatbed scanner can be used to create color reproductions that exceed the quality of color photocopiers costing more than a hundred times as much.
This development has brought about a need to enable a printing system including an ink-jet printer to discern when it is printing a security document. A requirement of any data embedded in the document to this end would be that it not adversely affect image quality. At the same time, the data should be decodable without extensive or expensive computational resources, since the goal would be ultimately to integrate the decoder into the printer itself. Also, for analysis, the bits should be detectable after digitizing by a flatbed scanner of typical consumer resolution, currently 600 DPI or less.
Enabling such an ink-jet printer to determine when it is processing a scanned version of a security document in a manner that would allow it to refuse to reproduce the document differs fundamentally from the analogous problem for a photocopier. An ink-jet printer handles data for only a small number of lines, corresponding to one or two traverses of the printing head, at one time. A consumer ink-jet typically prints a quarter-inch band across an 8.5-inch path length in a single pass. Ideally, any technique should require image data from only one pass at a time.
Furthermore, the scan-print sequence used in falsifying documents with these consumer devices subjects the document to be reproduced to nonlinear modifications not necessarily introduced by photocopying. Such modifications are first introduced into a document to be reproduced during the creation of its RGB representation during scanning. The resulting scanned image is characterized by one resolution and, generally, some translation and rotation with respect to the origin of the print field. This digitized image may then be intentionally modified by a counterfeiter intending to obscure any embedded marking using specialized software. Finally, further nonlinear modifications are introduced during printing by an ink-jet printer in the form of spatial resolution lost to dithering in order to enhance the color depth obtainable from the four to seven ink colors in its palette.
In terms of data hiding, this situation differs from the traditional information hiding problems. Typically for images, data hiding techniques are designed with the understanding that the quality of a test image might be largely degraded compared to the original unaltered host image in terms of signal-to-noise ratio through perceptual coding methods such as JPEG; that arbitrary resampling might have been done through scaling; and that cropping is a possibility. Most commercial systems also presuppose that a test image presented to the decoder has not been rotated with respect to the host image; often such systems require the test image to be untranslated as well. Furthermore, it is often assumed that the test image will be in a similar color/luminance space (RGB v. CMYK, for example) as the original host image.
By contrast, data hiding for preventing and detecting counterfeiting of security documents is constrained by an almost complementary set of circumstances. An offender is motivated to create a reproduction that looks as much as possible like a legitimate document before trying to pass it. Thus the quality of the reproduced image that would serve as a test image is usually excellent; the size and scale of the reproduction is fixed. On the other hand, there is no reason, from the point of view of a forger, not to print out a falsified document oriented 45 degrees from the paper's edges or at some arbitrary position on the page, especially if such a simple alteration will allow a fraud to escape detection by the printer.
The invention embeds a mark in a host image in a manner that allows its interpretation by a printing system that operates by processing image data in subsegments corresponding to less than the entire image, such as an ink-jet printer. Specifically, values of a characteristic parameter, such as luminance and/or chrominance, are altered in a portion of the host image confined to a thread, i.e. a region of contiguous points in the image, small enough to be included in the print space treated by the printer in a single pass of the printing head. (Note that as used herein, the term "pass" refers to the movement of the print head involved in printing one continuous band or region of the image, across the image, or a fraction thereof, even if the print head technically makes more than one traverse over this area, such as may occur with some interleaving techniques.) This configuration allows an inexpensive printer, for example, to be programmed to determine whether a specific mark has been encoded in a test image. Thus it can refuse or continue to print the image accordingly, without having specifically to recognize the document (for example, as a $20 bill) or its class (for example, as United States currency).
Preferably, the encoding is repeated in several threads in the image, in varying orientations, thereby minimizing the probability that detection of the mark will be circumvented simply by changing the orientation at which the bill is scanned or printed. The number of repetitions and their orientations necessary to maintain the integrity of the system depend on the geometry of the image, the width of the threads and the width of the printhead.
The invention is not limited to any particular encoding algorithm or internal thread substructure. Space-domain, spread-spectrum techniques, well known in the art, and the statistical approach ("Patchwork") outlined in U.S. Pat. No. 5,689,587, herein incorporated by reference, are two types of methods useful for documents such as are the targets of counterfeiters, owing to the lack of rescaling anticipated during illicit reproduction of these documents. However, virtually any technique compatible with the reduced accuracy of encoding that results from the small encoding area for an individual bit can be used. In particular, the technique should return all possible bit values with equal probability when analyzing an unencoded region. The properties of a host document will influence the optimum encoding algorithm for a given document.
For example, the engraving on a bill of United States currency effectively camouflages alterations introduced by spread-spectrum types of encoding techniques. Thus, in a preferred embodiment, a thread is an elongated area of the host image subdivided into several regions, in each one of which a single bit is encoded by altering characteristic parameter values using conventional one-dimensional direct-sequence spread-spectrum techniques, as are well known in the art. Such techniques incorporate the data in the pixel domain as a modulation on a carrier function which is also multiplied by a pseudo-random series and then added to the host image pixel parameter values. For example, in one technique of this type, the carrier function may be phase modulated, the value of the phase shift indicating the bit value.
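As a purely illustrative sketch of such a one-dimensional direct-sequence embedding, the following Python fragment adds a phase-modulated cosine carrier, multiplied by a pseudo-random series of +1/-1 chips, to the parameter values of one region. The function names, the key-seeded generator, the carrier period, the embedding depth and the mapping of bit values to phases 0 and pi are all assumptions made for the example; none is prescribed by the invention.

```python
import math
import random


def pn_sequence(key, length):
    """Return `length` pseudo-random chips, each +1 or -1, seeded by `key`."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed_bit(region, bit, key, depth=2.0, period=8):
    """Embed one bit in a 1-D run of parameter values (e.g. luminance).

    The carrier is a cosine whose phase (0 or pi, an assumed convention)
    carries the bit; it is multiplied chip by chip by the pseudo-random
    series and then added to the host values.
    """
    chips = pn_sequence(key, len(region))
    phase = 0.0 if bit == 0 else math.pi
    return [
        value + depth * chips[i] * math.cos(2 * math.pi * i / period + phase)
        for i, value in enumerate(region)
    ]


region = [128.0] * 64                  # flat stand-in for one region of a thread
marked0 = embed_bit(region, 0, key=7)  # region carrying bit value 0
marked1 = embed_bit(region, 1, key=7)  # same region carrying bit value 1
```

In this sketch no value moves by more than `depth`, and the two bit values produce equal and opposite perturbations of the host.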
On the other hand, Patchwork may be used to embed a mark comprised of several bits by alterations distributed throughout a thread having no internal microstructure, due to the orthogonality of bits embedded using Patchwork; or, regions, each encoding several bits, may be defined in the thread. In this case, the embedding is done by first randomly selecting a large number of locations in the thread, for example by associating locations in the thread with members of a pseudo-random number series. A subset of locations is allotted for each bit to be embedded, and the locations in each subset are partitioned into first and second groups. Then to encode one bit value, the host image is altered by increasing the values of the characteristic parameter at locations belonging to the first group and decreasing the values of the same parameter at locations belonging to the second group; to encode the other bit value, the first group parameter values are decreased and the second group parameter values are increased. The increment by which the parameter value at any location in the subset is altered may be adapted to minimize the visibility of the encoding; for example, alteration at some locations may be waived, effectively receiving an encoding depth of zero.
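The selection, allotment and partition steps just described might be sketched as follows, under several stated assumptions: the thread is represented as a flat list of parameter values, a key-seeded generator stands in for the pseudo-random number series, each bit receives the same even number of locations, and bit value 1 is arbitrarily taken to raise the first group. The names `patchwork_layout` and `patchwork_embed` are hypothetical, and the adaptive (possibly zero) encoding depth mentioned above is omitted for brevity.

```python
import random


def patchwork_layout(key, n_locations, n_bits, per_bit):
    """Select locations pseudo-randomly, allot a subset to each bit and
    partition each subset into first and second groups."""
    rng = random.Random(key)
    chosen = rng.sample(range(n_locations), n_bits * per_bit)
    half = per_bit // 2
    layout = []
    for b in range(n_bits):
        subset = chosen[b * per_bit:(b + 1) * per_bit]
        layout.append((subset[:half], subset[half:]))
    return layout


def patchwork_embed(thread, bits, key, per_bit=200, depth=2):
    """Return a copy of `thread` with `bits` embedded.

    Assumed convention: bit 1 raises the first group and lowers the second;
    bit 0 does the opposite.
    """
    out = list(thread)
    layout = patchwork_layout(key, len(thread), len(bits), per_bit)
    for bit, (first, second) in zip(bits, layout):
        step = depth if bit else -depth
        for i in first:
            out[i] += step
        for i in second:
            out[i] -= step
    return out


thread = [100] * 2000                       # stand-in parameter values
marked = patchwork_embed(thread, [1, 0, 1], key=42)
```

Because the two groups of each subset are equal in size here, the mean parameter value of the thread is unchanged by the embedding.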
Decoding entails determining whether or not a test image includes the embedded mark. A test area is defined in the test image, either by mapping onto the test image a domain having the same dimensions and, under ideal conditions, orientation as the thread defined in the encoded host image, or simply by the print line of a printer handling the test image. Sections corresponding to any regions identified in the host image during the embedding process may also be delineated within the test area. The parameter values of locations in the test area are processed so as to generate data which can be interpreted as a certainty level that the mark has been embedded. For example, to identify a mark embedded using a direct-sequence spread-spectrum technique, in each section the parameter values are multiplied by corresponding pseudo-random values, the carrier modulation is identified and the section is accordingly assigned a bit value and confidence level.
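A minimal sketch of this spread-spectrum decoding step, for one section, is given below. It assumes that the bit was embedded as a cosine carrier of known period with phase 0 or pi, multiplied by a key-seeded series of +1/-1 chips; the function names, the mean removal and the crude normalized score used as a confidence level are assumptions of the example rather than features of any particular decoder.

```python
import math
import random


def pn_sequence(key, length):
    """Regenerate the +1/-1 chips used at embedding time from the same key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def decode_bit(section, key, period=8):
    """Return (bit, score) for one section of the test area.

    The values are multiplied by the pseudo-random chips and correlated with
    the unshifted carrier; the sign of the correlation identifies the phase.
    """
    chips = pn_sequence(key, len(section))
    mean = sum(section) / len(section)      # remove the host's average level
    corr = sum(
        (value - mean) * chips[i] * math.cos(2 * math.pi * i / period)
        for i, value in enumerate(section)
    )
    bit = 0 if corr >= 0 else 1
    score = abs(corr) / len(section)        # crude confidence measure
    return bit, score


# Demo: a section carrying bit 1 under the assumed convention (phase pi).
chips = pn_sequence(7, 64)
section = [128.0 - 2.0 * chips[i] * math.cos(2 * math.pi * i / 8) for i in range(64)]
bit, score = decode_bit(section, key=7)
```

A practical decoder would also have to search over translation and rotation of the test area, which this fragment does not attempt.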
To read a mark embedded using a Patchwork technique, the selection, allotment and partition of locations generated during the embedding process are recreated in the test image, for example, by supplying a key specific to the bit string to a pseudo-random number generator and then applying the allotment and partition procedures. The decoder then calculates for each bit an experimental value of a test statistic, formulated to reflect the alterations to the thread associated with that bit, from the parameter values assessed at the allotted locations in the test image thread. Generally, the test statistic is equivalent to a linear combination of many instances of respective functions of the parameter values of locations belonging to the first and second groups, for example, the difference between the sums of the parameter values over the first and second group locations. For each bit, the experimental value of the test statistic is interpreted in terms of whether it indicates operation of the probability distribution function associated with one bit value or with the other.
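By way of illustration only, the recreation of the layout and the difference-of-sums statistic could be sketched as below. The flat-list representation of the thread, the key-seeded generator, the equal-sized groups, the helper names and the convention that a positive statistic indicates bit value 1 are all assumptions of the example.

```python
import random


def patchwork_layout(key, n_locations, n_bits, per_bit):
    """Recreate the selection, allotment and partition from the key."""
    rng = random.Random(key)
    chosen = rng.sample(range(n_locations), n_bits * per_bit)
    half = per_bit // 2
    return [
        (chosen[b * per_bit:b * per_bit + half],
         chosen[b * per_bit + half:(b + 1) * per_bit])
        for b in range(n_bits)
    ]


def patchwork_decode(thread, n_bits, key, per_bit=200):
    """Return one (bit, statistic) pair per embedded bit.

    The statistic is the sum over the first group minus the sum over the
    second group; its sign selects the bit value (assumed: positive means 1).
    """
    results = []
    for first, second in patchwork_layout(key, len(thread), n_bits, per_bit):
        stat = sum(thread[i] for i in first) - sum(thread[i] for i in second)
        results.append((1 if stat > 0 else 0, stat))
    return results


# Demo: mark a flat stand-in thread with bits 1, 0, 1 at depth 2, then decode.
test_thread = [100] * 2000
for bit, (first, second) in zip([1, 0, 1], patchwork_layout(42, 2000, 3, 200)):
    step = 2 if bit else -2
    for i in first:
        test_thread[i] += step
    for i in second:
        test_thread[i] -= step
decoded = patchwork_decode(test_thread, n_bits=3, key=42)
```

On a real image the statistic for an unencoded thread scatters around zero, so its magnitude relative to that scatter, rather than its sign alone, would supply the confidence level for each bit.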
In both cases, the resulting bit string in the test image is compared to the bit string known to be encoded in the host image. A binomial point distribution function can be calculated to indicate the overall likelihood that the test image has been embedded with the mark of interest. The decoder refuses to print or to continue printing the test image if the likelihood exceeds some predetermined threshold. The likelihood of encoding may be calculated after the entire thread has been decoded, or, preferably, the likelihood is determined periodically as decoding progresses based on the decoded portion of the bit string and its confidence level.
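One way to picture the likelihood calculation is sketched below, assuming that each bit decoded from an unencoded image matches the expected bit with probability one half, so that the number of chance matches is binomially distributed. The likelihood of encoding is taken, for the example, as one minus the probability of seeing at least the observed number of matches by chance; the function names and the threshold of 0.999 are hypothetical, per-bit confidence levels are ignored, and the effect of testing repeatedly as decoding progresses is not corrected for.

```python
from math import comb


def chance_probability(n_bits, n_matches):
    """Probability that at least n_matches of n_bits agree by chance,
    with each bit matching independently with probability 1/2."""
    return sum(comb(n_bits, k) for k in range(n_matches, n_bits + 1)) / 2 ** n_bits


def bits_until_refusal(decoded, expected, threshold=0.999):
    """Re-evaluate the likelihood after every decoded bit.

    Returns the number of bits processed when the likelihood of encoding
    first exceeds `threshold`, or None if it never does.
    """
    matches = 0
    for n, (d, e) in enumerate(zip(decoded, expected), start=1):
        matches += int(d == e)
        likelihood = 1.0 - chance_probability(n, matches)
        if likelihood > threshold:
            return n
    return None


mark = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]   # hypothetical 16-bit mark
stop_at = bits_until_refusal(mark, mark)                   # perfect agreement
never = bits_until_refusal([1 - b for b in mark], mark)    # no agreement at all
```

With perfect agreement this sketch stops after ten bits, since 1 - 2^-10 is the first value to exceed 0.999; with no agreement it never refuses.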
Although the mark embedded by the invention is generally characterized herein as a bit string, the invention is not limited to this mode. Even if the mark includes a string of several bits, for the purposes of this document, decoding or responding to the mark may in practice entail confirming the value of only one bit of the string, if the identification can be made to a sufficiently high certainty. If at any point the decoder has accumulated enough evidence of encoding of the mark to satisfy some predetermined certainty criterion, this process may be terminated, even if less than all of the available data associated with a single bit has been processed.
The invention is not limited to digital images. In addition to embedding by directly altering pixel values in a host image in electronic format, which can then be printed, the invention allows for the encoding to be independently generated and superposed onto an existing hardcopy document. A printing system configured to prevent printing of an encoded document may be controlled by a computer, cooperating with a printer containing the decoder; or the decoder may be integral to the printer. Or, the encoding may be incorporated into other methods of creating documents. For example, an engraving plate used in the production of a bill of United States currency could be fashioned so as to impose the parameter alterations embedding the mark. The mark would be detected during ink-jet printing of an illegitimate copy after scanning of the engraved original.
Thus, the invention provides methods for embedding and decoding marks in images to be printed, particularly suited for preventing and detecting counterfeiting of currency and other security documents (for example, treasury bills, stock certificates, bearer bonds) and identification documents (such as birth certificates, driver's licenses, passports, social security cards). In related aspects, the invention also provides an apparatus for embedding a mark in an image according to the method; an apparatus for determining whether a test image to be printed contains a mark embedded according to the method; an image created by embedding a mark in a host image according to the method; and a printing system for processing data representing a test image and optionally printing the test image according to whether the data contains a mark of interest.