The present invention relates generally to a processor-based technique in the field of information decoding, and, more particularly, to a technique for decoding digital data that has been encoded in an acquired color image in image regions that have patterns of color modulated subregions within them. The data may be encoded so that each image region has an overall visual appearance of an average color to the human viewer while the multi-colored subregions are substantially imperceptible, or not visually objectionable, and are simultaneously capable of detection by a digital image capture device for decoding purposes. The decoding operation does not require the original, unencoded color image for decoding.
Bar codes are a well-known category of document or image marking techniques whose primary goal is to densely encode digital information in a small image space without regard to how visible the encoded information is to a human viewer, and with the intent to reliably decode the information at a later time. Bar code images are typically attached to other objects and carry identifying information. U.S. Pat. No. 4,443,694, entitled "Multilevel Bar Code Reader," discloses a bar code reader for decoding a bar code using at least three levels of darkness. The bar code that encodes data consists of a plurality of bars, each of which has a particular level of darkness. The sequence of bars encodes a particular data string in a printed format. It is disclosed that in a particular embodiment of the invention, the transition from one darkness level to a second darkness level is indicative of the encoding of a predetermined value of a binary string. Each transition from bar to bar is translated into its appropriate dual set of bit strings to divulge the final binary string. In this embodiment five levels of darkness are utilized in the bar code, with each level having associated with it a certain degree of darkness including white, white-gray, gray, gray-black, and black.
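The transition-based scheme described above can be illustrated with a minimal sketch. The actual transition table of U.S. Pat. No. 4,443,694 is not reproduced here; the mapping below is a hypothetical one in which each bar must differ in darkness from its predecessor, so that each transition selects one of the four remaining levels and carries two bits. The names `LEVELS`, `encode`, and `decode` are illustrative assumptions.

```python
# Five darkness levels, indexed 0 (lightest) to 4 (darkest).
LEVELS = ["white", "white-gray", "gray", "gray-black", "black"]


def _choices(previous):
    """The four levels a bar may take when the preceding bar is `previous`."""
    return [level for level in range(len(LEVELS)) if level != previous]


def encode(bits, start=0):
    """Map a bit string to bar darkness levels (hypothetical transition table)."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    bars = [start]
    for i in range(0, len(bits), 2):
        value = int(bits[i:i + 2], 2)           # two bits per transition
        bars.append(_choices(bars[-1])[value])  # pick one of four other levels
    return bars


def decode(bars):
    """Recover the bit string from the sequence of darkness transitions."""
    out = []
    for previous, current in zip(bars, bars[1:]):
        if previous == current:
            raise ValueError("adjacent bars must differ in darkness")
        out.append(format(_choices(previous).index(current), "02b"))
    return "".join(out)


bars = encode("0110")
print([LEVELS[b] for b in bars], decode(bars))
```

Because the data is carried by the change between adjacent bars rather than by absolute darkness, a decoder of this kind only needs to rank each bar relative to its neighbor.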
U.S. Pat. No. 5,619,026, entitled "Grayscale Barcode Reading Apparatus System Including Translating Device for Translating a Pattern Image into a Sequence of Bar Widths and Transition Directions," discloses a system for verifying an object of interest that includes a grayscale one-dimensional bar pattern coupled to the object. The grayscale pattern includes vertical stripes of varying brightness and width, and is disclosed as being a hidden pattern. It is disclosed that the use of grayscale bar codes differs from standard practice which uses binary patterns. Decoding relies on detecting distinct transitions between gray scales at the edges of the bars.
Two-dimensional (2D) bar codes encode data in both the height and width of an encoded bar code image, and so store considerably more information than a linear, one-dimensional (1D) bar code. It is estimated that over a thousand alphanumeric characters can be placed in a single 2D bar code symbol the size of a large postage stamp. 2D bar codes typically have one of two designs: a stacked, or multi-row linear bar code, and a matrix or dot type bar code. The matrix type of 2D bar code is usually square and made up of a grid of small square cells which can be black or white. PDF417 is an example of a stacked 2D bar code. PDF417 has a very high data capacity and density: each symbol can code up to 2725 bytes of data at a density of about 300-500 bytes per square inch. DataMatrix is an example of a 2D matrix-type bar code that contains up to 2,335 characters per symbol. Symbols typically have 400-500 characters per square inch. Maxicode is an example of a dot-type 2D bar code that uses 888 data-carrying circular cells arranged around a bullseye; approximately 100 alphanumeric characters can be encoded in a square inch. Additional information on 2D bar codes may be found, for example, in an article by Johnston and Yap entitled "Two Dimensional Bar Code as a Medium for Electronic Data Interchange," Monash University (Clayton, Victoria) available as of the date of filing at http://www.bs.monash.edu.au/staff/johno/BARCOPAW.html.
In an article entitled "A Flexibly Configurable 2D Bar Code", available as of the date of filing at http://www.paperdisk.com/ibippapr.htm, Antognini and Antognini disclose a 2D symbol technology called PaperDisk™ that represents data by means of what is termed a "spot" or "cell". A spot is a typically rectangular array of dots, or printed pixels, laid down by a printer to represent a bit being "on". It is separated from adjoining spots (or places they might occupy) by designated vertical and horizontal distances. These distances are measured in terms of (typically) integral numbers of dots. A cell is a region allocated to a given potential spot. That is, it includes the spot itself (where the bit value calls for a spot) and extends halfway to the edges of neighboring potential spots. Clocking features, called "markers," are rectangular arrays of dots arranged in vertical strips throughout a pattern. All encoded data plus landmarks and meta-information about the encoded information are collectively referred to as a data tile. Decoding proceeds by first finding a landmark, from which a preliminary estimate can be made of the scale and orientation of the features in the image, with the goal of finding the meta-information. When the meta-information is found it is decoded to produce data format parameter values for the data portion that follows. FIG. 2 illustrates a full data tile as a black and white image of a large number of small, rectangular dark marks. It would appear, then, from the disclosure that the PaperDisk™ technology is intended to produce an encoded image in which the encoded data is visible to a human viewer.
There is also a collection of document or image marking techniques that have as their primary goal to embed encoded information in an image so that it is substantially imperceptible to a human viewer, in a manner that simultaneously minimizes image distortion caused by embedding the information and permits reliable decoding of the information at a later time in the image life cycle. These techniques, which may be relevant to an embodiment of the present invention, often have design goals that can be generally categorized in terms of three main factors: how much data (i.e., the density) is encoded in the image; how robust the encoded data is to image manipulation such as printing, scanning, rotation, scaling and compression; and how much perceptible change is produced in an original image by adding the encoded data. The intended purpose or function of the encoded data in an image generally determines which one or combination of the three factors is the most important goal of a data encoding technique, and necessarily influences the design and technical details of the technique used. Another factor that is also sometimes taken into consideration when designing an image marking technique is whether the image in which data is to be encoded is a text, or document, image, or a graphic or photographic image.
A particularly well-known area of image marking is known as digital watermarking, which is typically applied to a graphic or photographic image. A successful digital watermarking technique is concerned with the factors of robustness and minimizing image changes, and so is designed to simultaneously produce an embedded signal that is imperceptible to a human viewer so as not to diminish the commercial quality and value of the image being watermarked, while also producing an embedded signal that is resistant to tampering, since removal of the embedded signal defeats the identification purpose of watermarking. A successful watermarking technique is typically designed so that attempts to remove the embedded signal cause degradation of the image sufficient to render it commercially less valuable or worthless. Because the factors of minimizing image change and encoded data robustness are so crucial to successful digital watermarking techniques, the goal of achieving a high data density rate is typically sacrificed in these techniques.
PCT International Application WO 95/14289 discloses a signal encoding technique in which an identification code signal is impressed on a carrier to be identified (such as an electronic data signal or a physical medium) in a manner that permits the identification signal later to be discerned and the carrier thereby identified. The method and apparatus are characterized by robustness despite degradation of the encoded carrier, and by holographic permeation of the identification signal throughout the carrier. The embedding of an imperceptible identification code throughout a source signal is achieved by modulating the source signal with a small noise signal in a coded fashion; bits of a binary identification code are referenced, one at a time, to control modulation of the source signal with the noise signal. In a disclosed preferred embodiment, an N-bit identification word is embedded in an original image by generating N independent random encoding images, one for each bit of the N-bit identification word, applying a mid-spatial-frequency filter to each independent random encoding image to remove the lower and higher frequencies, and adding together all of the filtered random images that have a "1" in their corresponding bit value of the N-bit identification word; the resulting image is the composite embedded signal. The composite embedded signal is then added to the original image using a formula (Equations 2 and 3) that is based on the square root of the innate brightness value of a pixel. Varying certain empirical parameters in the formula allows for visual experimentation in adding the composite identification signal to the original image to achieve a resulting marked image, which includes the composite identification signal as added noise, that is acceptably close to the original image in an aesthetic sense.
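The construction of the composite embedded signal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of WO 95/14289: the difference of two box blurs is merely a stand-in for the mid-spatial-frequency filter, the uniform noise source and the names `box_blur`, `band_pass`, and `composite_signal` are hypothetical, and the brightness-dependent addition to the original image (Equations 2 and 3 of the application) is omitted because its exact form is not given here.

```python
import random


def box_blur(img, radius):
    """Average each pixel over a (2*radius+1) square window, clamped at edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def band_pass(img):
    """Stand-in mid-frequency filter: narrow blur minus wide blur."""
    narrow = box_blur(img, 1)  # suppresses the highest frequencies
    wide = box_blur(img, 3)    # estimates the low-frequency content
    return [[n - w for n, w in zip(rn, rw)] for rn, rw in zip(narrow, wide)]


def composite_signal(id_bits, height, width, seed=0):
    """Sum the filtered random images whose identification bit is 1."""
    rng = random.Random(seed)
    total = [[0.0] * width for _ in range(height)]
    for bit in id_bits:
        # One random image per bit position, generated whether or not it is
        # used, so a given position always maps to the same image for a seed.
        noise = [[rng.uniform(-1.0, 1.0) for _ in range(width)]
                 for _ in range(height)]
        if bit == 1:
            filtered = band_pass(noise)
            for y in range(height):
                for x in range(width):
                    total[y][x] += filtered[y][x]
    return total


signal = composite_signal([1, 0, 1, 1], height=8, width=8, seed=42)
```

A decoder that knows the seed could regenerate the same per-bit images and test each one for its presence in a marked image; that step is outside the scope of this sketch.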
Cox, Kilian, Leighton and Shamoon, in NEC Research Institute Technical Report No. 95-10 entitled "Secure Spread Spectrum Watermarking for Multimedia," disclose a frequency domain digital watermarking technique for use in audio, image, video and multimedia data which views the frequency domain of the data (image or sound) signal to be watermarked as a communication channel, and correspondingly, views the watermark as a signal that is transmitted through it. In particular with respect to watermarking an N×N black and white image, the technique first computes the N×N DCT of the image to be watermarked; then a perceptual mask is computed that highlights the perceptually significant regions in the spectrum that can support the watermark without affecting perceptual fidelity. Each coefficient in the frequency domain has a perceptual capacity defined as a quantity of additional information that can be added without any (or with minimal) impact to the perceptual fidelity of the data. The watermark is placed into the n highest magnitude coefficients of the transform matrix excluding the DC component. For most images, these coefficients will be the ones corresponding to the low frequencies. The precise magnitude of the added watermark signal is controlled by one or more scaling parameters that appear to be empirically determined. Cox et al. note that to determine the perceptual capacity of each frequency, one can use models for the appropriate perceptual system or simple experimentation, and that further refinement of the method would identify the perceptually significant components based on an analysis of the image and the human perceptual system. Cox et al. also provide what appears to be a detailed survey of previous work in digital watermarking.
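The coefficient-selection step described above can be sketched as follows. The sketch assumes the DCT has already been computed and flattened into a list with the DC term at index 0. The multiplicative update v' = v(1 + αx), the Gaussian watermark values, and the names `embed`, `alpha`, and `seed` are assumptions made for illustration; the description above states only that one or more scaling parameters control the magnitude of the added signal.

```python
import random


def embed(coeffs, n, alpha=0.1, seed=0):
    """Place a watermark in the n largest-magnitude non-DC coefficients.

    coeffs: flat list of DCT coefficients, DC term at index 0 (assumed layout).
    Returns the marked coefficients, the chosen indices, and the watermark.
    """
    rng = random.Random(seed)
    # Rank every coefficient except the DC term by magnitude, largest first.
    order = sorted(range(1, len(coeffs)),
                   key=lambda i: abs(coeffs[i]), reverse=True)
    chosen = order[:n]
    watermark = [rng.gauss(0.0, 1.0) for _ in chosen]
    marked = list(coeffs)
    for index, x in zip(chosen, watermark):
        # Assumed update rule: scale each selected coefficient by (1 + alpha*x).
        marked[index] = coeffs[index] * (1.0 + alpha * x)
    return marked, chosen, watermark


marked, chosen, watermark = embed([100.0, 5.0, -40.0, 1.0, 30.0], n=2)
print(chosen)
```

Scaling the change by the coefficient's own magnitude means that large coefficients, which can tolerate a larger perturbation, receive a proportionally larger share of the watermark, with `alpha` as the empirically chosen strength.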
U.S. Pat. No. 5,369,261, entitled "Multi-color Information Encoding System," discloses an exceptionally dense information encoding system that employs colored areas in the forms of bars or checkerboard matrices of colored dot regions to encode information, with each colored region being variable as to both color and intensity. In one embodiment, "super-pixel" dots have differently colored sub-regions within them, arranged with side-by-side colors, or with colored regions stacked one on top of the other, such that information from one dot has as many color variables as there are stacked layers or mixed colors. For each color there are as many as 64 intensities yielding a coding system of high information density. For decoding purposes, the various colors are read out at one super pixel dot position by dividing out reflected or transmitted energy from a dot by color filtering such that a color and intensity can be detected for each color intensity within the super pixel dot. The code provided by this invention is substantially invisible to the naked eye.
Data glyph technology is a category of embedded encoded information that is particularly advantageous for use in image applications that require a high density rate of embedded data and require the embedded data to be robust for decoding purposes. However, data glyph encoding produces perceptible image changes, although in particular types of images these changes can be minimized so as to be inconspicuous, or even surreptitious. Data glyph technology encodes digital information in the form of binary 1's and 0's that are then rendered in the form of distinguishable shaped marks such as very small linear marks. Generally, each small mark represents a digit of binary data; whether the particular digit is a digital 1 or 0 depends on the linear orientation of the particular mark. For example, in one embodiment, marks that are oriented from top left to bottom right may represent a 0, while marks oriented from bottom left to top right may represent a 1. The individual marks are of such a small size relative to the maximum resolution of a black and white printing device as to produce an overall visual effect to a casual observer of a uniformly gray halftone area when a large number of such marks are printed together in a black and white image on paper; when incorporated in an image border or graphic, this uniformly gray halftone area does not explicitly suggest that embedded data is present in the document. A viewer of the image could perhaps detect by very close scrutiny that the small dots forming the gray halftone area are a series of small marks that together bear binary information. The uniformly gray halftone area may already be an element of the image, or it may be added to the image in the form of a border, a logo, or some other image element suitable to the nature of the document.
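The orientation-based encoding in the example embodiment above can be illustrated with text characters standing in for printed marks. This is a sketch only: a backslash represents a mark running from top left to bottom right (a 0) and a forward slash represents a mark running from bottom left to top right (a 1). The function names are hypothetical, and real glyph codes add clocking, synchronization, and error correction that are not shown.

```python
# Character stand-ins for the two mark orientations in the example embodiment:
# "\" (top left to bottom right) = 0, "/" (bottom left to top right) = 1.
ZERO_MARK, ONE_MARK = "\\", "/"


def to_glyphs(data: bytes) -> str:
    """Render each bit of `data` as one oriented mark."""
    bits = "".join(format(byte, "08b") for byte in data)
    return "".join(ONE_MARK if bit == "1" else ZERO_MARK for bit in bits)


def from_glyphs(glyphs: str) -> bytes:
    """Read the marks back into bytes, one bit per mark."""
    if len(glyphs) % 8:
        raise ValueError("glyph count must be a multiple of 8")
    bits = ""
    for mark in glyphs:
        if mark not in (ZERO_MARK, ONE_MARK):
            raise ValueError(f"unrecognized mark: {mark!r}")
        bits += "1" if mark == ONE_MARK else "0"
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(to_glyphs(b"Hi"))
```

Printed at sufficiently small size, a field of such marks has the same ink coverage regardless of the data, which is why it reads as a uniform gray to a casual observer.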
Examples of U.S. Patents on data glyph technology are U.S. Pat. Nos. 5,221,833, 5,245,165, and 5,315,098. U.S. Pat. No. 5,221,833, entitled "Methods and Means for Reducing Error Rates in Reading Self-Clocking Glyph Codes", discloses a method for encoding n-bit long multi-bit digital values in a pre-ordered cyclical sequence based on their analytically or empirically determined probabilities of being confused with each other, such that each glyph is adjacent in that sequence to the two glyphs with which it is more likely to be confused during decoding. U.S. Pat. No. 5,245,165, entitled "Self-Clocking Glyph Code for Encoding Dual Bit Digital Values Robustly", discloses a method for encoding dual bit digital values in the cardinal rotations (0°, 90°, 180° and 270°) of a logically ordered sequence of wedge-shaped glyphs (essentially right triangles) that are written, printed or otherwise recorded on a hardcopy recording medium with a predetermined spatial formatting rule. The widths of the glyphs vary unidirectionally as a function of their height, so they can be decoded reliably, even when they are degraded by scan errors, dropped scan lines and/or random noise patterns. U.S. Pat. No. 5,315,098, entitled "Methods and Means for Embedding Machine Readable Digital Data in Halftone Images," discloses techniques for encoding digital data in the angular orientation of circularly asymmetric halftone dot patterns that are written into the halftone cells of digital halftone images.
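The dual-bit encoding in cardinal rotations described above for U.S. Pat. No. 5,245,165 can be illustrated with a short sketch. Which two-bit value corresponds to which rotation is not stated here, so the assignment below (00, 01, 10, 11 to 0°, 90°, 180°, 270°) is a hypothetical one, as are the function names.

```python
ROTATIONS = (0, 90, 180, 270)  # cardinal rotations of the wedge glyph, degrees


def bits_to_rotations(bits: str) -> list:
    """Map each pair of bits to one glyph rotation (assumed assignment)."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    return [ROTATIONS[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2)]


def rotations_to_bits(rotations) -> str:
    """Recover two bits from each glyph rotation."""
    return "".join(format(ROTATIONS.index(r % 360), "02b") for r in rotations)


print(bits_to_rotations("00011011"))
```

Each glyph thus carries two bits rather than one, doubling the data density of a two-orientation code at the same glyph pitch.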
Commonly assigned U.S. Pat. No. 5,684,885, entitled "Binary Glyph Codes Based on Color Relationships," (hereafter, "the '885 patent") discloses a technique that may be used to encode information in a color image. The technique renders binary digital data on a surface, such as a sheet of paper, by printing a series of color patches on the sheet, with the 1 bits rendered as color patches of a first color and the 0 bits rendered as color patches of the second color. The color patches are arranged in a predetermined order along an axis on the surface. The second color relates to the first color by a fixed relationship in color space. In an illustrated embodiment, the first and second colors are a scalar α distance away from an average color along a vector v₀ in color space. A relatively large area of intermingled color patches for the first color and the second color will optically combine to appear, from a distance, to be a single third color. The color patches can be intermixed with areas of a third color, the third color representing an average in color space of the first color and the second color. When these color patches of two different colors are imperceptible to a human, the information they represent becomes invisibly encoded in the image. In the illustrated embodiments in the '521 application, it is noted that, in the choice of orientation of the vector v₀ and the extent of the scalar α that are used to determine the two colors that are used to produce the color patches, it is desirable to balance the accuracy and sensitivity of the marking device (e.g., a printer) and the digital image capture device (e.g., a scanner) being used with the sensitivity of the human eye. It is desirable for the deviation between the two colors to be maximally detectable by a scanner and minimally detectable by the human eye.
When an average color of the two colors is rendered on the page and is visible to the human eye, the average color should be deemed merely the carrier of information, with the color deviations of neighboring color patches being a modulation of the carrier.
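The carrier-and-modulation relationship described above can be sketched numerically. The sketch assumes colors are three-component tuples in some color space, that the modulation direction is a unit vector, and that the 1 bit lies in the positive direction along it; the sign convention and the names `patch_colors` and `decode_patch` are assumptions made for illustration, with `v0` and `alpha` standing for the vector and scalar of the illustrated embodiment.

```python
def patch_colors(average, v0, alpha):
    """Return (color_for_1, color_for_0), placed symmetrically about `average`.

    average: carrier color; v0: unit direction in color space;
    alpha: distance of each patch color from the carrier.
    """
    one = tuple(a + alpha * v for a, v in zip(average, v0))
    zero = tuple(a - alpha * v for a, v in zip(average, v0))
    return one, zero


def decode_patch(color, average, v0):
    """Classify a scanned patch by the sign of its deviation along v0."""
    projection = sum((c - a) * v for c, a, v in zip(color, average, v0))
    return 1 if projection > 0 else 0


carrier = (128.0, 128.0, 128.0)
direction = (0.0, 0.6, -0.8)   # a unit vector, chosen arbitrarily here
one, zero = patch_colors(carrier, direction, alpha=10.0)
print(one, zero, decode_patch(one, carrier, direction))
```

Note that `decode_patch` needs both the carrier color and the modulation direction, which reflects the requirement that the decoder establish the color space relationship between the two patch colors before any bit can be read.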
The '885 patent further proposes, at col. 5, that the information encoding technique therein be used in conjunction with the data glyph technology described above. Each individual color patch could be printed in the form of a geometric data glyph in which the shape or linear orientation of each glyph determines whether it is assigned a 1 or 0 value, and the different shading of each glyph, representing one of two different colors, determines whether the color is assigned a 1 or 0 value. The relative colors of neighboring glyphs are independent of the orientations of neighboring glyphs, and so the geometry-based glyphs can be used to render one set of data, while the relative colors of neighboring glyphs render another set of digital data.
Some types of 2D bar code technology encode data at a high density rate but none are intended to produce encoded data that is substantially imperceptible in an encoded image. Data glyph technology, which also supports a high data density encoding rate, is also not designed to produce encoded data that is substantially imperceptible in an encoded image, although data glyphs may happen to be very unobtrusive in an encoded image as a result of where they are placed. The technology disclosed in the '885 patent requires that the differently colored patches produce an average color that effectively hides them from view; in order to decode the message value in a color patch of a first color, it is necessary to determine the second color used to encode a different data value, and also to determine the average color of the image region in which data is encoded in order to establish the color space relationship between the two colors.