Watermarking is the process of embedding information into a signal without impacting its primary functionality. The signal may be audio, an image, or video, for example. Image watermarks are further classified as perceptible (visible to the naked eye) or imperceptible, and the choice between the two is motivated by the application. See generally: I. J. Cox, M. L. Miller, and J. A. Bloom, Digital Watermarking, Morgan Kaufmann (2001). Multimedia data hiding has been an active area of research, with several emerging applications in content authentication, anti-piracy data hiding, and digital rights management.
Hardcopy watermarking, i.e. data hiding in images that must survive the print-scan (digital-analog-digital) process, has received significant attention recently. Printed media comprises an increasing portion of official, legal and transactional data such as IDs, passports, contracts, and the like, which makes the problem one of considerable economic interest. Particular motivating applications include: identifying where a hardcopy document originated; document integrity verification; copy protection, which ensures that copies of an original hardcopy are distinguishable and identifiable as such; copyright enforcement methods for identifying copyright ownership; and auxiliary data embedding such as web URLs, product UPC codes, etc. Hardcopy data hiding is distinguished from other multimedia data hiding schemes by the presence of a print-scan process, and hardcopy data hiding schemes must therefore adapt to the characteristics of the print-scan distortion channel.
One key component of the printing process is a bit-depth reduction step called digital halftoning, which reduces the original continuous-tone image, typically 8 bits/pixel, to a 1 bit/pixel binary image. Halftoning aims to produce an illusion of continuous tone by trading off amplitude resolution for spatial resolution. Algorithms for digital halftoning can generally be classified into three categories: 1) point processes, or screening methods; 2) neighborhood processes, or error diffusion; and 3) iterative or search-based methods. Of these, screening (particularly clustered-dot) is dominant in xerographic printers, while error diffusion is used rather extensively in inkjet printers. Once printed, the image transitions to the analog domain and can be re-digitized by scanning. For hardcopy data hiding, it should be noted that the input to the physical print-scan channel is a binary or halftone image, and hence the distortions are tied to the nature of the halftoning algorithm employed.
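The first two halftoning categories described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the 4x4 threshold matrix CLUSTERED_DOT_4X4 is a hypothetical clustered-dot screen, the function names are invented here, and Floyd-Steinberg weights are used as one well-known example of error diffusion. It is not the halftoning used by any particular printer.

```python
# Hypothetical 4x4 clustered-dot threshold ranks (0..15). The lowest ranks sit
# in the middle of the cell, so the printed dot grows outward from the center.
CLUSTERED_DOT_4X4 = [
    [12,  5,  6, 13],
    [ 4,  0,  1,  7],
    [11,  3,  2,  8],
    [15, 10,  9, 14],
]


def screen_halftone(image, matrix=CLUSTERED_DOT_4X4):
    """Point process: compare each pixel with a tiled threshold matrix.

    image is a list of rows of 8-bit gray values (0 = black, 255 = white).
    Returns a list of rows of bits (1 = ink dot, 0 = bare paper).
    """
    n = len(matrix)
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, pixel in enumerate(row):
            darkness = (255 - pixel) / 255              # 0.0 = white, 1.0 = black
            threshold = (matrix[y % n][x % n] + 0.5) / (n * n)
            out_row.append(1 if darkness > threshold else 0)
        out.append(out_row)
    return out


def error_diffusion_halftone(image):
    """Neighborhood process: Floyd-Steinberg error diffusion (assumed weights)."""
    h, w = len(image), len(image[0])
    buf = [[(255 - p) / 255 for p in row] for row in image]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = buf[y][x]
            new = 1 if old >= 0.5 else 0
            out[y][x] = new
            err = old - new                              # quantization error
            # Push the error onto unprocessed neighbors; error that would
            # fall outside the image is simply dropped.
            if x + 1 < w:
                buf[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    buf[y + 1][x - 1] += err * 3 / 16
                buf[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    buf[y + 1][x + 1] += err * 1 / 16
    return out


# A flat mid-gray patch becomes one clustered dot covering half of the cell.
mid_gray = [[128] * 4 for _ in range(4)]
for row in screen_halftone(mid_gray):
    print(row)
# [0, 1, 1, 0]
# [1, 1, 1, 1]
# [0, 1, 1, 0]
# [0, 0, 0, 0]
```

In the screening function each output bit depends only on its own pixel and position, whereas error diffusion carries the quantization error into neighboring pixels, which is why the two families distort differently in the print-scan channel.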
Methods for data embedding in hardcopy images may be grouped into two categories. The first corresponds to robust embedding methods that are intended to survive printing and scanning but do not directly exploit characteristics of the printing process during embedding. The second corresponds to techniques that use the particular characteristics of the printing process (e.g. halftoning) for embedding. Exploiting this specific knowledge typically offers greater potential for embedding and is better suited to hardcopy applications, since the embedding occurs just prior to printing. The methods in the first category are limited by their lack of knowledge of the halftone print process. From a communications viewpoint, this adversely affects detection accuracy because the embedding does not adapt to the characteristics of the print-scan channel. Methods in the second category often require a manual visual detection/inspection process, e.g. overlaying a pre-designed pattern directly on the hardcopy print. While this is useful for some applications such as document authentication, a large class of applications, such as meta-data embedding, document tracking in workflows, and secure hiding in adversarial scenarios, is better enabled by automated data recovery. Methods that allow automated detection tend to perform embedding by modulating binary halftone outputs with pre-determined message signals to produce the watermarked image containing the embedded data. This can cause distortions in the output image even at modest embedding rates unless sufficient care is taken in the design of the embedding method.
Hardcopy data embedding techniques that allow automated extraction of the embedded data without involving a human observer are a key enabler for a wide variety of document security applications. Methods developed for this purpose can be categorized as data encoding or data hiding approaches. For methods in the former category, the data is embedded in a region of the printed page that is dedicated solely to conveying the information, the visual appearance of the encoded region being only a secondary concern. One- and two-dimensional bar codes are the predominant representatives of techniques in this class. Adaptations of these methods are also utilized for limited image rendering applications.
Clustered-dot halftones pose significant challenges for data hiding methods. The requirement for a faithful reproduction of the original cover image can conflict irreconcilably with the ability to introduce a distinguishable change in the printed image for the purpose of embedding data. As a specific example, in regions where the cover image is white, the desired rendering is white (a region free of any printed dots), and in that situation no discernible change can be incorporated. A similar conflict arises in regions where the cover image is black. In these scenarios, as is desirable for data hiding methods, embedding robustness must be relaxed in favor of cover image fidelity. Methods for data hiding in hardcopy image prints must carry the embedded data in a manner that minimally disrupts image content while still allowing data recovery at the decoding end. These can be challenging constraints.
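The white-region and black-region conflict described above can be made concrete with a small Python sketch. It rests on an assumption introduced here purely for illustration: a halftone cell is treated as able to carry a change only if it contains both ink and paper pixels, so that a dot could be reshaped without altering the amount of ink. The function name count_modifiable_cells and the 4x4 cell size are hypothetical and are not taken from any specific embedding scheme.

```python
def count_modifiable_cells(halftone, cell=4):
    """Count halftone cells that contain both ink (1) and paper (0) pixels.

    halftone is a list of rows of bits. Only whole cell x cell tiles are
    examined. Returns (modifiable_cells, total_cells). "Modifiable" is an
    illustrative proxy, not a measure defined by any particular method.
    """
    h, w = len(halftone), len(halftone[0])
    modifiable = total = 0
    for top in range(0, h - cell + 1, cell):
        for left in range(0, w - cell + 1, cell):
            pixels = [halftone[top + dy][left + dx]
                      for dy in range(cell) for dx in range(cell)]
            total += 1
            ink = sum(pixels)
            # All-paper (white) and all-ink (black) cells leave no room to
            # reshape a dot without visibly changing the rendered tone.
            if 0 < ink < len(pixels):
                modifiable += 1
    return modifiable, total


# Three hand-built cells: white region, mid-tone clustered dot, black region.
white = [[0] * 4 for _ in range(4)]
midtone = [[0, 1, 1, 0],
           [1, 1, 1, 1],
           [0, 1, 1, 0],
           [0, 0, 0, 0]]
black = [[1] * 4 for _ in range(4)]

# Place the three cells side by side to form a 4 x 12 halftone strip.
strip = [white[r] + midtone[r] + black[r] for r in range(4)]
print(count_modifiable_cells(strip))  # (1, 3)
```

Under this assumption only the mid-tone cell offers any room for embedding, which is the sense in which fidelity to saturated white or black regions limits where data can be carried.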
Accordingly, what is needed in this art are increasingly sophisticated systems and methods for high capacity binarized message data embedding (and decoding) in a contone image which address the above-discussed problems in this art while achieving high embedding data rates.