The present invention relates generally to image processing and information encoding and, more particularly, to encoding information within printed images or text using differences in gray or color levels which are imperceptible to the human eye.
Steganography is the art and science of communicating in a way that hides the existence of the communication. In contrast to cryptography, which encrypts or encodes a message to hide its meaning, the goal of steganography is to hide a second message within a first, otherwise harmless message.
The word steganography derives from Greek and literally means "covered writing." It includes a vast array of methods and variations that have been used throughout history to conceal information and the very existence of a message. For example, drawings have often been used to conceal or reveal information. It is simple to encode a message by varying lines, colors or other elements in pictures. With the advent of the computer, the electronic printer and the ability to process and manipulate images and data, such methods have been taken to new dimensions.
Plain paper has long been a favored recording medium for storing and transferring human readable information. In fact, it has recently been said that paper is one of the most promising media types for new computer applications. Even given the emergence of digital-based electronic communications, such as the world wide web, paper-based communication has kept pace with digital information. Electronic document processing systems have enhanced the functional utility of plain paper and other types of hardcopy documents by enabling the application of machine readable digital data thereon. This machine readable data enables the hardcopy document to actively interact with such a document processing system in a variety of different ways when the document is scanned into the system by an ordinary input scanner. See, for example, the copending U.S. patent application of Paul Jeran and Terry Mahoney (identified above) entitled "Methods of Document Management and Automated Document Tracking, and a Document Management System." Jeran et al. discloses a document management system wherein a printing device is configured to print text on a document as well as to automatically print machine-readable code on the document. The document management system also includes a scanning device configured to scan documents and extract at least some information from the machine-readable code, the information thus extracted being used to manage or control the use, distribution or the like of the document.
As a general rule, digital data is recorded by writing two-dimensional marks on a recording medium in accordance with a pattern which encodes the data either by the presence or absence of marks at a sequence of spatial locations or by the presence or absence of mark-related transitions at such locations. When the recording medium is paper, the writing is accomplished by a printing device resulting in printed text or other images on the surface of the paper which visually communicates the information to the user.
It is known to embed machine-readable markings on emulsion films, photographic papers and the like to provide some control and management of photographs and the like produced on those media. See, for example, U.S. Pat. No. 5,822,436 granted to Geoffrey B. Rhoads on Oct. 13, 1998 and entitled, "Photographic Products and Methods Employing Embedded Information." More recently, steganographic software for personal computers and workstations has become available. Such software enables information to be hidden in graphic, sound and apparently "blank" media. For example, in a color image with 256 levels per primary color, each primary color is represented by 1 byte (8 bits). Information can be stored in the least significant bit of each byte without changing the appearance of the original image to the human eye.
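The least-significant-bit technique mentioned above can be illustrated with a minimal Python sketch. The function names `embed_lsb` and `extract_lsb` and the sample byte values are hypothetical, chosen only for illustration; they are not taken from any particular steganographic software.

```python
def embed_lsb(channel_bytes: bytes, bits: list[int]) -> bytes:
    """Store one message bit in the least significant bit of each byte."""
    if len(bits) > len(channel_bytes):
        raise ValueError("message is longer than the available bytes")
    out = bytearray(channel_bytes)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        out[i] = (out[i] & 0xFE) | (bit & 1)
    return bytes(out)


def extract_lsb(channel_bytes: bytes, count: int) -> list[int]:
    """Read back the first `count` message bits."""
    return [b & 1 for b in channel_bytes[:count]]


# Hypothetical 8-bit color values, one byte per primary color sample.
pixels = bytes([200, 201, 54, 55, 128, 129, 0, 255])
message_bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(pixels, message_bits)
recovered = extract_lsb(stego, len(message_bits))
```

Each byte changes by at most one level out of 256, which is why the altered image looks the same to the eye.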
Prior art methods for encoding machine-readable information on a paper document include providing markings on the document comprising a plurality of cells wherein the information is encoded by mapping binary data to differing gray scale levels. While this method is effective in providing machine-readable information, the presence, if not the meaning, of the information is typically discernible to the user and it requires the use of at least some surface area which could otherwise be used for text or other images. It would therefore be desirable to develop methods of providing machine-readable information which is not visually perceptible to the human eye and does not require the use of additional media surface area.
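As a rough illustration of the cell-based prior art approach, the following Python sketch maps pairs of bits to one of four gray levels per cell and decodes by choosing the nearest level. The choice of four levels, two bits per cell, and the names `LEVELS`, `bits_to_cells` and `cells_to_bits` are assumptions made for this example, not details of any cited method.

```python
# Assumed palette: four evenly spaced 8-bit gray levels, one per 2-bit value.
LEVELS = [0, 85, 170, 255]


def bits_to_cells(bits: list[int]) -> list[int]:
    """Map each pair of bits to the gray level of one cell."""
    if len(bits) % 2:
        raise ValueError("bit count must be even")
    return [LEVELS[(bits[i] << 1) | bits[i + 1]] for i in range(0, len(bits), 2)]


def cells_to_bits(cells: list[int]) -> list[int]:
    """Recover bits from scanned cell levels by picking the nearest palette level."""
    bits = []
    for level in cells:
        index = min(range(len(LEVELS)), key=lambda k: abs(LEVELS[k] - level))
        bits.extend([(index >> 1) & 1, index & 1])
    return bits


data = [1, 0, 0, 1, 1, 1, 0, 0]
cells = bits_to_cells(data)
# Simulated scanner readings with a little noise on each cell.
scanned = [165, 90, 250, 6]
decoded = cells_to_bits(scanned)
```

Because the levels are widely spaced, the cells are robust to scanner noise, but for the same reason they are visible to a reader and occupy their own area on the page.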
In a preferred embodiment, the present invention modifies previously printed marks, such as text, providing slight variations in the darkness of a printed mark (or the lightness of an unprinted area) to encode information. These variations are so slight as to be imperceptible to the human eye, yet are easily detectable by a machine such as a scanner. Furthermore, the information is encoded in existing, printed marks thus avoiding the use of any additional surface area of the media.
The present invention may be implemented as a method of encoding information on a page of displayed text wherein a characteristic of the text, or of the blank spaces separating individual characters of text, is modified. The method includes generating a set of data representing the information desired to be encoded and utilizing the generated data to modify the page data at selected encoding locations on the page of text. The modified page data is then utilized to display a page of modified text.
In a preferred embodiment, the present invention is implemented as a method of encoding information within text printed on a page utilizing one or more intensity levels to modify the printed text. The method includes identifying allowable encoding locations on the page of text; preferably, the allowable locations will be at positions of existing characters. A first set of data representing intensity level values corresponding to the text at the encoding locations is generated. The intensity levels preferably correspond to gray scale levels at each pixel or group of pixels (cell) expressed as a digital signal. For example, a binary pixel (black or no black, i.e., a blank space) representing the text at an allowable encoding location is converted to a multi-bit gray level pixel image. A second set of data representing information to be encoded within the text is generated. The first and second sets of data are then summed or otherwise combined, preferably using an AND operation, to generate a third set of data representing modified intensity levels corresponding to the text modified to include the encoded information. The third set of data is then utilized to print the modified text on a page.
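The steps of this embodiment can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: an 8-bit darkness scale in which 0xFF is full black, every black pixel treated as an allowable encoding location, and a mask of 0xFB standing for a "1" bit. The constants and the names `encode_page` and `decode_page` are hypothetical and are not specified in the description.

```python
BLACK = 0xFF      # assumed 8-bit darkness of a fully printed pixel
MASK_ZERO = 0xFF  # AND mask that leaves the pixel unchanged (encodes 0)
MASK_ONE = 0xFB   # AND mask that lowers darkness to 251 (encodes 1)


def encode_page(binary_page: list[list[int]], bits: list[int]) -> list[list[int]]:
    """Convert a binary page to gray levels and AND message masks into text pixels."""
    # First set of data: multi-bit gray levels for the binary text image.
    gray = [[BLACK if px else 0 for px in row] for row in binary_page]
    # Allowable encoding locations: here, every printed (black) pixel.
    locations = [(r, c) for r, row in enumerate(binary_page)
                 for c, px in enumerate(row) if px]
    if len(bits) > len(locations):
        raise ValueError("not enough encoding locations for the message")
    # Second set of data: one mask per message bit, combined with AND
    # to give the third set (the modified intensity levels).
    for (r, c), bit in zip(locations, bits):
        gray[r][c] &= MASK_ONE if bit else MASK_ZERO
    return gray


def decode_page(gray: list[list[int]], count: int) -> list[int]:
    """Read bits back from printed pixels in the same row-major order."""
    bits = [1 if level != BLACK else 0
            for row in gray for level in row if level]
    return bits[:count]


page = [[0, 1, 1, 0],
        [1, 0, 1, 1]]
message = [1, 0, 1]
modified = encode_page(page, message)
recovered = decode_page(modified, len(message))
```

In this sketch a "1" darkens the pixel four levels less than full black, a difference a scanner can resolve but a reader is unlikely to notice. No pixel outside the existing text is touched.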
Other aspects and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.