A method of using glyphs on a page in a way that will avoid obstructions such as printed characters by providing one frame of data glyphs with instructions that will specify whether other specific frames are either valid or not valid.
A glyph is a diagonal line printed on paper that slopes at one angle to indicate one state of a bit, and at a different angle to indicate the other state. A frame of information is in numerical or word form, and there is no intended image. Glyphs are small, typically 1/60th of an inch, and a printed 10 by 10 glyph frame, including its sync lines, appears as a gray square.
Numerous patents have issued on the generation and use of glyphs, such as U.S. Pat. No. 5,245,165, Self-Clocking Glyph Code for Encoding Dual Bit Digital Values Robustly; U.S. Pat. No. 5,449,895, Explicit Synchronization for Self-Clocking Glyph Codes; and U.S. Pat. No. 5,521,372, Framing Codes for Robust Synchronization and Addressing of Self-Clocking Glyph Codes, which are incorporated by reference herein.
A problem with an area of glyphs is that characters may be printed over it, destroying the underlying glyphs. Of course, if a frame is completely destroyed, it cannot be read and it will be disregarded by the system. If a frame is in a position where it may be partially destroyed, the sync lines of that frame can contain information telling the reader to disregard the frame even if it appears to be readable. The problem with this system is that the sync lines of a frame must be read before it can be determined whether the frame is valid. Thus, there is a need for a system in which it can be determined beforehand whether a frame may be obstructed.
When printing a page where there are known areas where the glyphs may be obstructed, one or more frames may be given the task of identifying the other frames in the area that may be obstructed, or those that are known not to be in areas of obstruction.
A frame is identified by one or more sync, or lattice, lines, and contains a block of data bits. If one or more of the frames is assigned the role of using its data bits to store data identifying valid frames, then the other, possibly invalid, glyphs can be safely ignored. This information can be in the form of coordinates of a rectangle within which good (or bad) glyphs are printed, or, for more complex shapes, a bit map can be used.
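The two representations mentioned above (a rectangle of good or bad frames, or a bit map for more complex shapes) can be sketched as follows. This is only an illustrative sketch: the names `Rect`, `valid_from_rect`, and `valid_from_bitmap`, the inclusive frame coordinates, and the row-major bit order are assumptions made for the example and are not specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Rectangle in frame coordinates, inclusive on all sides (assumed layout)."""
    row0: int
    col0: int
    row1: int
    col1: int

    def contains(self, row: int, col: int) -> bool:
        return self.row0 <= row <= self.row1 and self.col0 <= col <= self.col1


def valid_from_rect(rows, cols, rect, rect_marks_valid=True):
    """Build a rows x cols grid of booleans; True means the frame is valid.

    The rectangle may enclose either the good frames or the bad frames,
    selected by rect_marks_valid.
    """
    return [
        [rect.contains(r, c) == rect_marks_valid for c in range(cols)]
        for r in range(rows)
    ]


def valid_from_bitmap(rows, cols, bits):
    """Unpack a row-major list of bits (1 = valid, an assumption) into a grid."""
    if len(bits) != rows * cols:
        raise ValueError("bit map size does not match the frame grid")
    return [[bool(bits[r * cols + c]) for c in range(cols)] for r in range(rows)]


# A 3 x 4 grid of frames where a 2 x 2 block is obstructed (marked as bad).
grid = valid_from_rect(3, 4, Rect(1, 1, 2, 2), rect_marks_valid=False)
```

The rectangle costs only four coordinates regardless of its size, whereas the bit map costs one bit per frame but can describe any shape.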
A number of inventions that allow embedded digital data to be written around known obstructions, in such a way that readback is effected with high tolerance for either random or localized image destruction, and without a priori knowledge of the obstructions, are described herein. These allow an application to embed the data as, for example, a background stipple around some printed marks (e.g., a company logo or a form title), and to recover the data by scanning, using a reader that has no information about the marks other than that which can be determined from the scanned image itself.
The module that starts with the input data and lays the glyphs out as an image (or as something that can be directly converted to an image) is called the "writer". The module that recovers the original data by analyzing the image is called the "reader". The writer knows about the obstructions and defines valid and invalid regions. The data is written into the valid regions, and replicated into the invalid regions. This information (valid/invalid regions) is communicated to the reader using a bootstrapping sequence that employs some or all of the following elements:
(1) data within the sync frame lines
(2) a Key Codeword of data (either fixed or variable length)
(3) data extensions to the Key Codeword
The Key Codeword and its use for encoding digital data around known obstructions is the primary feature of this invention. Use of a Key Codeword of data, when properly dispersed and protected by parity, is a robust method for acquiring meta information that is required by the reader for decoding the actual data.
Described herein are different types of data that one may wish to encode within the Key Codeword, the method by which the Key Codeword is dispersed for protection against local damage, methods for extending the size of the Key Codeword, some methods for encoding the valid/invalid regions (including by frame and by variable granularity), and the replication of valid data into invalid frames. We also give a detailed example of an extensible 4-bit encoding for valid/invalid regions of variable granularity. The ability to vary the granularity within an encoding both gives the encoding the flexibility to write glyphs around an arbitrary foreground logo and allows the encoding of the valid/invalid data itself to be efficient.
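The writer behavior described above (data written into valid frames, with valid data replicated into invalid frames) can be sketched as below. This is a minimal sketch under stated assumptions, not the method of this invention: the function name `assign_chunks`, the row-major ordering, and the rule that replicas cycle through the data chunks in order are all hypothetical choices made for illustration.

```python
def assign_chunks(valid, n_chunks):
    """Decide which data chunk each frame carries.

    valid    -- grid (list of rows) of booleans, True for a valid frame
    n_chunks -- number of frame-sized data chunks to write

    Returns a grid of (chunk_index, is_replica) tuples. Valid frames receive
    chunks 0..n_chunks-1 in row-major order. Invalid frames, and any valid
    frames left over, receive replicas (assumed rule: cycle through chunks).
    """
    if n_chunks <= 0:
        raise ValueError("need at least one data chunk")
    if sum(row.count(True) for row in valid) < n_chunks:
        raise ValueError("not enough valid frames to hold the data")

    next_primary = 0   # next chunk still waiting for a valid frame
    next_replica = 0   # running counter used to pick replica contents
    layout = []
    for row in valid:
        out_row = []
        for frame_ok in row:
            if frame_ok and next_primary < n_chunks:
                out_row.append((next_primary, False))
                next_primary += 1
            else:
                out_row.append((next_replica % n_chunks, True))
                next_replica += 1
        layout.append(out_row)
    return layout


# Two rows of three frames; two frames are obstructed by a logo.
layout = assign_chunks([[True, False, True], [False, True, True]], 3)
```

Under these assumptions a reader that already knows the valid/invalid map needs only the non-replica frames, while the replicas give it extra copies to fall back on if an obstructed frame happens to be readable.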
This section defines the terminology used throughout, and also gives a broad and simple overview of the encoding and decoding processes.
Edd: embedded digital data, a generic term that is also used to represent the set of marks in a particular instance.
Glyphs: the popular name for embedded digital data. "Glyph" is sometimes used to mean a single (1-bit) mark, and sometimes to refer to the entire set of such marks.
Frame: an m×n bit rectangular region containing data glyphs and sync lines.
Sync data: glyphs that are not data (and are not encoded) but are used to find the location and ordering of the glyphs.
Sync line: a line of sync (and possibly other) data.
Sync lattice: the lattice of sync lines. This is typically a rectangular lattice, where each frame of data is bounded on four sides by sync lines. Thus, this is a lattice of glyphs into which the glyph data is "poured".
Sync crossing: the location where two sync lines cross. These are special reference points in the sync lattice.
Meta-data: glyphs that are used to describe how the glyph data is written. This may include encoding parameters (block size, parity bytes, crc bytes) as well as marking valid and invalid data frames.
V/IV: valid/invalid regions.
Logo: name given to foreground image data (e.g., text, graphics) that is superimposed (overlaid, underlaid) on the digital data. These marks cause the problem that is the subject of this IP.
Damaged frame: a data frame containing sufficient logo to cause the writer to decide to invalidate the frame.
Ecc: error correction coding, that is, the addition of parity bits to the data in order to identify and correct errors.
Parity: another name for the symbols added to the data to allow errors to be identified and corrected.
Erasures: symbols identified (e.g., by a weak signal) as being unreliable to read, and therefore designated specifically to be corrected using the parity symbols.
Symbol: for Reed-Solomon block codes, the symbol is typically taken to be 8 bits. Damage to any number of bits within a symbol can be corrected with two symbols of parity (or one symbol if the damaged symbol can be identified as an "erasure").
Bit: each 45-degree glyph encodes, through its orientation, a bit of information, because we interpret the value of the glyph as 0 or 1 depending on the orientation. Hence, we often use "bit" interchangeably with "glyph".
Data: the information from the user's application that is actually encoded by the glyphs.
Key Codeword: a special set of glyphs that contain meta-data.
Extended Codeword: an optional special set of glyphs that contain meta-data that cannot be fit into the Key Codeword.
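The relationship between the Bit, Symbol, Parity, and Erasures entries above can be illustrated with a short sketch. It assumes a particular mapping of glyph slant to bit value and a most-significant-bit-first packing order, neither of which is stated in these definitions. The helper names are hypothetical. The parity budget follows the rule given under Symbol: two parity symbols per damaged symbol, or one if the damaged symbol is identified as an erasure.

```python
SYMBOL_BITS = 8  # symbol size given above for Reed-Solomon block codes

# Assumption: which slant means 0 and which means 1 is not specified here.
GLYPH_TO_BIT = {"\\": 0, "/": 1}


def glyphs_to_bits(glyphs):
    """Interpret each 45-degree glyph as one bit via its orientation."""
    return [GLYPH_TO_BIT[g] for g in glyphs]


def bits_to_symbols(bits):
    """Pack bits into 8-bit symbols, most significant bit first (assumed order)."""
    if len(bits) % SYMBOL_BITS != 0:
        raise ValueError("bit count must be a multiple of the symbol size")
    symbols = []
    for start in range(0, len(bits), SYMBOL_BITS):
        value = 0
        for bit in bits[start:start + SYMBOL_BITS]:
            value = (value << 1) | bit
        symbols.append(value)
    return symbols


def damaged_symbol_count(written, read):
    """Count symbols that differ; many wrong bits in one symbol count once."""
    return sum(1 for w, r in zip(written, read) if w != r)


def parity_symbols_needed(errors, erasures):
    """Parity symbols required: two per unlocated error, one per erasure."""
    return 2 * errors + erasures


glyphs = ["/"] + ["\\"] * 6 + ["/"]          # bits 1,0,0,0,0,0,0,1
symbols = bits_to_symbols(glyphs_to_bits(glyphs))
```

Because correction works on whole symbols, a logo that wipes out all eight glyphs of one symbol costs no more parity than a single flipped glyph in that symbol.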