Machine-readable symbols provide a means for encoding information in a compact printed (or embossed) form which can be scanned and then interpreted by an optical symbol detector. Such machine-readable symbols are often attached to (or impressed upon) product packaging, food products, general consumer items, machine parts, equipment, and other manufactured items for purposes of machine-based identification and tracking.
One exemplary type of machine-readable symbol is a bar code that employs a series of bars and white spaces vertically oriented along a single row. Groups of bars and spaces correspond to a codeword. The codeword is associated with an alpha-numeric symbol, one or more numeric digits, or other symbol functionality.
To facilitate encoding of greater amounts of information into a single machine-readable symbol, two-dimensional bar codes have been devised. These are also commonly referred to as stacked, matrix and/or area bar codes. Examples of such two-dimensional symbologies include Data Matrix, Code One, PDF-417, MaxiCode, QR Code, and Aztec Code. 2D matrix symbologies employ arrangements of regular polygon-shaped cells (also called elements or modules) where the center to center distance of adjacent elements is uniform. Typically, the polygon-shaped cells are squares. The specific arrangement of the cells in 2D matrix symbologies represents data characters and/or symbology functions.
As an example of a 2D matrix symbol technology, a Data Matrix code is a two-dimensional matrix barcode consisting of high-contrast “cells” (typically black and white cells) or modules arranged in either a square or rectangular pattern. The information to be encoded can be text or numeric data, or control symbols. The usual data size ranges from a few bytes up to 1556 bytes. Specific, designated, standardized groups of cells—typically eight cells—are each referred to as a “symbol character.” The symbol characters have values which are referred to as “codewords.” With a black cell interpreted as a 0 (zero) and a white cell interpreted as a 1 (one), an eight-cell codeword can code for numbers 0 through 255; in turn, these numeric values can be associated with alphanumeric symbols through standard codes such as ASCII, EBCDIC, or variations thereon, or other functionality.
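The mapping from an eight-cell symbol character to a codeword value can be illustrated with a minimal sketch. It follows the convention stated above (black cell as 0, white cell as 1) and assumes, purely for illustration, that the eight cells are listed most-significant bit first and that the value maps directly to ASCII. The function name `cells_to_codeword` and the `"B"`/`"W"` cell notation are hypothetical; an actual symbology standard defines its own cell layout, polarity, bit order, and character mapping.

```python
def cells_to_codeword(cells):
    """Convert eight cells to a codeword value in the range 0..255.

    `cells` is a string of eight characters, 'B' (black) or 'W' (white),
    listed most-significant bit first. This ordering is an assumption
    made for this sketch, not taken from any standard.
    """
    if len(cells) != 8:
        raise ValueError("a symbol character in this sketch has exactly eight cells")
    bit_for_cell = {"B": 0, "W": 1}  # convention used in the text above
    value = 0
    for cell in cells:
        value = (value << 1) | bit_for_cell[cell]
    return value


# Example: the pattern B W B B B B B W corresponds to binary 01000001.
pattern = "BWBBBBBW"
value = cells_to_codeword(pattern)
print(value, chr(value))  # 65 A
```

Under these assumptions, eight all-black cells give 0 and eight all-white cells give 255, which matches the 0 through 255 range described above.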
The codewords—that is, the designated groups of cells in a symbol—have specific, standardized positions within the overall symbol. The interpretation of a symbol in a given context (for example, for a given manufacturer and/or a given product) therefore depends on the codewords within the symbol; and in particular, the interpretation depends on both: (i) the contents of each codeword (that is, the pattern of cells in each codeword), and (ii) the placement or position of each codeword in the symbol.
Typically, for sequential alphanumeric data (for example, a product identification number or a street address), each sequential data character is assigned to a codeword, and the codewords are placed in the symbol in a standardized order. For example, the order may be left-to-right along the rows of the symbol, or according to a standardized diagonal pattern of placement. Because the codewords have specific, standards-specified placements within a symbol, and because no information about the placement is contained in the symbol character itself, the symbols may also be referred to as “matrix symbols” or “matrix symbology barcodes.”
Bar code readers are employed to read the matrix symbols using a variety of optical scanning electronics and methods. Ideally, the machine-readable symbols which are scanned by a bar code reader are in perfect condition, with all of the cells of consistent, uniform size; each cell being fully filled with either total black or total white; and the contrast between black and white cells being 100%.
In practical applications, the machine-readable symbols which are scanned by a bar code reader may be imperfect. They may be smudged by external substances (grease, dirt, or other chemicals in the environment); the surface on which the symbols were printed may be stretched, compressed, or torn; or the printing process itself may be flawed (for example, due to low ink levels in a printer or clogged printheads). The defects in actual symbols may introduce errors in the machine reading process.
To address these practical problems, error correction techniques are often used to increase reliability: even if one or more cells are damaged so as to make a codeword unreadable, the unreadable codeword can be recovered through the error-correction process, and the overall message of the symbol can still be read.
For example, machine-readable symbols based on the Data Matrix ECC 200 standard employ Reed-Solomon codes for error and erasure recovery. ECC 200 allows the routine reconstruction of the entire encoded data string when the symbol has sustained 25% damage (assuming the matrix can still be accurately located).
Under this standard, approximately half the codewords in a symbol are used directly for the data to be represented, and approximately half the codewords are used for error correction. The error-correction (EC) codewords are calculated using a mathematical tool known as the Reed-Solomon algorithm. The data codewords for the symbol are the input to the Reed-Solomon algorithm, and the EC codewords are the output of the Reed-Solomon algorithm. The complete machine-readable symbol includes both the data codewords and the EC codewords.
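The relationship between data codewords (input) and EC codewords (output) can be sketched with a small systematic Reed-Solomon encoder over GF(256). This is an illustrative sketch, not a conforming Data Matrix encoder: it assumes the field polynomial 0x12D (decimal 301) and generator roots alpha^1 through alpha^n, and it ignores block interleaving and codeword placement. The helper names (`gf_mul`, `generator_poly`, `rs_encode`, `syndromes`) and the sample data values are hypothetical.

```python
PRIM_POLY = 0x12D  # x^8 + x^5 + x^3 + x^2 + 1 (assumed field polynomial)

# Exponent and logarithm tables for GF(256), built from the generator alpha = 2.
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]  # duplicate so gf_mul needs no modulo


def gf_mul(a, b):
    """Multiply two GF(256) elements using the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec):
    """Return prod_{i=1..n_ec} (x + alpha^i), highest-degree coefficient first."""
    g = [1]
    for i in range(1, n_ec + 1):
        root = EXP[i]
        nxt = [0] * (len(g) + 1)
        for j, coef in enumerate(g):
            nxt[j] ^= coef                    # coef * x
            nxt[j + 1] ^= gf_mul(coef, root)  # coef * alpha^i
        g = nxt
    return g


def rs_encode(data, n_ec):
    """Return n_ec EC codewords: the remainder of data(x) * x^n_ec modulo g(x)."""
    g = generator_poly(n_ec)
    rem = [0] * n_ec
    for d in data:
        factor = d ^ rem[0]
        rem = rem[1:] + [0]
        for k in range(n_ec):
            rem[k] ^= gf_mul(g[k + 1], factor)
    return rem


def syndromes(codewords, n_ec):
    """Evaluate the codeword polynomial at alpha^1..alpha^n_ec (all zero if undamaged)."""
    result = []
    for i in range(1, n_ec + 1):
        acc = 0
        for c in codewords:
            acc = gf_mul(acc, EXP[i]) ^ c  # Horner's rule
        result.append(acc)
    return result


# Twelve arbitrary data codewords and twelve EC codewords (the 16x16 split).
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
ec = rs_encode(data, 12)
full_symbol = data + ec
print(len(full_symbol), all(s == 0 for s in syndromes(full_symbol, 12)))
```

For an undamaged sequence of data plus EC codewords, every syndrome is zero; a damaged codeword makes at least one syndrome nonzero, which is what a decoder uses to detect and then correct errors.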
For a given symbol format (such as Data Matrix, PDF-417, QR Code, Aztec Code, and others), and for a given size of the symbol matrix, there is a fixed, designated number of EC codewords. To recover any one particular damaged (unreadable) codeword, two things must be recovered: (i) the location of the damaged data codeword within the symbol, and (ii) the contents (the bit pattern) of the damaged data codeword. In turn, recovering both the location and the bit pattern for a single codeword requires two of the available EC codewords. It follows that if a machine-readable symbol has two damaged codewords, four EC codewords are required to recover the full symbol. Generally, if a symbol has “N” damaged codewords, then 2*N EC codewords are required to recover the full symbol.
The number of EC codewords in a symbol is limited. This places a limit on the number of damaged, unreadable data codewords which can be recovered. Generally, using present error correction methods, the number of damaged data codewords which can be recovered is at most half the total number of EC codewords. For example, in a Data Matrix symbol with 16×16 cells, the total number of EC codewords is 12. This means that at most 6 damaged data codewords can be recovered. If more than 6 of the data codewords are damaged, the complete symbol may be unreadable.
However, if the location of the data codeword in error is already known, then only one EC codeword is needed to correct the error. This technique is called “erasure decoding.” Unfortunately, in matrix symbols generally, the location of the errors is not known.
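The arithmetic behind these limits can be written as a small sketch. It assumes the usual Reed-Solomon budget, in which each damaged codeword at an unknown location costs two EC codewords and each damaged codeword at a known location (an erasure) costs one. The function names are hypothetical, and a practical decoder may hold some EC capacity in reserve to guard against false decodes.

```python
def is_recoverable(n_ec, unknown_errors, known_erasures=0):
    """Return True if the damage fits the budget 2*errors + erasures <= n_ec."""
    return 2 * unknown_errors + known_erasures <= n_ec


def max_unknown_errors(n_ec, known_erasures=0):
    """Largest number of unknown-location errors correctable alongside the erasures."""
    return max(0, (n_ec - known_erasures) // 2)


# A 16x16 Data Matrix symbol has 12 EC codewords.
N_EC = 12
print(max_unknown_errors(N_EC))     # 6  (locations unknown)
print(is_recoverable(N_EC, 7))      # False (7 errors would need 14 EC codewords)
print(is_recoverable(N_EC, 0, 12))  # True  (12 erasures, all locations known)
print(is_recoverable(N_EC, 3, 6))   # True  (2*3 + 6 = 12)
```

With 12 EC codewords, knowing the locations in advance raises the recoverable damage from 6 codewords to as many as 12, which is the gain that locating errors independently of the EC codewords would provide.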
Therefore, there exists a need for a system and method for recovering more damaged data codewords in a symbol than may be recovered based on only the error-correcting symbols by themselves. More particularly, what is needed is a system and method for determining the location of a damaged or erroneous data codeword, independent of the information stored in the EC codewords.