Bar code symbologies are widely used for automated data collection. The first bar code symbologies developed, such as U.P.C., Code 39, Interleaved 2 of 5, and Code 93, can be referred to as "linear symbologies" because data in a given symbol is decoded along one axis or direction. Symbologies such as linear symbologies encode "data characters" (i.e., human readable characters) as "symbol characters," which are generally parallel arrangements of alternating, multiple-width strips of lower reflectivity or "bars" separated by strips of higher reflectivity or "spaces." Each unique pattern of bars and spaces within a predetermined width defines a particular symbol character, and thus a particular data character or characters. A given linear symbol encodes several data characters along its length as several groups of unique bar and space patterns.
As the data collection markets grew, symbologies required greater amounts of information to be encoded within a smaller area (i.e., greater "information density"). To increase the information density in linear symbologies, "multi-row" or "stacked" symbologies were developed, such as Code 49 and PDF417. Stacked symbologies generally employ several adjacent rows of multiple-width bars and spaces. To further improve information density, "area" or "two-dimensional matrix" ("2D matrix") symbologies, such as Data Matrix and Code One, were developed. 2D matrix symbologies employ arrangements of regular polygon-shaped cells where the center-to-center distance of adjacent elements is uniform. In a 2D matrix symbology, an "element" is a cell or a space, while in a linear or stacked symbology an element is a bar or a space. The specific arrangement of the elements in a 2D matrix symbology represents data characters and/or symbology functions. Elements in Code One, for example, are arranged as matrices of small, square data cells.
Linear, stacked and 2D matrix symbologies are generally based on their "X dimension," which is the nominal width dimension of the narrow bars, spaces or data cells in the symbology. "Nominal" refers to the intended value for a specified parameter, regardless of printing errors, etc. For Code One, the X dimension represents the smallest height (or width) of a data cell in a given symbol.
As shown in FIG. 1, a Code One symbol 20 having an 80 mil X dimension includes a center finder pattern 22 surrounded by a matrix of square data cells. The Code One symbology includes several versions, where each version has a predetermined size and unique finder pattern 22. In most versions of the Code One symbology, several vertical reference patterns 24 extend perpendicularly from the finder pattern 22. The finder pattern 22 and vertical reference patterns 24 help automated data collection devices, or "readers," locate the symbol 20, determine its version, determine its tilt and orientation, and provide reference points from which to decode the matrix of data cells.
Each data cell in the matrix encodes one bit of data: a white data cell represents a 0 and a black data cell represents a 1. As shown in FIG. 2, each symbol character in the Code One symbology is generally constructed from eight data cells in a rectangular array of two rows that each have four data cells. Each set of eight data cells in a Code One symbol character encodes an 8-bit byte of binary data. The lower right-most data cell in this rectangular symbol character is the least significant bit and has a value of $2^0 = 1$, while the top left-most data cell is the most significant bit and has a value of $2^7 = 128$.
As shown in FIG. 3, an exemplary Code One symbol character having a value of 66 (2+64), corresponding to the ASCII value "A" in the Code One symbology, is encoded by selecting the appropriate cells in a character to be black cells that together total the desired value. The ASCII values in the Code One symbology are equal to the standard ASCII values in the computer industry plus one; the ASCII value for "A" in the computer industry has a value of 65, which therefore has the value of 66 in the Code One symbology.
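The mapping from a data character to the eight cells of a symbol character can be sketched as follows. This is a minimal illustration, not the symbology's reference encoder: the function names are hypothetical, and the ordering of the six cells between the two corners (weights decreasing left to right along the top row, then along the bottom row) is an assumption, since the text fixes only the top-left cell at 128 and the bottom-right cell at 1.

```python
def code_one_value(ch: str) -> int:
    """Code One value of an ASCII character: the standard ASCII code plus one."""
    return ord(ch) + 1


def value_to_cells(value: int) -> list[list[int]]:
    """Lay an 8-bit value out as 2 rows x 4 columns of cells (1 = black, 0 = white).

    Assumed order: top-left cell is the most significant bit (128),
    bottom-right cell is the least significant bit (1).
    """
    if not 0 <= value <= 255:
        raise ValueError("a symbol character holds exactly one 8-bit byte")
    bits = [(value >> shift) & 1 for shift in range(7, -1, -1)]  # MSB first
    return [bits[:4], bits[4:]]


if __name__ == "__main__":
    # "A" has the value 66 = 64 + 2, so two cells are black.
    for row in value_to_cells(code_one_value("A")):
        print("".join("#" if cell else "." for cell in row))
```

Under the assumed ordering, the value 66 sets the second cell of the top row (64) and the third cell of the bottom row (2).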
In a given Code One symbol, the symbol characters are ordered in a row-wise fashion from left to right, and the "rows" of symbol characters are arranged from top to bottom in a symbol. Each row of symbol characters in a Code One symbol consists of a pair of adjacent rows of data cells. The first symbol character in the Code One symbol is in the top left corner of the symbol and the last symbol character is in the bottom right corner. For certain versions of Code One symbols, some symbol characters are not contiguous, but are instead split by the finder pattern 22 or by one of the vertical reference patterns 24. Readers analyze the symbol characters in a Code One symbol from the first symbol character in the symbol's top left corner rightward to the right edge of the top row, and then from the left edge rightward along the second row, and so forth.
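The reading order described above can be illustrated with a short sketch that walks a grid of data cells and returns the symbol character values from top left to bottom right. It is a simplification under stated assumptions: the grid contains only data cells (no finder pattern 22 or vertical reference patterns 24, so no split characters), the function name is hypothetical, and cell weights within a character are assumed to decrease left to right along the top row and then the bottom row.

```python
def read_symbol_characters(cells: list[list[int]]) -> list[int]:
    """Return symbol character values in reading order from a grid of cells.

    `cells` is a list of cell rows (1 = black, 0 = white). Each symbol
    character occupies 2 cell rows by 4 cell columns.
    """
    width = len(cells[0]) if cells else 0
    if len(cells) % 2 or width % 4 or any(len(row) != width for row in cells):
        raise ValueError("grid must be rectangular: 2k cell rows by 4m cell columns")

    values = []
    for top in range(0, len(cells), 2):        # one row of characters = two cell rows
        for left in range(0, width, 4):        # one character = four cell columns
            bits = cells[top][left:left + 4] + cells[top + 1][left:left + 4]
            value = 0
            for bit in bits:                   # first bit is the most significant
                value = (value << 1) | bit
            values.append(value)
    return values


if __name__ == "__main__":
    grid = [
        [0, 1, 0, 0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0, 0, 1, 1],
    ]
    print(read_symbol_characters(grid))
```

Here the two characters in the single row of symbol characters carry the values 66 and 67.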
Assuming a reader encounters no difficulties, each symbol character analyzed in a Code One or other symbol is converted into corresponding data to be used by the reader, the user, or other peripheral equipment. Unfortunately, data encoded under nearly all symbologies can result in errors when decoded by a reader. Errors are often caused by poor print quality in a symbol, poorly-designed reading equipment, and so forth. Some linear symbologies are designed to reduce such errors. For example, the ratio of widths between narrow and wide "elements" (i.e., bars or spaces) in the Code 39 and Interleaved 2 of 5 symbologies is established so that known algorithms can distinguish between elements despite variations in the widths of the given elements.
To further reduce errors, certain symbologies include check characters. A check character is a character included within a symbol whose value is used to perform a mathematical check that determines whether the symbol has been decoded correctly. For example, Code 39 has an optional modulo 43 check character that can be included as the last symbol character in a symbol. The Code 39 check character is calculated by determining a character value for each data character in an original message, adding together all of the character values, and dividing the sum by 43. The check character becomes the remainder that results from such division, and is appended to the end of a symbol encoded from the message. A "character value" is a number representing a data character in a given symbology. For example, in the Code 39 symbology, the character "A" has a character value of "10."
As an example, the data character message "CODE 39" has the respective character values of 12, 24, 13, 14, 38, 3, and 9. The check character for this message is computed as follows:
$$(12 + 24 + 13 + 14 + 38 + 3 + 9) \bmod 43 = 113 \bmod 43 = 27$$
Thus, the check character for the data character message "CODE 39" has a character value of 27, which corresponds to the data character "R."
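The modulo 43 calculation can be sketched in a few lines. The function and constant names are hypothetical; the character ordering is the standard Code 39 value table, in which a character's value is its position in the string (for instance "A" is 10 and the space is 38, as in the example above).

```python
# Code 39 data characters in character-value order: the index is the value.
CODE39_CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"


def code39_check_character(message: str) -> str:
    """Return the optional modulo 43 check character for a Code 39 message.

    Raises ValueError if the message contains a character outside the set.
    """
    total = sum(CODE39_CHARSET.index(ch) for ch in message)
    return CODE39_CHARSET[total % 43]


if __name__ == "__main__":
    # Values 12, 24, 13, 14, 38, 3, 9 sum to 113, and 113 mod 43 is 27.
    print(code39_check_character("CODE 39"))
```

A reader performing the check would recompute this character from the decoded data characters and reject the decode if it differs from the last symbol character.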
Other symbologies improve upon the use of check characters by employing error correction characters. Error correction characters, as with check characters, are calculated mathematically from the other symbol characters in a symbol. Error correction characters are symbol characters in a symbol that are reserved for erasure correction, error correction, and/or error detection. An erasure is a missing, unscanned or undecodable symbol character: the symbol character's position is known, but not its value. An erasure can result from portions of a symbol having insufficient contrast, from a symbol that falls partly outside a reader's field of view, or from a symbol a portion of which is obliterated. An error is a misdecoded or mislocated symbol character: both the position and the value of the symbol character are unknown. An error can result from random spots or voids in a symbol when the symbol is printed.
For an error, the error correction characters allow a reader to locate and correct symbol characters whose values and locations are both unknown. Two error correction characters are required to correct each error: one error correction character to locate the erroneous symbol character and the second error correction character to determine what value the erroneous symbol character should have been. For an erasure, the error correction characters allow a reader to correct erroneous or missing symbol characters that have known locations. Consequently, only one error correction character is required to correct each erasure.
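The two-per-error, one-per-erasure accounting can be expressed as a simple budget check. This is a sketch with a hypothetical function name; it assumes every error correction character is available for correction (none reserved for error detection) and that errors and erasures draw on the same pool additively.

```python
def within_correction_capacity(ec_chars: int, errors: int, erasures: int) -> bool:
    """True if the damage fits the error correction budget.

    Each error (unknown position and value) consumes two error correction
    characters; each erasure (known position) consumes one.
    """
    if min(ec_chars, errors, erasures) < 0:
        raise ValueError("counts must be non-negative")
    return 2 * errors + erasures <= ec_chars


if __name__ == "__main__":
    print(within_correction_capacity(ec_chars=10, errors=3, erasures=4))  # 2*3 + 4 = 10
    print(within_correction_capacity(ec_chars=10, errors=4, erasures=3))  # 2*4 + 3 = 11
```

With ten error correction characters, three errors and four erasures exactly use the budget, while four errors and three erasures exceed it.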
For error detection, the error correction characters allow a reader to use these characters in a symbol to detect the number of errors in the symbol that exceed the error correction capacity for the particular symbology. Error detection cannot correct the errors in the symbol, but can prevent a symbol from being decoded and producing erroneous data. Error correction characters can be reserved for error detection, and in most linear symbologies, such as Code 39, these characters are referred to as check characters, as discussed above.
Some symbologies, such as Code One, have many error correction characters. The Code One symbology is specifically designed with 27% to 50% of the symbol characters allocated to error correction. Consequently, the Code One symbology allows for very secure decoding that mathematically is many orders of magnitude more accurate than linear bar code symbologies that simply use check characters. As shown in FIG. 1, the Code One symbol 20 includes symbol characters 26 that begin at the top left corner of the symbol and error correction characters 28 that end at the bottom right corner of the symbol.
Since each version in the Code One symbology has a fixed symbol size, pad characters 30 are inserted between the symbol characters 26 and the error correction characters 28 to fill out a symbol that does not have enough symbol characters to completely fill in the symbol. The error correction characters 28 are algorithmically generated using standard Reed-Solomon error correction methods based on the data and pad characters 26 and 30. If a portion of the symbol 20 contains errors or erasures (i.e., damage), the symbol may in some cases be decoded based on the error correction characters 28.
Many error correction algorithms perform computations roughly analogous to solving linear equations: with two equations and two unknowns, one can readily compute the two unknowns. As noted above, the error correction characters 28 are computed using several equations with the symbol characters 26 and the pad characters 30. Therefore, using the several equations that generated the error correction characters 28, and undamaged symbol, pad, and error correction characters, a reader may determine the values of unknown symbol, pad, and error correction characters having erasure damage by solving the equations (if the number of unknown characters does not exceed the number of equations). Consequently, if a few symbol characters 26 are damaged, the remaining symbol characters 26, pad characters 30, and error correction characters 28 can be used to correct the damaged symbol characters, and likewise, damaged pad or error correction characters may be corrected based on the remaining symbol, pad, and error correction characters. Overall, there is a tradeoff between the number of damaged symbol, pad and error correction characters 26, 30 and 28 and the number of undamaged characters needed to correct them under error correction algorithms.
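The linear-equation analogy can be made concrete with a toy scheme. The sketch below is not Reed-Solomon and not the Code One algorithm; it is an assumed, simplified stand-in that computes two check values (a plain sum and a position-weighted sum, both modulo the prime 257) and then recovers two erased characters at known positions by solving the resulting two equations. All names are hypothetical, and the modular inverse via `pow(x, -1, P)` requires Python 3.8 or later.

```python
P = 257  # prime modulus, so every nonzero value has a multiplicative inverse


def make_checks(data: list[int]) -> tuple[int, int]:
    """Two toy check values: sum of values, and sum of (position + 1) * value."""
    c1 = sum(data) % P
    c2 = sum((k + 1) * d for k, d in enumerate(data)) % P
    return c1, c2


def recover_two_erasures(received, i: int, j: int, c1: int, c2: int) -> list[int]:
    """Fill in the unknown values at known, distinct positions i and j."""
    known = [(k, d) for k, d in enumerate(received) if k not in (i, j)]
    s1 = (c1 - sum(d for _, d in known)) % P            # x_i + x_j
    s2 = (c2 - sum((k + 1) * d for k, d in known)) % P  # (i+1)*x_i + (j+1)*x_j
    # Eliminate x_i:  s2 - (i+1)*s1 = (j - i) * x_j
    x_j = ((s2 - (i + 1) * s1) * pow((j - i) % P, -1, P)) % P
    x_i = (s1 - x_j) % P
    repaired = list(received)
    repaired[i], repaired[j] = x_i, x_j
    return repaired


if __name__ == "__main__":
    data = [66, 67, 68, 130, 5]
    c1, c2 = make_checks(data)
    damaged = [66, None, 68, None, 5]      # positions 1 and 3 erased
    print(recover_two_erasures(damaged, 1, 3, c1, c2))
```

Two check values give two equations, so two erasures are recoverable; a third erasure would leave more unknowns than equations, which mirrors the limit stated in the text.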
The extent of any damage recoverable by the error correction characters 28 depends upon the amount and type of damage suffered by the symbol 20. As a general rule, if the damage to the symbol is an erasure, e.g., a portion of the symbol is obscured or lost, standard Code One readers can recover an area of erasure in the symbol that is approximately equal to the area of the error correction characters remaining in the symbol. For the symbol 20, if the portion of the symbol rightward of the dashed line A--A in FIG. 1 were obliterated (i.e., an erasure), the total area of this portion, above and below the finder pattern 22, is less than the total area of the error correction characters 28 that remain. Therefore, such a damaged symbol 20 can be corrected and a reader can replace the symbol characters and pad characters lost in the obliterated portion.
The obliterated portion rightward of the line A--A, however, is approximately equal to the maximum extent of lost characters that known error correction algorithms can recover. An equation for the maximum erasure damage that can be corrected under known error correction algorithms (ignoring the finder pattern 22) can be represented as follows:
$$E > A_{ERASE}$$
where $E$ is the total area of error correction characters remaining in the symbol and $A_{ERASE}$ is the total area in the symbol lost due to erasure damage.
If the portion contained errors rather than an erasure, the area of the portion having errors could be no more than approximately half the area of the remaining error correction characters, because the remaining error correction characters must determine both the location of the errors and the correct value for each error. An equation for the maximum error damage that can be corrected under known algorithms (ignoring the finder pattern 22) can be represented as:
$$\frac{E}{2} > A_{ERROR}$$
where $E$ is the total area of error correction characters remaining in the symbol and $A_{ERROR}$ is the total area of the symbol suffering from error damage.