1.0 Radio Frequency Identification (RFID) Tags
Radio frequency identification (RFID) tags are electronic devices that may be affixed to items whose presence is to be detected and/or monitored. A variety of tag classes have been defined by national and international standards bodies (e.g., EPCGlobal and ISO). The tag classes include Class 0, Class 1, and Class 1 Generation 2 (“Gen 2”). The presence of an RFID tag, and therefore the presence of the item to which the tag is affixed, may be checked and monitored wirelessly by devices known as “readers.” Readers typically have one or more antennas transmitting radio frequency signals to which tags respond. Because the reader “interrogates” RFID tags, and receives signals back from the tags in response to the interrogation, the reader is sometimes termed a “reader interrogator” or simply an “interrogator.”
With the maturation of RFID technology, efficient communication between tags and interrogators has become a key enabler in supply chain management, especially in the manufacturing, shipping, and retail industries, as well as in building security installations, healthcare facilities, libraries, airports, warehouses, etc.
In addition, tags include limited amounts of memory for encoding user data. Existing standard data formats (e.g., as specified by ISO/IEC 15961 and 15962) do not offer good compaction efficiency, nor do they offer fast random access to a desired data element. In addition, Gen 2 standards limit the data systems which can be used to label data items. This limits the ability of users of Gen 2 tags to encode data items. Some users may desire to use GS1 Application Identifiers (AIs), whereas others may want to use Data Identifiers (DIs), and others may want to intermix the two. Furthermore, the Gen 2 air interface protocol does not provide a good mechanism for accessing a variable amount of memory, without requiring multiple operations of the same tag. In current Gen 2 implementations, the only options are (1) read the entire memory bank, which may entail reading a very large number of useless ‘0’ bits thus slowing down the process for reading a population of tags, or (2) read a selected number of memory words. The problem with alternative (2) is that if too many words are requested, the tag returns an error code with no indication of how many words were actually available.
2.0 Optical Media
Optical media such as bar codes are machine readable representations of information, often dark ink on a light background that creates high and low reflectance which can be converted to a digital format. Barcodes may represent or encode data by the widths and spacings of printed parallel lines, patterns of dots, concentric circles, and text codes hidden within images. Barcodes are often read by optical scanners called barcode readers or scanned from an image by special software.
Barcodes are widely used to implement Auto ID Data Capture (AIDC) systems that improve the speed and accuracy of computer data entry. Barcodes are typically extremely accurate and inexpensive. However, the amount and type of data that can be encoded in a bar code is limited.
The drive to encode more information, in combination with the space requirements of simple barcodes, led to the development of advanced bar codes such as stacked barcodes and 2D barcodes. For example, matrix codes, a type of 2D barcode, do not consist of bars but rather a grid of square cells. Stacked barcodes are a compromise between true 2D barcodes and linear codes (also known as 1D barcodes), and are formed by taking a traditional linear symbology and placing it in an envelope that allows multiple rows.
3.0 Optimizing Data Encodation
Many media, including high-capacity optical media (such as 2D bar codes) and RFID tags (such as EPCglobal Gen 2 tags), share a need for optimizing the encodation of the data sets typically used in AIDC applications.
For example, existing standard RFID formats (e.g., ISO/IEC 15961 and 15962) and barcode encodation methods (e.g., Data Matrix, ISO/IEC 16022) do not offer good compaction efficiency or fast random access to a desired data element. In optical-media applications, available “real estate” for the optical mark is usually the motivating factor for improving encoding efficiency. In the case of RFID applications, on the other hand, there are two prime motivators: the need to fit the data within a fixed and limited amount of Read/Write memory on a particular tag, and the need to minimize the number of data bits that must be transferred over the relatively slow air interface.
A particularly important metric for evaluating encoding schemes for AIDC applications is the worst-case number of bits needed to encode data fitting specific application rules and typical usage. For example, two of the most common AIDC data sets are a GS1 Lot Number (Application Identifier 10) and a GS1 Serial Number (A.I. 21). Both of these are defined to use up to 20 alphanumeric characters from the 82-member character set defined in ISO/IEC 646. However, in actual use, most applications define their Lot and Serial Numbers to contain only digits and capital letters. Therefore, optimized AIDC encoding methods need to address both the absolute worst case (a 20-character data string using the full ISO/IEC 646 character set), and the typical worst case (a 20-character string using only digits and capital letters).
Until recently, the available encodation schemes were far from optimal for real-world AIDC data, especially for the typical worst-case scenario. To address this need to minimize the number of encoded bits, a multi-base encodation scheme was developed. For example, see the detailed discussion of Packed Objects in U.S. Pat. No. 6,196,466, filed Jun. 9, 1999, entitled “Data Compression Method Using Multiple Base Number Systems” (hereinafter the '466 patent); U.S. patent application Ser. No. 11/806,050, filed May 29, 2007, entitled “Data Format for Efficient Encoding and Access of Multiple Data Items in RFID Tags” (hereinafter the '050 application); and U.S. patent application Ser. No. 11/806,053, filed May 29, 2007, entitled “Data Format for Efficient Encoding and Access of Multiple Data Items in RFID Tags” (hereinafter the '053 application), each of which is incorporated by reference herein in its entirety.
To further address the need to minimize the number of bits to be transmitted over an interface, multi-base encodation within an overall encoding format and structure can be used to provide many encodation efficiencies for known-numeric data (such as for GS1's A.I. 00) and additional transmission efficiencies. Examples of these techniques are described in the '050 and the '053 applications.
Packed Objects are one of the techniques to improve the efficiency of encoding, transmission and decoding. Packed Objects, as described in the above references, allow a receiving system to examine only the initial bits of a set of encoded data items to determine whether a data set of interest is present instead of reading all the bits in search of the data item. Thus, when reading large numbers of tags (some without the data item of interest) in search of a data item, the average number of transmitted bits is reduced.
The AlphaNumeric (A/N) section of a Packed Object, which is based on multi-base encoding, provides significantly better encoding efficiency than traditional AIDC encoding methods in their worst-case scenario, which is a random mix of letters and numbers. These traditional methods classify input characters into a number of subsets (such as one for digits and one for uppercase letters), and these subsets need to include numerous “switches” and “latches” to alternate between subsets, which reduces encoding efficiency in two ways. First, these non-data switches and latches increase the number of members needed in each subset, which increases the number of information bits needed to represent the data characters of the subset and thus reduces the encoding efficiency of the subset. Second, these traditional switches and latches require the same number of encoded bits as do the data characters of the starting subset. Thus, for example, it costs five bits to latch out of a five-bit subset (for letters) to a four-bit subset (for digits), and another four bits to latch back from digits to letters.
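The latch cost described above can be illustrated with a minimal sketch. It assumes a hypothetical two-subset scheme (a five-bit letter subset and a four-bit digit subset, starting in the letter subset) in which each latch is as wide as a codeword of the subset being left. The function name and the scheme itself are illustrative assumptions, not any particular standardized symbology.

```python
def latch_scheme_bits(text: str, letter_bits: int = 5, digit_bits: int = 4) -> int:
    """Count encoded bits for a hypothetical two-subset latch scheme.

    Assumptions (illustrative only): encoding starts in the letter subset,
    any non-digit is treated as a letter, and a latch costs as many bits as
    a data character of the subset being left.
    """
    current = "letter"
    total = 0
    for ch in text:
        needed = "digit" if ch.isdigit() else "letter"
        if needed != current:
            # Pay for the latch at the width of the subset we are leaving.
            total += letter_bits if current == "letter" else digit_bits
            current = needed
        # Pay for the data character at the width of its own subset.
        total += digit_bits if needed == "digit" else letter_bits
    return total


def data_only_bits(text: str, letter_bits: int = 5, digit_bits: int = 4) -> int:
    """Bits spent on data characters alone, ignoring every latch."""
    return sum(digit_bits if ch.isdigit() else letter_bits for ch in text)


sample = "A12B34C56"
total_bits = latch_scheme_bits(sample)
latch_overhead = total_bits - data_only_bits(sample)
```

For the alternating sample string, a large share of the encoded bits in this sketch is spent on latches rather than on data, which is the inefficiency the paragraph above describes.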
Further, traditional AIDC methods define one or more sets of fixed-size output patterns, which in the case of optical media, use an integral number of bars and spaces, or in the case of bitstream-encoded media such as RFID tags, use an integral number of bits for each defined output character or defined grouping of two or three output characters. Since each output pattern, character, or group represents a non-integral number of abstract information bits, efficiency is reduced by integral representations of the data. For example, each digit of a set of decimal (base 10) digits represents 3.3219 (ln 10/ln 2) bits of information. When conveying decimal digits in a four-bit output grouping, for example, only 83 percent efficiency is achieved.
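The efficiency figures above follow directly from the information content of a symbol, log2(base). The short sketch below reproduces them; the function names are illustrative, and the three-digits-in-ten-bits grouping is included only as an example of a fixed grouping.

```python
import math


def bits_per_symbol(base: int) -> float:
    """Information content, in bits, of one symbol drawn from `base` values."""
    return math.log2(base)


def fixed_width_efficiency(base: int, width_bits: int) -> float:
    """Fraction of a fixed-width output pattern that carries information."""
    return bits_per_symbol(base) / width_bits


digit_bits = bits_per_symbol(10)                    # about 3.3219 bits per digit
four_bit_eff = fixed_width_efficiency(10, 4)        # one digit in four bits
grouped_eff = fixed_width_efficiency(10 ** 3, 10)   # three digits in ten bits
```

Grouping several digits into one output pattern narrows the gap, but any fixed integral width still leaves some fraction of each pattern unused.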
Multi-base encoding is not restricted to integral numbers of output bits, and does not rely on encoding switches and latches in order to mix digits and letters. Therefore the worst-case encoding efficiency of multi-base encoding is superior to traditional AIDC methods.
“Code 5” encoding is currently an active proposal in the AIDC community. It defines output character sets of 4, 5, 6, 7, and 8 bits, plus predefined output groupings such as 3 digits in 10 bits, and each set contains numerous switches and latches to the other Code 5 sets.
Because of the high cost of traditional switches and latches, the most efficient way to encode mixed letters and digits in Code 5 (such as the data string “A12B34C56”) is to use Code 5's six-bit code set for the entire string. Including four bits of overhead to latch into the six-bit code set, plus six bits to individually represent each of the nine data characters, this string requires a total of 58 bits to encode in Code 5 (averaging 6.44 bits per character). In contrast, the A/N encoding used in the Packed Objects specification requires only 48 bits (averaging 5.33 bits per character): four bits of overhead to define the particular characteristics of this instance of multi-base encoding, a 9 bit character map (e.g., “100100100”) where each ‘1’ or ‘0’ indicates the positions of an individual letter or digit (respectively) within the data string, 20 bits to encode the six digits of the string (converted to a single base 256 value), and 15 bits to encode the three uppercase letters of the string (converted to a single base 256 value).
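The bit tallies in the comparison above can be reproduced with a small sketch. It assumes that the six digits are treated as one base-10 number and the three letters as one number in a single non-digit base before conversion to binary. The non-digit base of 30 is an assumption chosen because it yields the quoted 15 bits (a base of 26 would give the same count), and all function names are illustrative.

```python
def character_map(text: str) -> str:
    """One bit per character: '0' marks a digit, '1' marks a non-digit."""
    return "".join("0" if ch.isdigit() else "1" for ch in text)


def bits_for_run(count: int, base: int) -> int:
    """Bits needed to hold any `count`-symbol value in `base` as one integer."""
    return (base ** count - 1).bit_length() if count else 0


def multi_base_bits(text: str, overhead_bits: int = 4, other_base: int = 30) -> int:
    """Overhead + character map + one binary value per character class."""
    cmap = character_map(text)
    digits = cmap.count("0")
    others = cmap.count("1")
    return (overhead_bits + len(cmap)
            + bits_for_run(digits, 10)
            + bits_for_run(others, other_base))  # other_base is an assumption


def six_bit_set_bits(text: str, latch_bits: int = 4, char_bits: int = 6) -> int:
    """One latch into a six-bit set, then six bits per character."""
    return latch_bits + char_bits * len(text)


sample = "A12B34C56"
packed = multi_base_bits(sample)      # 4 + 9 + 20 + 15
six_bit = six_bit_set_bits(sample)    # 4 + 9 * 6
```

Dividing each total by the nine characters of the sample gives the per-character averages quoted in the text.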
In general, the worst-case mixes of digits and uppercase letters in Code 5 all require an incremental cost (i.e., the average cost to encode another character) of 6.0 bits per character, excluding start-up overhead, which becomes less significant for longer source messages. Using multi-base encoding, the worst case mixes of digits and uppercase letters require only 5.11 bits per character, excluding start-up overhead.
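The 5.11 figure can be approximated with a simple model: one character-map bit per character plus the information content of the character in its own base. The sketch below assumes base 10 for digits and base 30 for non-digits. The base-30 value and the even digit/letter split are assumptions, adopted here because together they reproduce the quoted figure.

```python
import math


def incremental_bits(digit_fraction: float,
                     digit_base: int = 10,
                     other_base: int = 30,   # assumed non-digit base
                     map_bits: float = 1.0) -> float:
    """Average bits per additional character, excluding start-up overhead."""
    return (map_bits
            + digit_fraction * math.log2(digit_base)
            + (1.0 - digit_fraction) * math.log2(other_base))


even_mix = incremental_bits(0.5)   # half digits, half uppercase letters
```

Under these assumptions the even mix comes to roughly 5.11 bits per character, compared with the 6.0 bits per character cited for Code 5.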
Worst-case metrics are of particular importance to users because they answer the question of whether a user's data sets will always fit in the available storage (e.g., in a bar code format or the RFID memory). The metrics for average encodation efficiency are also of interest. Accepted industry statistics for the distributions of alphanumeric string contents (the typical mix and sequence of letters and digits) do not exist, however, so one currently cannot predict true “average” performance of an AIDC encodation method in the field.
For some particular mixes and sequences of messages, pure multi-base encoding may be less efficient than traditional methods unless it is augmented with additional techniques. For example, an all-numeric message has an incremental cost of 4.32 bits per character under multi-base encoding (3.32 bits for the digit plus one character-map bit), versus four bits for a method that can latch to a four-bit character set. As one such augmentation, the Packed Objects structure allows many important data fields known in advance to be all-numeric to be encoded without a character map, thus achieving an incremental cost of only 3.32 bits per character, which is the optimal encoding for digits according to information theory.
Also, the header structure for the A/N section in a Packed Object allows optional definition of a Prefix, Suffix, or Infix, in which the character map can be omitted for long substrings of characters from the same numeric base. Instead of encoding that portion of the character map, that portion (of all ‘0’s or all ‘1’s) is run-length encoded. Note that unlike traditional run-length encoding schemes, which provide a compacted representation of a string of identical data characters, or classes of data bit patterns, the run length in a Packed Object provides a compacted representation of the character map, not of the data content. Also, the Prefix/Suffix/Infix mechanism is different from and superior to traditional latching and shifting mechanisms.
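The known-numeric case can be sketched as follows: the whole digit string is converted to a single integer and stored in the smallest bit width that can hold any string of that length, with no character map. The function names are hypothetical, and the sketch omits the headers and length indicators that a real Packed Object would carry. The 18-digit example matches the length of a GS1 SSCC (A.I. 00).

```python
def encode_numeric(digits: str) -> str:
    """Encode an all-digit string as one fixed-width binary integer."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("known-numeric field must contain only 0-9")
    # Smallest width able to hold the largest value of this length.
    width = (10 ** len(digits) - 1).bit_length()
    return format(int(digits), f"0{width}b")


def decode_numeric(bits: str, length: int) -> str:
    """Recover the digit string, restoring any leading zeros."""
    return str(int(bits, 2)).zfill(length)


sscc = "003456789012345678"          # illustrative 18-digit value
encoded = encode_numeric(sscc)
bits_per_digit = len(encoded) / len(sscc)
restored = decode_numeric(encoded, len(sscc))
```

Because the field length is known in advance, the decoder can restore leading zeros without any extra encoded bits.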
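The idea of run-length encoding the character map, rather than the data, can be sketched as below. This is a simplified illustration only: the minimum run length of four, the return format, and the omission of the Infix case are assumptions, and the sketch does not reproduce the actual Packed Objects header layout.

```python
from itertools import groupby


def character_map(text: str) -> str:
    """One bit per character: '0' marks a digit, '1' marks a non-digit."""
    return "".join("0" if ch.isdigit() else "1" for ch in text)


def split_runs(text: str, min_run: int = 4):
    """Split the character map into (prefix_run, explicit_map, suffix_run).

    A run is a (bit, length) pair. Only a leading or trailing single-class
    run of at least `min_run` characters (a hypothetical threshold) is
    replaced by a run length; the rest of the map is kept bit for bit.
    """
    cmap = character_map(text)
    runs = [(bit, len(list(group))) for bit, group in groupby(cmap)]
    prefix = suffix = None
    start, end = 0, len(cmap)
    if runs and runs[0][1] >= min_run:
        prefix = runs[0]
        start = prefix[1]
    if len(runs) > 1 and runs[-1][1] >= min_run:
        suffix = runs[-1]
        end -= suffix[1]
    return prefix, cmap[start:end], suffix


prefix, explicit_map, suffix = split_runs("12345678AB12")
```

In this example the eight leading digits are described by a single run length, so only four character-map bits remain to be encoded explicitly instead of twelve.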
Latches and shifts need to be a defined part of the current character set in order to be invoked from that set. Consequently, the shift/latch facility adds to the size of each character set, and therefore reduces the encoding efficiency of every character set in the scheme, even if never invoked for a given data set. In contrast, the Prefix/Infix/Suffix mechanism is defined outside of the character sets and is not encoded as part of any character set, and thus has no negative incremental efficiency penalty unless invoked.
Although such additional techniques do provide significant additional encoding efficiencies, the Prefix/Infix/Suffix mechanism used in Packed Objects is somewhat limited in the percentage of data mixes in which it is most useful. For example, if the first four characters of a data string include both letters and digits, then a Prefix cannot be used. As another example, the Infix mechanism includes a pointer to the starting position of the Infix within the data stream, which costs several additional overhead bits. Thus, the Infix mechanism is most beneficial when the data has particularly long runs of the same base. Finally, because Prefix/Infix/Suffix must contain either only digits or only non-digits, many data strings cannot use the Prefix and/or Suffix at all (because the string starts and/or ends with mixed characters), and the length of the allowable Prefix, Suffix, or Infix rarely constitutes a significant percentage of a large message due to the odds that at least one character would occur that would interrupt the single-base run.
Thus, what is needed is a system and method for encoding A/N and other mixed data strings when data strings contain substrings that are primarily, but not exclusively, from a single character class. Further, what is needed is an enhanced Prefix/Infix/Suffix mechanism to better handle mixed data sets and longer data sets. Moreover, what is needed are new methods and systems to flexibly mix ID tables of different sizes for different data systems in Packed Objects to help maintain backward compatibility.