The present disclosure relates generally to a method of, and an arrangement for, electro-optically reading, by image capture, two-dimensional symbols having characters encoded with an encoding scheme, with an imaging reader operatively connected to a host system, and, more particularly, to ensuring that the host system processes the characters using a compatible encoding scheme.
Solid-state imaging systems or imaging readers have been used, in handheld and/or hands-free modes of operation, to electro-optically read, by image capture, two-dimensional bar code symbols having encoded characters to be decoded into binary data that is indicative of the encoded characters. A known imaging reader captures light scattered and/or reflected from the symbol, converts and processes the captured light into the binary data, and transmits the binary data to a remote host system for further processing, e.g., information retrieval from an information database.
The encoded characters of each symbol may be encoded with a local character set that is indicative of a local language, e.g., English, Chinese, Japanese, Korean, Hebrew, etc. A character set is a collection of characters, e.g., letters, numbers, punctuation marks, symbols, etc., that are used to represent and support a local language, or a set of languages that share a common writing system. In order for controllers, e.g., programmed microprocessors, in imaging readers and host systems, as well as computers in general, to sort, store, print, display, and process characters, the characters must be represented by numeric values. An encoding scheme, also called a codepage, is an organized table in which a numeric index, also known as a code point value, is assigned to each character in a certain order, thereby allowing a character to be distinctively identified by its corresponding code point value. Since various languages or language groups use characters, e.g., accented or entirely new letters, which other languages or groups do not use, different languages have their own different character sets or different local codepages that support them. Since some languages, such as English, French, German, Italian and Spanish, require fewer than 256 characters, each of their characters can be represented by a single byte (8 bits); such a character set is known as a single-byte character set (SBCS). Some Asian languages that use ideographic characters, such as Chinese (traditional and simplified), Japanese, and Korean, have many thousands of characters, and each of their characters can be represented either by two bytes (16 bits), in what is known as a double-byte character set (DBCS), or by a variable number of bytes per character, in what is known as a multi-byte character set (MBCS).
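By way of illustration only, the relationship between a character, a codepage, and the number of bytes needed to represent the character can be sketched in Python as follows. The codec names (`cp1252`, `shift_jis`, `gbk`, `euc_kr`) are those of Python's standard library and are assumed here as examples of local codepages; they are not named in the present disclosure, and the helper functions are hypothetical.

```python
# Illustrative sketch: how many bytes a local codepage assigns to a character.
# The codec names are Python's own and are assumptions, not part of the disclosure.

SAMPLES = [
    ("A", "cp1252"),      # Latin letter, Western European single-byte codepage
    ("é", "cp1252"),      # accented letter, still a single byte
    ("A", "shift_jis"),   # the same Latin letter in a Japanese codepage
    ("日", "shift_jis"),  # Japanese ideograph, two bytes
    ("中", "gbk"),        # simplified Chinese ideograph, two bytes
    ("한", "euc_kr"),     # Korean syllable, two bytes
]


def code_units(ch: str, codepage: str) -> bytes:
    """Return the byte sequence that `codepage` assigns to the character `ch`."""
    return ch.encode(codepage)


def bytes_per_character(samples):
    """Map each (character, codepage) pair to its encoded length in bytes."""
    return {(ch, cp): len(code_units(ch, cp)) for ch, cp in samples}


if __name__ == "__main__":
    for (ch, cp), n in bytes_per_character(SAMPLES).items():
        # {ch!a} prints an ASCII-safe form so the output does not depend
        # on the console's own codepage.
        print(f"{ch!a} in {cp}: {code_units(ch, cp).hex()} ({n} byte(s))")
```

In this sketch, the Japanese codepage uses one byte for the Latin letter and two bytes for the ideograph, which is the variable-length behavior described above for a multi-byte character set.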
The encoded characters of each symbol may also be encoded with a global character set, i.e., Unicode, which is indicative of multiple languages. Unicode is an open character set maintained as a computing industry standard by the Unicode Consortium and has a repertoire of more than 120,000 characters that includes substantially all of the world's writing systems. Unicode essentially unifies and internationalizes all of the local codepages and local character sets into a single, master character set or global codepage. One character encoding format capable of encoding all the possible characters in Unicode is the Universal Coded Character Set Transformation Format, 8-bit (UTF-8), which is a variable-length encoding that uses one to four 8-bit code units per character.
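The variable length of UTF-8 can be illustrated with a short Python sketch. The function name `utf8_profile` is hypothetical and is introduced here only for illustration.

```python
# Illustrative sketch: UTF-8 uses between one and four 8-bit code units
# per Unicode character, depending on the character's code point value.

def utf8_profile(ch: str):
    """Return (code point value, number of UTF-8 bytes, bytes as hex) for one character."""
    encoded = ch.encode("utf-8")
    return ord(ch), len(encoded), encoded.hex()


if __name__ == "__main__":
    for ch in ("A", "é", "日", "😀"):
        code_point, length, hex_bytes = utf8_profile(ch)
        print(f"U+{code_point:04X}: {length} byte(s), {hex_bytes}")
```

A Latin letter occupies one byte, an accented Latin letter two bytes, a CJK ideograph three bytes, and a character outside the Basic Multilingual Plane four bytes.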
By international convention, the imaging readers transmit their binary data to a host system controller in accordance with the encoding scheme of each symbol in which its characters were encoded. The identity of the particular encoding scheme that was used to encode each symbol is not output from the imaging readers. In addition, the host system controller usually interprets the binary data from the imaging reader using the local codepage. For example, a host system controller operating in the United States would use the local codepage for English, whereas a host system controller operating in Japan would use the local codepage for Japanese. If the encoding scheme of the binary data matches that which the host system controller uses, then the further processing of the symbol is performed smoothly and accurately. However, if there is no match or compatibility, then the processing performance suffers, and the processing typically fails. Further complicating the situation is that in many countries, e.g., China, Japan and Korea (CJK), each symbol can have its characters encoded either in accordance with the respective country's local codepage, or in accordance with the global codepage, e.g., Unicode. It is often preferred to encode the symbols in accordance with a local codepage, rather than the global codepage, because using the local codepage is faster and more efficient than using the global codepage, and is often preferred for communicating with legacy applications that were written for use only with local codepages. In any event, it is not known which of these local or global codepages was used in encoding each symbol, because, as described above, the identity of the particular encoding scheme that was used to encode each symbol is not output from the imaging readers.
When the encoding scheme that the host system controller wants to use does not match the unknown encoding scheme of the encoded characters of the symbol being read, then the symbol will not be correctly or accurately processed, if at all.
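The mismatch problem described above can be illustrated with the following Python sketch. It assumes, purely for illustration, a symbol whose characters were encoded with a Japanese local codepage (Python's `shift_jis` codec) and a host that interprets the received bytes with one of several candidate codepages; the function name `interpret` and the choice of codecs are hypothetical and do not come from the present disclosure.

```python
# Illustrative sketch: the same binary data interpreted with a matching
# and with non-matching codepages. Codec names are Python's and are assumptions.
from typing import Optional


def interpret(data: bytes, host_codepage: str) -> Optional[str]:
    """Decode `data` as the host would; return None if the bytes are invalid
    in the host's codepage."""
    try:
        return data.decode(host_codepage)
    except UnicodeDecodeError:
        return None


# Binary data as a reader might transmit it: the word for "Japan",
# encoded with a Japanese local codepage. The codepage identity is NOT
# transmitted along with the bytes.
original_text = "日本"
binary_data = original_text.encode("shift_jis")

if __name__ == "__main__":
    for host_codepage in ("shift_jis", "cp1252", "utf-8"):
        result = interpret(binary_data, host_codepage)
        if result is None:
            status = "decoding failed"
        elif result == original_text:
            status = "match"
        else:
            status = "decoded, but to the wrong characters"
        print(f"{host_codepage}: {status}")
```

In this sketch, only the host using the matching codepage recovers the original characters. A host using a Western European codepage silently produces different characters, and a host expecting UTF-8 rejects the bytes outright, which corresponds to the two failure modes noted above: inaccurate processing, or no processing at all.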
Accordingly, it would be desirable to ensure compatibility between the controllers of the imaging readers and their host systems, and to ensure that the encoded characters are correctly and accurately processed and interpreted.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions and locations of some of the elements in the figures may be exaggerated relative to other elements to help improve understanding of embodiments of the present invention.
The method and arrangement components have been represented where appropriate by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments of the present invention so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.