The present application relates generally to technologies for encoding and decoding matrix code symbols that comprise multilingual text.
Matrix code symbols such as Data Matrix codes and QR codes are widely used for storing text or data. Examples of matrix code symbols include two-dimensional (2D) and three-dimensional (3D) matrix codes. 2D matrix codes are commonly referred to as 2D barcodes. In 2D barcode systems, the data is encoded in a matrix of black and white cells that represent “0”s and “1”s. Text and data can be encoded in the matrix using various encoding techniques, such as the American Standard Code for Information Interchange (ASCII). ASCII uses a 7-bit encoding scheme to define 128 characters, so the ASCII values of English characters lie between 0 and 127. Each English character is encoded by one codeword whose value ranges from 1 to 128, namely the character's ASCII value plus 1. Each English character therefore occupies one byte.
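By way of illustration only, the codeword mapping described above (ASCII value plus 1) can be sketched as follows. The function names `ascii_to_codewords` and `codewords_to_ascii` are hypothetical and chosen for this example; the sketch covers only the one-character-per-codeword mapping and ignores any other compaction or error-correction steps that a particular symbology may apply.

```python
from typing import List


def ascii_to_codewords(text: str) -> List[int]:
    """Map each 7-bit ASCII character to a codeword equal to its ASCII value plus 1."""
    codewords = []
    for ch in text:
        value = ord(ch)
        if value > 127:
            # Characters outside the 128-character ASCII set have no codeword here.
            raise ValueError(f"{ch!r} is outside the 7-bit ASCII range")
        codewords.append(value + 1)
    return codewords


def codewords_to_ascii(codewords: List[int]) -> str:
    """Invert the mapping: subtract 1 from each codeword to recover the character."""
    for cw in codewords:
        if not 1 <= cw <= 128:
            raise ValueError(f"codeword {cw} is outside the range 1 to 128")
    return "".join(chr(cw - 1) for cw in codewords)


print(ascii_to_codewords("Hi"))       # [73, 106]
print(codewords_to_ascii([73, 106]))  # Hi
```

In this sketch, "H" has ASCII value 72 and so maps to codeword 73, and "i" has ASCII value 105 and so maps to codeword 106; each character occupies exactly one codeword.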
One drawback of the ASCII standard is that it is limited to Latin-based text such as English. Unicode was introduced to represent other languages whose characters cannot be represented in the 128-character set. Unicode supports multilingual computer processing, typically by representing each character with 2 bytes, which consumes considerable space when the text is stored in a two-dimensional matrix code. Moreover, the amount of information that a 2D data matrix can hold decreases when the text comprises multiple languages, such as Arabic and English, or Japanese and French.
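The space cost described above can be illustrated with a short sketch. It assumes that the 2-byte representation corresponds to UTF-16 (which uses 2 bytes for every character in the Basic Multilingual Plane); the helper name `storage_cost` is hypothetical, and the figures are raw byte counts before any matrix-code-specific overhead.

```python
def storage_cost(text: str) -> dict:
    """Compare a one-byte-per-character ASCII cost with a two-byte UTF-16 cost."""
    is_ascii = all(ord(ch) < 128 for ch in text)
    return {
        "characters": len(text),
        # One byte per character, available only when every character is ASCII.
        "ascii_bytes": len(text) if is_ascii else None,
        # Big-endian UTF-16 without a byte-order mark: 2 bytes per BMP character.
        "utf16_bytes": len(text.encode("utf-16-be")),
    }


english = "Hello"
# Arabic word (5 letters), a space, then "Hello": 11 characters in total.
mixed = "\u0645\u0631\u062d\u0628\u0627 Hello"

print(storage_cost(english))  # {'characters': 5, 'ascii_bytes': 5, 'utf16_bytes': 10}
print(storage_cost(mixed))    # {'characters': 11, 'ascii_bytes': None, 'utf16_bytes': 22}
```

Under these assumptions, the English-only string doubles in size when stored with 2 bytes per character, and the mixed Arabic and English string cannot use the one-byte scheme at all, so even its English portion is stored at 2 bytes per character.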
There is therefore a need for a method to provide encoding and decoding of bilingual text in matrix code symbols with increased data capacity compared to conventional matrix code techniques.