This invention relates to producing, capturing and using visual identification tags for moving objects. More particularly, this invention addresses the need to identify one or more moving objects with the help of a standard digital camera, such as a web camera or the video frames of a mobile phone camera.
A bar code contains information represented by a linear series of spaced lines, wherein the width of the lines and the spacing between them vary. The code can be scanned to retrieve the information represented by the widths and spacing. A problem associated with bar codes is that they are difficult to read at a distance and can hold only a rather limited amount of information. In addition, they must be oriented properly in order to be read by a scanner. Two-dimensional barcodes, or matrix codes, contain a greater amount of information but are even more difficult to read and align.
Closest to the present invention are the “MaxiCode” matrix code used by UPS (Ref. 1), which uses black and white hexagons, and Microsoft's high capacity color barcode (Ref. 2), which uses colored triangles as optical coding units. Capturing known 2D matrix codes with a low resolution digital camera fails under changing illumination conditions or when the target is too distant. None of these codes can reliably identify a variable number of tags present at the same time in a moving camera's visual field.
FIG. 1 illustrates two commonly used 2D matrix codes. The Data Matrix code on the left and the QR code (Ref. 3) on the right encode the Assignee name and address, as on the front page of this application. Note the typical anchors, here squares, which are used to register the tags (that is, to move them into a standard position, or to acquire them). The anchors are found using template matching.
FIG. 2 illustrates the UPS “MaxiCode” for the same string as in FIG. 1. The use of black and white hexagons allows for a more economical use of space. The “bull's-eye” anchor is used for locating and registering the tag. Note the white space between two adjacent black hexagons, which is used for segmentation.
FIG. 3 illustrates the Microsoft high density color tags in {Black, Yellow, Cyan, Magenta} space (2 bits per triangle). The white spaces between successive rows are used for deskewing and alignment and are an integral part of that invention. The tags can be generated and stored on a dedicated Microsoft web server. For details and capture instructions, see Ref. 4.
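The figure of 2 bits per triangle follows from the four-color alphabet, since log2(4) = 2. The following is a minimal sketch of that arithmetic only. The order of the colors in PALETTE and the function names bytes_to_symbols and symbols_to_bytes are assumptions made for illustration; the actual symbol assignment, layout and error correction of the Microsoft format are not described in this text and are not reproduced here.

```python
# Hypothetical symbol order: the real format's color-to-bits mapping is not
# given in the text, so this assignment is an assumption for illustration.
PALETTE = ("black", "yellow", "cyan", "magenta")


def bytes_to_symbols(data):
    """Split each byte into four 2-bit values, most significant pair first."""
    symbols = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            symbols.append(PALETTE[(byte >> shift) & 0b11])
    return symbols


def symbols_to_bytes(symbols):
    """Inverse of bytes_to_symbols; expects a multiple of four symbols."""
    if len(symbols) % 4 != 0:
        raise ValueError("need four 2-bit symbols per byte")
    index = {name: i for i, name in enumerate(PALETTE)}
    out = bytearray()
    for start in range(0, len(symbols), 4):
        byte = 0
        for name in symbols[start:start + 4]:
            byte = (byte << 2) | index[name]
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    colors = bytes_to_symbols(b"ID")
    print(len(colors), "colored cells carry", 2 * len(colors), "bits")
    print(symbols_to_bytes(colors))
```

Under this assumed mapping, each byte occupies four colored cells, so a tag with N cells carries at most 2N bits before any redundancy is added.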
The storage capacity of the disclosed type of visual reference tags is necessarily restricted by the fact that the tags must be relatively large, so that they can be captured reliably from a distance and in an arbitrary rotational position. Among the exemplary embodiments disclosed herein, the storage capacity ranges from 8 to 139 bits for rotationally invariant codes. In this respect, visual reference tags face issues similar to those of RFID tags and can use similar techniques for extending their information content through additional external annotation. RFID tags, and in particular their support systems, are quite expensive and are often used for controlling or monitoring purposes.
In contrast, visual reference tags according to this invention do not require new infrastructure except software and network access: they can be printed on standard color printers, displayed occasionally, and captured through the low resolution video stream of any standard digital camera.
Consider a meeting where the participants wear name tags. Name tags and business cards are difficult for mobile devices to read reliably, partly because optical character recognition (OCR) is computationally expensive. Using RFID tags, smart cards, and similar electronic devices requires additional equipment and might be considered intrusive to privacy. Wearing a visual reference tag as disclosed in this invention, however, makes easy and reliable recognition of participants possible, allows conference services to be automated, and much more. Visual identification tags could provide information on demand at art and industrial exhibitions; serve official and private parties; automate the identification of service personnel who share the same weighing scales, cash registers, copiers, and the like in the retail and service sector; improve surveillance and/or robot tracking systems; etc.
The current invention is based on a systematic analysis of all relevant issues concerning the effective recognition of visual symbols. Hence, the design of the visual reference tags reflects the image processing and machine learning methods best suited to identifying them. The most important innovations are: 1) the use of a graph coloring strategy to enhance region identification and 2) the use of volume-based visual cues for robust target acquisition based on hue histogram matching. As a result, the tags can be identified at different resolution levels in only one sweep through the image. The robustness of the system is further increased by automatic color calibration, learning from examples, and run-time adaptation.
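By way of illustration only, hue histogram matching in general can be sketched as follows. The sketch assumes RGB pixels with 8-bit channels, twelve hue bins, saturation and brightness thresholds of 0.2, and histogram intersection as the similarity measure. None of these choices is taken from this text, and the function names hue_histogram and histogram_intersection are hypothetical.

```python
import colorsys


def hue_histogram(pixels, bins=12, min_saturation=0.2, min_value=0.2):
    """Normalized hue histogram of an iterable of (r, g, b) pixels, 0..255.

    Pixels that are nearly gray or nearly black carry no reliable hue and are
    skipped. The bin count and both thresholds are illustrative assumptions.
    """
    counts = [0] * bins
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if s < min_saturation or v < min_value:
            continue
        counts[min(int(h * bins), bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * bins
    return [c / total for c in counts]


def histogram_intersection(p, q):
    """Similarity in [0, 1]; 1 means the two normalized histograms coincide."""
    return sum(min(a, b) for a, b in zip(p, q))


if __name__ == "__main__":
    reference = hue_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)])
    candidate = hue_histogram([(250, 5, 5)] * 3 + [(5, 5, 250)])
    background = hue_histogram([(0, 255, 0)] * 4)
    print(histogram_intersection(reference, candidate))
    print(histogram_intersection(reference, background))
```

Because the histogram discards pixel positions, a comparison of this kind does not depend on the rotation of the candidate region, and normalizing by the pixel count makes it largely independent of the region's size in the image.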
The invention discloses a method, a system, and products related to visual reference tags for tagging and subsequently identifying moving objects using low resolution digital cameras, typically a web camera or a mobile phone digital camera. It describes, by way of example, a family of visual reference (REF) tags in increasing order of size and information capacity. Different applications, for instance coding a GPS coordinate or using the visual tags to navigate subway stations, supermarkets, etc., can thus choose the REF tag best suited to them: larger reference tags encode more information but are more difficult to decode. If appropriate error correcting codes are included, the size of the tag is bounded in practice by the resolution of the capture device and the expected maximal number of tagged objects in its visual field.
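The dependence of tag size on camera resolution and on the number of tags in view can be illustrated with a rough estimate. The sketch below assumes square tags that share the whole frame without gaps, and a minimum number of pixels across each coding cell; both are simplifying assumptions not stated in this text, so the result is an optimistic upper limit only. The function name max_cells_per_side and its parameters are hypothetical.

```python
import math


def max_cells_per_side(frame_width, frame_height, max_tags, min_pixels_per_cell):
    """Optimistic upper limit on the number of coding cells along one tag side.

    Assumes (for illustration only) that max_tags square tags divide the whole
    frame among themselves and that every cell must span at least
    min_pixels_per_cell pixels to be segmented.
    """
    if min(frame_width, frame_height, max_tags, min_pixels_per_cell) <= 0:
        raise ValueError("all arguments must be positive")
    pixels_per_tag = (frame_width * frame_height) // max_tags
    side_in_pixels = math.isqrt(pixels_per_tag)
    return side_in_pixels // min_pixels_per_cell


if __name__ == "__main__":
    # A 640 x 480 video frame shared by at most 12 tags, 8 pixels per cell.
    print(max_cells_per_side(640, 480, 12, 8))
```

In practice the usable figure is lower, because tags rarely fill the frame, may be viewed obliquely, and must reserve some cells for error correction.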