There is a need to provide hyperlink functionality in known objects without modification to the objects, by reliably detecting and identifying the objects based only on their appearance, and then locating and supplying information pertinent to the object or initiating communications pertinent to the object by supplying an information address, such as a Uniform Resource Locator (URL), pertinent to the object.
There is a need to determine the position and orientation of known objects based only on imagery of the objects.
The detection, identification, determination of position and orientation, and subsequent information provision and communication must occur without modification or disfigurement of the object, without the need for any marks, symbols, codes, barcodes, or characters on the object, without the need to touch or disturb the object, without the need for special lighting other than that required for normal human vision, without the need for any communication device (radio frequency, infrared, etc.) to be attached to or near the object, and without human assistance in the identification process. The objects to be detected and identified may be 3-dimensional objects, 2-dimensional images (e.g., on paper), 2-dimensional images of 3-dimensional objects, or human beings.
There is a need to provide such identification and hyperlink services to persons using mobile computing devices, such as Personal Digital Assistants (PDAs) and cellular telephones.
There is a need to provide such identification and hyperlink services to machines, such as factory robots and spacecraft.
Examples include:
identifying pictures or other art in a museum, where it is desired to provide additional information about such art objects to museum visitors via mobile wireless devices;
provision of content (information, text, graphics, music, video, etc.), communications, and transaction mechanisms between companies and individuals, via networks (wireless or otherwise) initiated by the individuals “pointing and clicking” with camera-equipped mobile devices on magazine advertisements, posters, billboards, consumer products, music or video disks or tapes, buildings, vehicles, etc.;
establishment of a communications link with a machine, such as a vending machine or information kiosk, by “pointing and clicking” on the machine with a camera-equipped mobile wireless device and then execution of communications or transactions between the mobile wireless device and the machine;
identification of objects or parts in a factory, such as on an assembly line, by capturing an image of the objects or parts, and then providing information pertinent to the identified objects or parts;
identification of a part of a machine, such as an aircraft part, by a technician “pointing and clicking” on the part with a camera-equipped mobile wireless device, and then supplying pertinent content to the technician, such as maintenance instructions or history for the identified part;
identification or screening of individual(s) by a security officer “pointing and clicking” a camera-equipped mobile wireless device at the individual(s) and then receiving identification information pertinent to the individuals after the individuals have been identified by face recognition software;
identification, screening, or validation of documents, such as passports, by a security officer “pointing and clicking” a camera-equipped device at the document and receiving a response from a remote computer;
determination of the position and orientation of an object in space by a spacecraft near the object, based on imagery of the object, so that the spacecraft can maneuver relative to the object or execute a rendezvous with the object;
identification of objects from aircraft or spacecraft by capturing imagery of the objects and then identifying the objects via image recognition performed on a local or remote computer;
watching movie previews streamed to a camera-equipped wireless device by “pointing and clicking” with such a device on a movie theatre sign or poster, or on a digital video disc box or videotape box;
listening to audio recording samples streamed to a camera-equipped wireless device by “pointing and clicking” with such a device on a compact disk (CD) box, videotape box, or print media advertisement;
purchasing movie, concert, or sporting event tickets by “pointing and clicking” on a theater, advertisement, or other object with a camera-equipped wireless device;
purchasing an item by “pointing and clicking” on the object with a camera-equipped wireless device and thus initiating a transaction;
interacting with television programming by “pointing and clicking” at the television screen with a camera-equipped device, thus capturing an image of the screen content and having that image sent to a remote computer and identified, thus initiating interaction based on the screen content received (an example is purchasing an item on the television screen by “pointing and clicking” at the screen when the item is on the screen);
interacting with a computer-system based game and with other players of the game by “pointing and clicking” on objects in the physical environment that are considered to be part of the game;
paying a bus fare by “pointing and clicking” with a mobile wireless camera-equipped device, on a fare machine in a bus, and thus establishing a communications link between the device and the fare machine and enabling the fare payment transaction;
establishment of a communication between a mobile wireless camera-equipped device and a computer with an Internet connection by “pointing and clicking” with the device on the computer and thus providing to the mobile device an Internet address at which it can communicate with the computer, thus establishing communications with the computer despite the absence of a local network or any direct communication between the device and the computer;
use of a mobile wireless camera-equipped device as a point-of-sale terminal by, for example, “pointing and clicking” on an item to be purchased, thus identifying the item and initiating a transaction.
Disclosure of Invention
The present invention solves the above stated needs. Once an image is captured digitally, a search of the image determines whether symbolic content is included in the image. If so, the symbol is decoded and communication is opened with the proper database, usually using the Internet, wherein the best match for the symbol is returned. In some instances, a symbol may be detected, but non-ambiguous identification is not possible. In that case, and when a symbolic image cannot be detected, the image is decomposed through identification algorithms in which unique characteristics of the image are determined. These characteristics are then used to provide the best match or matches in the database, the “best” determination being assisted by the partial symbolic information, if that is available.
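By way of illustration only, the two-stage flow described above (symbol lookup first, then matching on image characteristics assisted by any partial symbolic information) may be sketched as follows. This is a minimal sketch and not the decomposition algorithm of the invention: the names `Record`, `distance`, and `identify` are hypothetical, the image characteristics are assumed to be already extracted as a short numeric vector, a partial symbol is assumed to be a leading fragment of the full symbol, and the database is assumed to be non-empty.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Record:
    """Hypothetical database entry for one known object."""
    name: str
    symbol: Optional[str]       # symbol appearing on the object, if any
    features: Sequence[float]   # stored characteristics of its appearance
    url: str                    # information address for the object


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two characteristic vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def identify(symbol_text: Optional[str],
             features: Sequence[float],
             database: Sequence[Record]) -> Record:
    """Return the best match for a captured image.

    symbol_text is the decoded symbol (possibly partial) or None;
    features are the characteristics obtained by decomposing the image.
    """
    # Stage 1: a decoded symbol that matches exactly one record
    # identifies the object without ambiguity.
    if symbol_text:
        exact = [r for r in database if r.symbol == symbol_text]
        if len(exact) == 1:
            return exact[0]

    # Stage 2: no symbol, or an ambiguous one. Match on appearance,
    # using the partial symbol (assumed here to be a prefix) to narrow
    # the candidates when it is available.
    candidates = list(database)
    if symbol_text:
        narrowed = [r for r in database
                    if r.symbol and r.symbol.startswith(symbol_text)]
        if narrowed:
            candidates = narrowed
    return min(candidates, key=lambda r: distance(r.features, features))
```

In this sketch an ambiguous symbol is not discarded; it restricts the candidate set before the appearance comparison, which corresponds to the “best” determination being assisted by partial symbolic information.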
Therefore the present invention provides technology and processes that can accommodate linking objects and images to information via a network such as the Internet, which requires no modification to the linked object. Traditional methods for linking objects to digital information, including applying a barcode, radio or optical transceiver or transmitter, or some other means of identification to the object, or modifying the image or object so as to encode detectable information in it, are not required because the image or object can be identified solely by its visual appearance. The users or devices may even interact with objects by “linking” to them. For example, a user may link to a vending machine by “pointing and clicking” on it. His device would be connected over the Internet to the company that owns the vending machine. The company would in turn establish a connection to the vending machine, and thus the user would have a communication channel established with the vending machine and could interact with it.
The decomposition algorithms of the present invention allow fast and reliable detection and recognition of images and/or objects based on their visual appearance in an image, regardless of whether shadows, reflections, partial obscuration, and variations in viewing geometry are present. As stated above, the present invention also can detect, decode, and identify images and objects based on traditional symbols which may appear on the object, such as alphanumeric characters, barcodes, or 2-dimensional matrix codes.
When a particular object is identified, the position and orientation of the object with respect to the user at the time the image was captured can be determined based on the appearance of the object in an image. Applications include determining the location and/or identity of people scanned by multiple cameras in a security system, providing a passive locator system more accurate than GPS or usable in areas where GPS signals cannot be received, determining the location of specific vehicles without requiring a transmission from the vehicle, and many other uses.
When the present invention is incorporated into a mobile device, such as a portable telephone, the user of the device can link to images and objects in his or her environment by pointing the device at the object of interest and then “pointing and clicking” to capture an image. Thereafter, the device transmits the image to another computer (“Server”), wherein the image is analyzed and the object or image of interest is detected and recognized. Then the network address of information corresponding to that object is transmitted from the Server back to the mobile device, allowing the mobile device to access information using the network address so that only a portion of the information concerning the object need be stored in the system's database.
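The exchange described above may be sketched as follows, as an illustration only. The class `Server`, the function `point_and_click`, and the injected `recognize` callable are hypothetical names, the recognition step is left as a supplied function rather than implemented, and the network transport between device and Server is replaced by a direct call.

```python
from typing import Callable, Dict, Optional


class Server:
    """Hypothetical Server that maps recognized objects to addresses."""

    def __init__(self,
                 recognize: Callable[[bytes], Optional[str]],
                 addresses: Dict[str, str]) -> None:
        self._recognize = recognize
        # Only the network address is stored per object, not the
        # information itself, which resides at that address.
        self._addresses = addresses

    def handle_image(self, image: bytes) -> Optional[str]:
        """Analyze the image and return the matching address, if any."""
        object_id = self._recognize(image)
        if object_id is None:
            return None
        return self._addresses.get(object_id)


def point_and_click(image: bytes, server: Server) -> Optional[str]:
    """Device side: transmit the captured image, receive an address."""
    return server.handle_image(image)
```

The device would then retrieve the information from the returned address itself, so the Server's database holds only the address for each object.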
Some or all of the image processing, including image/object detection and/or decoding of symbols detected in the image, may be distributed arbitrarily between the mobile (Client) device and the Server. In other words, some processing may be performed in the Client device and some in the Server, without specification of which particular processing is performed in each, or all processing may be performed on one platform or the other, or the platforms may be combined so that there is only one platform. The image processing can be implemented in a parallel computing manner, thus facilitating scaling of the system with respect to database size and input traffic loading.
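One way the parallel implementation mentioned above might be arranged, offered only as a sketch under stated assumptions, is to divide the database into partitions, search each partition concurrently, and keep the best of the partial results. The names `best_in_shard` and `parallel_best_match` are hypothetical, each partition is assumed to be a non-empty list of (name, characteristics) pairs, and squared Euclidean distance stands in for whatever comparison the system actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Shard = List[Tuple[str, Sequence[float]]]


def best_in_shard(shard: Shard, query: Sequence[float]) -> Tuple[float, str]:
    """Return (squared distance, name) of the closest entry in one shard."""
    return min(
        (sum((a - b) ** 2 for a, b in zip(features, query)), name)
        for name, features in shard
    )


def parallel_best_match(shards: List[Shard],
                        query: Sequence[float],
                        workers: int = 4) -> str:
    """Search all shards concurrently and return the overall best name."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda s: best_in_shard(s, query), shards))
    # Merge step: the smallest distance across all shards wins.
    return min(partial)[1]
```

Because each partition is searched independently, adding partitions (and workers to serve them) accommodates a larger database. Threads are used here only to keep the sketch short; separate processes or machines would be the more likely choice for compute-bound matching.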
Therefore, it is an object of the present invention to provide a system and process for identifying digitally captured images without requiring modification to the object.
Another object is to use digital capture devices in ways never contemplated by their manufacturers.
Another object is to allow identification of objects from partial views of the object.
Another object is to provide communication means with operative devices without requiring a public connection therewith.
These and other objects and advantages of the present invention will become apparent to those skilled in the art after considering the following detailed specification, together with the accompanying drawings wherein: