1. Field of the Exemplary Embodiments of the Present Invention
The exemplary embodiments of the present invention relate generally to the field of interactive advertising, content download and related applications. In the exemplary embodiments of the present invention, the trigger for effecting an operation of information inquiry or content download is the user taking a picture/video shot of a printed/displayed object which is graphical in nature, rather than material which is simply “machine readable” code.
The embodiments described herein are illustrative and non-limiting. Definitions are provided solely to assist one of ordinary skill in the art to better understand these illustrative, non-limiting embodiments. As such, these definitions should not be used to limit the scope of the claims more narrowly than the plain and ordinary meaning of the terms recited in the claims. With that caveat, the following definitions are used:
“Activation mechanism” refers to the method that activates the recognition of a “symbol” (defined below) of the “link” (defined below). Links are typically supplied by a publisher with some selection mechanism, which activates the detection of the link's “frame” (defined below) and the recognition of the symbol inside the link's frame. There are many types of activation mechanisms, including, among others, pressing a button on the imaging device, zooming on the link, or voice activation.
“Banner” means any printed object surrounded by a “frame” (defined below), which is to be activated by the user in an imaging operation.
“Computational facility” means any computer, combination of computers, or other equipment performing computations, that can process the information sent by the imaging device. Some examples would be the local processor in the imaging device, a remote server, or a combination of the local processor and the remote server.
“Content” is a service or information prepared by the publisher, associated with the link, and available to the imaging device user via the activation mechanism of the link. For example, if a publisher has placed an image of an actor on a webpage, then the user pressing a button on an imaging device while taking a picture of the actor results in the display of the actor's biography or a coupon for his latest movie.
“Displayed” or “printed”, when used in conjunction with an imaged document, is used expansively to mean that an object to be imaged is captured on a physical substance (as by, e.g., the impression of ink on a paper or a paper-like substance, or by embossing on plastic or metal), or is captured on a display device (such as LED displays, LCD displays, CRTs, plasma displays, ATM displays, meter reading equipment or cell phone displays).
“Form” means any document (printed or displayed) where certain designated areas in this document are to be filled by handwriting or printed data. Some examples of forms are a typical printed information form where the user fills in personal details, a multiple choice exam form, a shopping web-page where the user has to fill in details, and a bank check.
“Frame” is some easily recognizable feature surrounding a symbol. Typically the frame is a rectangular color box around an icon or an image. However, a frame may also be non-rectangular (e.g., a cloud shape or a hand-written closed line), non-connected (e.g., a dashed line or corner markers), not colored (e.g., using page background colors or texture), or differing in other ways from a rectangular color box around an icon or image. Moreover, the frame, in the sense referred to in these exemplary embodiments of the present invention, need not surround the symbol (that is to say, the frame itself can be another small easily recognizable symbol, such as a computer mouse “arrow” icon, at a predefined offset from the symbol of interest).
“Image” means any captured view or multiplicity of captured views of a specific object, including, e.g., a digital picture, a video clip, or a series of images.
“Image recognition” means an array of algorithms for recognizing various objects in images and video data. These algorithms include, among others, optical character recognition (OCR), optical mark recognition (OMR), barcode recognition, alphanumeric data detection, logo and graphical symbols recognition, face recognition, and recognition of special marks.
“Imaging device” means any equipment for digital image capture and sending, including, e.g., 3G videophones, a PC with a webcam, a digital camera, a cellular phone with a camera, a videophone, a camera-equipped PDA, a video conferencing device, a personal computer tethered to a camera, or a laptop with a 3G modem card. Note that the imaging device may be digital, or may be analog (such as a TV-camera with a Frequency Modulated link), as long as the imaging device is connected to the computational facility.
“Link” is a symbol logically associated with some data or operation. Typically, selecting a symbol of the link activates dedicated software on the computational facility, resulting in the user receiving the service associated with the symbol of the link.
“Network connectivity” means a connection to a one-to-one or one-to-many data transmission network. Examples of the transmission network to which connectivity might be applied include a wireless 2G or 3G network, the Internet, an Ethernet network, and a private data network (where the private data network might be used, e.g., for security purposes). The connectivity could be achieved in any number of ways, including, e.g., wireless communication, cable, an Infrared connection, Bluetooth, a USB connection, a Firewire connection, or a WiFi link.
“Publisher” means a person or organization responsible for content preparation, including designing an object on which a link appears. Typically a publisher designs a page with symbols acting as links to some predefined information, updates the services considered by these exemplary embodiments of the present invention, and distributes the page with the links to the potential users of imaging devices.
“Symbol” refers to some well defined and recognizable object and/or feature that is visible to the imaging device. Typically symbols appear in the form of logos, icons or thumbnails, as a part of a designed page. In printed media symbols may be part of the page design, and in displayed media symbols may be part of a webpage. A symbol may also be any text or graphical object or handwritten mark. A symbol may be printed on a 3D object (e.g., a shirt), or may be any other object visible to the imaging device. In these exemplary embodiments of the present invention, “machine readable” symbols are those which are optimized for decoding by specific machinery, cannot be easily understood by a human observer, and must be forcefully introduced into the page design, as distinguished from “natural symbols”. In contrast, the “natural symbols” are those which may be easily recognized by a human and are naturally integrated into the page design.
“User” refers to the imaging device user. The imaging device user may be a human user, or an automated or semi-automatic system. An automated or semi-automatic system could be, e.g., a security system, which could be fully automated (meaning without direct human intervention) or semi-automatic (meaning that the system would be tied directly to specific people or to specific functions conducted by humans).
“Video call” means two-way and one-way video calls, including, e.g., calls performed via computers with web-cams. Any connection performed by an imaging device with a data connection and with video capture and sending capabilities could be included within the definition of “video call”. The video call is performed from a user to a computational facility, which takes actions according to the video data. Examples of video call protocols are the H.323 and 3G-324M video conferencing protocols, the IMS/SIP standard, and the proprietary protocols used in some software video conferencing clients (CU-SeeMe, Skype, etc.).
“Video data” is any data that can be encapsulated in a video format, such as a series of images, streaming video, video presentation, or film.
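By way of illustration only, the relationships among several of the terms defined above (frame, symbol, link, content, computational facility) may be sketched as simple data structures. This is a minimal sketch under stated assumptions, not a description of any particular embodiment: all class names, field names, and identifiers below are hypothetical, and the image recognition step that would produce a recognized symbol identifier is assumed to have already taken place.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Frame:
    """Easily recognizable feature used to locate a symbol (hypothetical model)."""
    shape: str                     # e.g. "rectangle", "cloud", "corner-markers"
    surrounds_symbol: bool = True  # a frame may instead sit at a predefined offset


@dataclass(frozen=True)
class Link:
    """A symbol logically associated with some data or operation."""
    symbol_id: str  # identifier assumed to be produced by image recognition
    frame: Frame
    content: str    # service or information prepared by the publisher


class ComputationalFacility:
    """Holds the links supplied by publishers and resolves recognized symbols."""

    def __init__(self) -> None:
        self._links: Dict[str, Link] = {}

    def publish(self, link: Link) -> None:
        # A publisher registers a link so that it can later be activated.
        self._links[link.symbol_id] = link

    def activate(self, recognized_symbol_id: str) -> Optional[str]:
        # Return the content associated with the recognized symbol, if any.
        link = self._links.get(recognized_symbol_id)
        return link.content if link is not None else None


if __name__ == "__main__":
    facility = ComputationalFacility()
    facility.publish(Link("actor-photo", Frame("rectangle"), "Actor biography"))
    print(facility.activate("actor-photo"))
```

In this sketch the activation mechanism (button press, zoom, or voice) is represented only by the call to `activate`; an unrecognized symbol simply yields no content.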
2. Description of the Related Art
One of the most powerful features of the Internet as a medium for browsing and information retrieval is the ability to connect every piece of graphics or text in a web page to a new link or URL. This feature also applies just as well to browser based information from other data sources (a data source which might be, e.g., a Corporate Intranet). This ability to connect gives the user great control and ability to navigate between subjects, and to make choices and create new information, all of this without any typing (that is, just by pointing and clicking).
On printed media and on static displayed media (e.g., a TV screen, a DVD player screen, or a computer screen where the user has no access to a keyboard/mouse), such capabilities do not exist. (Such printed and static display media will be referenced as “passive media”.) This is a problem in the use of passive media. In order to ameliorate this problem, many different and interesting technologies have evolved over the past decade to try to provide a “browsing-like experience” for these passive media. Some of those known technologies, in the form of systems and methods, are:
System 1: A system where each object in the printed medium is tagged with a unique numeric code. This code can be typed or read into the system by the user, and then used to access the related information. Such a technology has been promoted by a company called MobileSpear, Ltd, and also by companies such as Bango.net Limited.
System 2: A system where standard well-known, or possibly standardized, machine readable code, such as, e.g., a one-dimensional UPC barcode, has been added next to printed material. In this case, the user may use a standard barcode reader or other imaging device to read/scan the code. The system then utilizes a connection of this reader or device to a data transmission device, so that information related to the scanned code may be retrieved. This system and method are promoted by companies such as Airclic, Neomedia, and Scanbuy.
System 3: A system utilizing special or proprietary machine readable codes in a similar manner to System 2, the main difference from System 2 being that these new codes are smaller, more robust, have stronger visual appeal, are easier to spot, and/or are easier to decode by devices the users may have in their possession. That is, the specific encoding method used in these codes has been optimized for the constraints and requirements of the specific machine decoding hardware and software utilized by the users. For example, if the device used to image the code is a mobile handset with imaging capabilities, then there will be a specific coding method for that specific type of device. There exist many different implementations of this sort, a few of which are:
Implementation 1 of System 3: QR codes are two dimensional barcodes developed by Denso-Wave of Japan, used by most Japanese cellular phones.
Implementation 2 of System 3: Semacodes are different two dimensional barcodes designed for easy decoding by present day Western camera phones. This technology is developed by a company called Semacode, Inc.
Implementation 3 of System 3: Watermarks are two dimensional machine codes that are encoded using low intensity spread spectrum variations on the initial image, and which may be decoded using low quality digital imaging devices. Unlike the previously described machine codes, watermarks generally are not visible to the naked eye. Such a system has been developed by, among others, Digimarc, Inc.
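Systems 1 through 3 share a common pattern: a code that has been typed, scanned, or decoded is resolved to the related information. The following is a minimal sketch of that resolution step only, assuming the code has already been obtained as a text string. The registry contents, the function name `resolve_code`, and the example address are hypothetical and do not describe the products of any company named above.

```python
from typing import Dict, Optional

# Hypothetical registry mapping codes to related information.
CODE_REGISTRY: Dict[str, str] = {
    "12345": "https://example.com/offer",
}


def resolve_code(raw: str, registry: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Normalize a typed or scanned code and look up its related information."""
    table = CODE_REGISTRY if registry is None else registry
    # Drop spaces, dashes, and other separators a user might type or a reader emit.
    code = "".join(ch for ch in raw if ch.isalnum())
    return table.get(code)
```

A code that is not present in the registry resolves to nothing, which reflects the point that content must be tagged with a code before it can be linked.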
All of the systems described above share the following key drawbacks:
Drawback 1: Due to the very many different possible data encoding methods, it is hard for the content publisher to decide which system to incorporate into the printed/displayed content. Obviously, trying to accommodate all the different methods is impractical, as most of the visible “page” would be taken by the different machine readable codes.
Drawback 2: The visible machine readable codes present a visually non-appealing concept, and interfere with the graphic design of the content, as well as taking up valuable space on the page/screen.
Drawback 3: The non-visible codes (that is, watermarks as described above) require high quality printing/display and take relatively large areas (e.g., half or all of a whole page) for reliable encoding. Furthermore, since they are non-visible, it is hard for the user to know if these codes exist in a specific page and where to aim the scanner/phone/camera in order to scan these codes.
Drawback 4: The required alterations (adding codes and/or watermarks) to the content imply that this “linking” process must take place during the content preparation, and that certain rules must be kept while printing/distributing the content. For example, bar codes cannot be decoded if they are on the fold area of a newspaper/magazine, and/or if they are on glossy paper. (Watermarks on an advertisement will be decodable if printed on the high quality high density glossy printing used in magazines, but not if the same advertisement is printed in a regular newspaper.) Furthermore, it is impossible to “link” content which has been printed in the past before the incorporation of the chosen new codes.
Drawback 5: In most of these systems, there are uncertainties related to intellectual property issues, including whether royalties for use of the system should be paid, and if so to whom. These issues all stem from the use of special machine codes that have been developed and patented by some entities. Thus, there is a strong preference for being able to decode printed/displayed content without requiring any alterations to the content.
A company called Mobot, Inc., claims to have developed technology which allows the scanning/decoding of full page ads from a high quality magazine. This technology has the following drawbacks:
Drawback 1 of Mobot technology: Since only full page color magazine pages can be used, a large portion of the print media (newspapers, black and white publications, tickets, packaging, labels, envelopes, etc.) cannot be used, and similarly content displayed on a screen cannot be used.
Drawback 2 of Mobot technology: Since only a whole page is used, it is impossible for the user to select a specific part of the page. Hence, small advertisements, or different sections of an article/picture/advertisement, cannot be used. This limitation is equivalent to having only one clickable URL per Web page on the Internet, and makes the entire approach commercially unattractive.
Drawback 3 of Mobot technology: Often a page in a magazine/newspaper would include more than just the link symbol, with the result that even if other objects on the page are not to be linked, the user will not understand which link on the page he or she is currently activating.
Drawback 4 of Mobot technology: When a whole page is used, in many cases the user will take a picture which contains just a part of the page for convenience reasons (e.g., the magazine is on a table, or the user is sitting and does not wish to move far enough from the medium to image the whole page). Then, in a setting where many advertisements are already in the database, it may become difficult, or even impossible, to differentiate between a photo of a part of one page and a photo of part of a different page. For example, imagine the user taking a picture of the part of a page containing a corporate logo and some uniform colored background, as would often appear at the bottom of many advertisements. This kind of photo could be generated from many different advertisements of the same company, and thus one could not know which advertisement specifically the user is looking at.
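The ambiguity described in Drawback 4 can be illustrated with a toy model. The sketch below assumes, purely for illustration, that each advertisement in the database is represented by a set of named visual features and that a partial photo yields a subset of such features. The feature names, advertisement identifiers, and the function `candidate_ads` are hypothetical, and the set-containment test is a stand-in that is not intended to represent the matching method actually used by any whole-page system.

```python
from typing import Dict, FrozenSet, List

# Hypothetical database: advertisement identifier -> features visible on the full page.
AdDatabase = Dict[str, FrozenSet[str]]


def candidate_ads(photo_features: FrozenSet[str], database: AdDatabase) -> List[str]:
    """Return every advertisement consistent with the features seen in the photo."""
    return sorted(
        ad_id
        for ad_id, page_features in database.items()
        if photo_features <= page_features  # all photographed features appear on the page
    )


if __name__ == "__main__":
    ads: AdDatabase = {
        "car_ad": frozenset({"logo", "blue_background", "car"}),
        "truck_ad": frozenset({"logo", "blue_background", "truck"}),
    }
    # A photo of only the bottom of the page: logo plus uniform background.
    print(candidate_ads(frozenset({"logo", "blue_background"}), ads))
    # A photo that also includes a distinguishing element.
    print(candidate_ads(frozenset({"logo", "car"}), ads))
```

When the photographed portion contains only elements common to several advertisements, more than one candidate remains and the specific advertisement cannot be determined.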
Some notable examples of systems utilizing machine readable codes include:
Example 1: Proprietary “machine code” readers, such as a product known as “the CueCat”.
Example 2: Standard barcode readers such as those manufactured by Symbol, Inc.
Example 3: Still digital cameras, and tethered or web cameras. These provide high quality images, and are thus especially suitable for watermark scanning. Such web cameras are used by companies such as Digimarc for decoding watermarks.
Example 4: Portable imaging devices such as cellular phones with cameras, PDAs with cameras, or WiFi phones with cameras. These devices have the great advantage of being portable and of having a wireless data transmission capability. Thus, the user can access the content in a great variety of locations and while using a personal device carried with the user at all times.