Wireless devices are increasingly capable of accessing the internet. When the user of a wireless device encounters a universal resource locator (URL), also known as a uniform resource locator, that is written on a billboard, on paper, or in some other print or advertising context, the user typically must type the URL into the device in order to access the web site in question. This manual entry can be a tedious and time-consuming process, and long addresses are especially annoying to type.
A user of a wireless device may be outdoors, perhaps at a bus stop, where there is an advertisement containing an interesting URL. In such an outdoor situation, it can be very awkward and distracting to use a keyboard to type the URL. Moreover, the user is likely to make typing errors, which can be frustrating and may lead to an incorrect web page, such as a page indicating that the requested server cannot be found.
These problems suggest that improved automated entry of URLs would be useful. Wireless methods and devices have already been invented that include a camera in the mobile device, the mobile device being wirelessly connected to a server. The server extracts text from an image captured by the camera. See Aarnio (U.S. Pat. No. 6,522,889). However, such inventions are directed at determining a user's geographic location, or at providing language translation, and are not well-adapted to reading or accessing internet addresses without extensive assistance from a server that must be accessed via the wireless network.
Technology has also been developed to allow the user of a mobile device to scan a barcode and thereby obtain automatic access to an internet site indicated by the barcode. However, that type of system has disadvantages too. For example, the barcode must be in close proximity to the scanner, and of course the barcode itself must be provided, instead of or in addition to a plain text URL. Even if an optical character reader is used instead of a barcode scanner, existing technology still requires the reader to scan the URL text in close proximity to the text and at a particular orientation with respect to the text. Furthermore, the reader must be an automatic holographic laser scanner instead of a camera. See Wilz (U.S. Pat. No. 6,505,776, column 21). Other mobile image scanners could be used, but they present the same or similar problems. See FUJITSU Sci. Tech. J., 34, pp. 125-132 (September 1998).
Generally speaking, digital cameras that shoot arbitrary scenes and landscapes have been unsuitable for collecting character information from documents, signs, or the like. Infrared digital cameras have been employed for character collection in smart pens, for example in the Ericsson ChatPen, which shines infrared light on dots created by the ChatPen itself. Cameras have not, however, been used in more general contexts, where sophisticated scanners such as laser scanners have been needed instead.
Text can be acquired from images by the well-known process of optical character recognition (OCR). Yet a major limitation of OCR software is that a directly frontal image of the document has normally been required. The automatic recognition of text in arbitrary scenes, where the text may or may not be perpendicular to the line of sight, is a developing science that needs to be more fully exploited and tailored to recognize particular character strings. See “Locating Text in Indoor Scenes,” by Mirmehdi and Clark (available at www.cs.bris.ac.uk/Research/Digitalmedia/docum.html).
OCR generates character codes to match the pictured text. If OCR works perfectly, the character codes exactly match the text that has been scanned or photographed. However, scanning is part science and part art. Errors are inevitable, and therefore a good OCR system for acquiring internet URLs must cope not only with situations where the OCR works perfectly, but also with situations where the OCR does not initially work perfectly.
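By way of illustration only, the following minimal Python sketch shows one way imperfect OCR output for a URL might be repaired. It is not the method of any system cited above. The function name repair_url, the table of look-alike characters (for example "vv" read in place of "w", or "7" in place of "t"), and the choice to default to "http://" when no scheme is recognized are all assumptions made for this example.

```python
import re

# Hypothetical look-alike patterns; these tables are assumptions for the sketch.
# Scheme: "http" or "https" where each "t" may be misread as 7, l or 1,
# the colon as a semicolon, and each slash as l, 1 or |.
SCHEME = re.compile(r"^h[t7l1]{2}p(s?)\s*[:;]\s*[/l1|]{2}", re.IGNORECASE)
# "www." where each "w" may be misread as "vv" and the dot as a comma.
WWW = re.compile(r"^(?:w|vv){3}\s*[.,]", re.IGNORECASE)


def repair_url(raw: str) -> str:
    """Return a best-effort URL from a raw OCR string (illustrative only)."""
    text = raw.strip()
    scheme = "http://"  # assumed default when no scheme is recognized
    match = SCHEME.match(text)
    if match:
        scheme = "http" + match.group(1).lower() + "://"
        text = text[match.end():]
    match = WWW.match(text)
    if match:
        text = "www." + text[match.end():]
    # A URL contains no whitespace, so drop any gaps left between glyphs.
    text = re.sub(r"\s+", "", text)
    return scheme + text


if __name__ == "__main__":
    print(repair_url("h7tp;//www. example .com"))
```

The sketch corrects only the distinctive leading portion of a URL, where the expected characters are known in advance; the remainder of the address is passed through with whitespace removed, since nothing in this example can tell a misread character there from a correct one.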
OCR works by analyzing glyphs (a glyph is the visual image of a character) in order to yield character codes for alphanumeric or punctuation characters. Glyphs (images) and characters (symbols) are two linked but distinct concepts. The unique visual aspects of URL glyphs have yet to be fully exploited in the context of character recognition in a wireless environment.