Mobile communication devices with an integrated camera module are increasingly entering the market and already show good market penetration and use in daily life, taking over the market of simpler mobile phones. Far beyond telephony and email reading, they are used for taking pictures, for Internet access and for other services, including WAP services and camera-based services. With increasing interaction between the user and the mobile communication device and with increasing complexity of input data, the interface between the user and the mobile communication device becomes more important. More advanced mobile phones and mobile communication devices, for instance from Research in Motion (Blackberry) and from Nokia (Communicator), comprise a more complete keypad, as they were mainly intended for receiving and sending email in addition to normal telephony. Subsequent and more recent mobile communication devices, for instance from Apple (iPhone), Nokia, Motorola and others, comprise an even bigger screen in the form of a touch-screen, which allows more convenient Internet use and picture viewing and also provides a much better data input possibility. For the sake of clarity, the term “smart-phone”, which is also found in the market, is represented herein by the term mobile communication device. Speech recognition via a built-in microphone and speech recognition algorithms can be used for selecting phonebook names and for a small number of control commands. However, in mobile communication devices the touch-screen or the built-in keypad is necessarily small, and data input remains cumbersome, as new applications steadily appearing on the market require ever more interaction and an increasing amount of data input.
As the integration of camera modules in mobile phones and in mobile communication devices is already state of the art, easier text input possibilities would be appreciated, for instance using the built-in camera module for text recognition and for optical input of text.
EP-08 169 713, by the same inventors as the current invention, discloses a method and a device for using a mobile communication device as a reading device for blind or visually impaired people. To this end, an image of the text page to be read is captured semi-automatically, the image data are then transformed via optical character recognition (OCR) into text data, and the text data are then converted via text-to-speech software into audio data, which are output via a loudspeaker of the mobile communication device. The text data within a book page, a newspaper, a menu or the like are instantly recognized and read out as soon as an image of said text is captured, wherein filtering, including shadow compensation, rotation of the text, binarization, unwarping and the like, is performed in preparation for the subsequent OCR and text-to-speech conversion. In essence, EP-08 169 713 describes a method for image filtering of captured images containing text, adapted for optimized OCR results, wherein the image is captured with camera modules of limited quality, as built into mobile phones or the like.
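The role of shadow compensation before binarization can be illustrated with a minimal sketch. It is not the filtering described in EP-08 169 713; it only assumes a grayscale image given as rows of values from 0 to 255, and the function names, the window radius and the offset are hypothetical choices made for illustration. A single global threshold marks shadowed paper as ink, whereas a threshold derived from the local neighborhood mean follows the brightness gradient.

```python
def binarize_global(gray, threshold=128):
    """Mark a pixel as ink (0) if it is darker than one fixed threshold."""
    return [[0 if value < threshold else 1 for value in row] for row in gray]


def binarize_local_mean(gray, radius=1, offset=10):
    """Mark a pixel as ink (0) if it is clearly darker than its neighborhood.

    gray   -- list of rows of 0-255 integers (assumed input format)
    radius -- half-size of the square neighborhood (illustrative value)
    offset -- margin below the local mean required for ink (illustrative value)
    """
    height, width = len(gray), len(gray[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Clip the neighborhood at the image border.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            out_row.append(0 if gray[y][x] < local_mean - offset else 1)
        result.append(out_row)
    return result


# Paper brightness falls from left to right (a shadow); columns 1 and 4 are ink.
row = [200, 110, 160, 140, 50, 100, 85]
image = [row[:], row[:], row[:]]

global_result = binarize_global(image)
local_result = binarize_local_mean(image)
```

In this small example the global threshold also classifies the two shadowed paper pixels at the right edge as ink, while the local threshold keeps them as paper. Local-mean thresholding has its own weaknesses, for instance at sharp shadow borders, so it stands here only as one simple representative of such filtering.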
EP-09 161 549, by the same inventors as the current invention, discloses a method for capturing an image, processing the image data, extracting text and objects therefrom and sending these data via the same camera-based mobile communication device to a server system for translations, for best-price searches, for obtaining information about the local position and the like. Data communication channels are described together with possible applications, which require input of respective keywords or text-parts as search targets. However, the keyword or text-part serving as search target still has to be typed in manually via the keypad of the mobile communication device.
WO 2008/063822 A1 discloses a method for a mobile communication device with an integrated camera, wherein a video stream is analyzed in real time to detect a word within a central region of the image, which is indicated by a cross-hair in the display. The word detected in this way is then indicated by a background color or by highlighting, and can then be selected as text input. This method is very useful as an alternative text input possibility to a keypad. However, the mobile communication device with its cross-hair has to be aimed precisely at the desired keyword for it to be detected, which can be difficult. In fact, while the camera module is hand-held, small angular variations of the camera module can cause large position variations of the captured text within the camera image. A keypress for capturing the image can often result in enough movement of the mobile communication device that the desired keyword moves away from the center point of the captured image and is no longer detected.
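The sensitivity of a cross-hair selection to small movements can be sketched as follows. This is not the method of WO 2008/063822 A1; it is a simplified illustration that assumes an OCR stage has already delivered word bounding boxes in pixel coordinates. The names `WordBox`, `word_under_crosshair` and `shift_words`, as well as the image size and box positions, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordBox:
    """A recognized word with its bounding box in image pixels (assumed format)."""
    text: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def word_under_crosshair(words: List[WordBox],
                         image_width: int,
                         image_height: int) -> Optional[str]:
    """Return the word whose box covers the image center, or None."""
    cx, cy = image_width // 2, image_height // 2
    for word in words:
        if word.contains(cx, cy):
            return word.text
    return None


def shift_words(words: List[WordBox], dx: int, dy: int) -> List[WordBox]:
    """Simulate camera movement by shifting all boxes within the image."""
    return [WordBox(w.text, w.x + dx, w.y + dy, w.width, w.height) for w in words]


# A 640x480 frame with the cross-hair at (320, 240).
words = [
    WordBox("hotel", 290, 230, 60, 20),
    WordBox("menu", 360, 230, 50, 20),
]

aimed = word_under_crosshair(words, 640, 480)
after_sideways_jitter = word_under_crosshair(shift_words(words, -45, 0), 640, 480)
after_vertical_jitter = word_under_crosshair(shift_words(words, 0, 30), 640, 480)
```

With these illustrative numbers, a sideways shift of 45 pixels selects the neighboring word instead of the intended one, and a vertical shift of 30 pixels leaves no word under the cross-hair at all.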
Reading a phone number on a text page in order to use it for a phone call sometimes requires typing in a number thirteen digits long (including the country code), which can be cumbersome, in particular for some elderly people whose short-term memory is no longer good. An easy optical method for recognizing a number on a text page and converting said number into usable input for a mobile phone or the like would therefore be an advantage.
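Once a text page has been converted to text by OCR, converting a printed phone number into dialable input could look roughly like the following sketch. The function name `extract_phone_numbers`, the set of accepted separators and the digit limits are assumptions made for illustration; the upper limit of 15 digits follows the maximum length of international numbers under ITU-T E.164. The pattern is deliberately crude and would need refinement for real documents.

```python
import re

# Candidate: optional "+", then digits mixed with common separators
# (space, parentheses, slash, hyphen), ending in a digit.
_CANDIDATE = re.compile(r"\+?\d[\d ()/\-]{5,}\d")


def extract_phone_numbers(ocr_text, min_digits=7, max_digits=15):
    """Return phone-number candidates from OCR text as dialable strings.

    Separators are stripped; a leading "+" is preserved. The digit limits
    are illustrative assumptions, not values taken from the source.
    """
    numbers = []
    for match in _CANDIDATE.finditer(ocr_text):
        raw = match.group()
        digits = re.sub(r"\D", "", raw)
        if min_digits <= len(digits) <= max_digits:
            prefix = "+" if raw.startswith("+") else ""
            numbers.append(prefix + digits)
    return numbers


page_text = "Reservations: +49 1512 3456789, open until 18:00 on 12.05.2010."
found = extract_phone_numbers(page_text)

other_text = "Fax 0049-89/1234567, room 12"
found_other = extract_phone_numbers(other_text)
```

The first example yields a single thirteen-digit number including the country code; the time and the date on the same line are not matched because colon and period are not accepted as separators.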
WO 2005/101 193 discloses a system comprising a scanner and at least one display and/or speaker to provide the user of the scanner with an indication of actions available for a portion of a document from which scanned information is obtained. Therein, the scanned information of the document is used either to identify the document among documents in a database or to identify, via markups in the document, the next possible user actions, which are indicated on the display or via loudspeakers. Said disclosure comprises an intelligent identification process for the scanned information of the document. However, it does not provide a method applicable to mobile communication devices for quickly determining and indicating keywords within a scanned text and highlighting the keywords for further selection.
It would also be desirable to have a solution wherein a user could hold a camera-based mobile communication device over a text page, whereupon an image containing the desired text-part is captured, the text-part is detected and selected, possible keywords are determined therefrom for further selection of the desired keyword, and the further selected keyword is taken as text input for a search application or the like.