The present invention relates to a method and a module for multimodal text input on a mobile device, either via a keyboard or in a camera-based mode in which the camera of the mobile device is held over a written text, such that an image of the written text is captured and the written text is recognized. The text entered via the keyboard or the recognized text, respectively, is output as the input text to an application receiving the input text.
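By way of illustration only, the two input modes described above can be sketched as follows. This is a minimal sketch and not the claimed implementation: the names `MultimodalInput`, `recognize`, `deliver` and `fake_ocr` are hypothetical, and the recognizer is a stand-in that merely decodes bytes where a real module would run OCR on the captured image.

```python
from dataclasses import dataclass
from typing import Callable


def fake_ocr(image: bytes) -> str:
    """Hypothetical stand-in for an OCR engine: decodes the bytes as UTF-8."""
    return image.decode("utf-8")


@dataclass
class MultimodalInput:
    """Routes text from either input mode to the same receiving application."""

    recognize: Callable[[bytes], str]  # assumed OCR backend (hypothetical)
    deliver: Callable[[str], None]     # application receiving the input text

    def from_keyboard(self, typed: str) -> None:
        # Keyboard mode: the typed text is passed on unchanged.
        self.deliver(typed)

    def from_camera(self, image: bytes) -> None:
        # Camera-based mode: the captured image is recognized first,
        # then the recognized text is passed on as the input text.
        self.deliver(self.recognize(image))
```

In this sketch the receiving application sees only a string in both cases, so it does not need to know which mode produced the text.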
Mobile devices such as mobile phones or smartphones with an integrated camera module and a display, for instance a touch screen display, have achieved high market penetration and are used in daily life far beyond simple telephony. The mobile device also serves as a pocket book, a memo book, a calendar with planner and an address book, and is used for receiving and writing SMS messages and emails, and so on.
A standard mobile device already offers an integrated camera module with a resolution of over 5 megapixels, sometimes even with optical zoom, and has a powerful microprocessor capable of over 300 MIPS (million instructions per second). However, text input on the mobile device for standard software applications is sometimes cumbersome on a small keyboard or on the touch screen keyboard.
EP 2333695A1 discloses a camera-based method for detecting the alignment of the mobile device relative to written text on a sheet of paper or on a display by analyzing the captured image of the written text. Immediate optical and acoustic feedback helps the user to align the mobile device faster and better with the written text, resulting in faster and better optical character recognition (OCR) for the text input into the mobile device.
EP 10161624 discloses another camera-based method for text input and for keyword detection, wherein the written text is captured in an image, converted via OCR and analyzed to find a most probable keyword therein. This kind of input of the written text into the mobile device facilitates text input for text translation applications and for internet searches about the written text in a book, for instance.
EP 08 169 713 discloses a method and a portable device, preferably a mobile communication device, for providing camera-based services including the internet, wherein an image of a text page is captured and the image data is processed such that text is recognized via OCR software for further use within an application.
Although said methods support text input into some special applications without the need to input the text tediously character by character via the keyboard, the application needs to be adapted to the program module implementing one of the disclosed methods. The use of said methods is thus limited to a rather small field of applications.
Current translator devices, such as “Sprachcomputer Franklin” from Pons, “Dialogue” or “Professional Translator XT” from Hexaglot, or “Pacifica” from Lingo Corporation, all use keyboard input for the words or sentences to be translated. However, inputting a word or sentence in another language on the keyboard is often difficult or even nearly impossible if unfamiliar characters, such as Chinese or Greek characters, have to be input.
An interesting approach to multimodal text input is disclosed in US 20110202836A1, wherein text can be input into an application of the mobile device via a keyboard or via speech, the speech being recognized and converted to the input text. However, a foreign word may sometimes be difficult to pronounce or spell correctly, and speech recognition also has its limitations. Capturing an image of the written text followed by OCR may therefore be advantageous in many cases.