1. Field of the Invention
The invention relates to a method for photographing a file to obtain images and providing the images to optical character recognition (OCR) software to recognize and output the characters and, more particularly, to a method for photographing a file with a camera, integrating the continuous dynamic images obtained by the camera (that is, the video stream), and providing the integrated image to the OCR software, which recognizes and outputs the characters.
2. Description of the Prior Art
Products related to OCR technology generally provide a subsidiary function for a scanner. A user needs to put a file on the scanner bed, and the scanner scans each page completely. Afterwards, the scanned image is analyzed by OCR software to extract the words on it. If a word, sentence, or paragraph is split across two pages, the user has to keep track of the order and reassemble the split word, sentence, or paragraph manually.
With the spread of handheld devices, persons skilled in the art have tried to apply OCR techniques to them. Two observations are relevant here. First, most users need only some characters or sentences of the file (and often want them translated; for example, some restaurant menus are printed in French). Second, characters are by nature arranged consecutively in lines, and people read text line by line. The scanning pen was developed for these reasons. To match this consecutive, line-by-line arrangement, the input interface (video lens) of the scanning pen is a line camera. The line camera treats two-dimensional characters as a set of line segments; it reads the line segments into the system one by one, and the system reassembles them into a two-dimensional image that the OCR software can then process.
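The reassembly performed for a line camera can be illustrated with a minimal sketch. It assumes that each line segment is a vertical strip of equal length captured as the pen moves from left to right, and that pixels are plain 0/1 values; the function names assemble_line_scans and render are hypothetical and are not taken from any scanning-pen product.

```python
def assemble_line_scans(scans):
    """Combine 1-D line scans into a 2-D image (a list of rows).

    Assumption: scans[i] is the i-th vertical strip, listed top to bottom,
    so strip i becomes column i of the resulting image.
    """
    if not scans:
        return []
    height = len(scans[0])
    if any(len(scan) != height for scan in scans):
        raise ValueError("all line scans must have the same length")
    # Transpose the list of columns into a list of rows.
    return [[scan[y] for scan in scans] for y in range(height)]


def render(image):
    """Return a text rendering of a 0/1 image, one text line per row."""
    return "\n".join("".join("#" if px else "." for px in row) for row in image)


if __name__ == "__main__":
    # Three strips read while sweeping across a small letter "L".
    strips = [[1, 1, 1], [0, 0, 1], [0, 0, 1]]
    print(render(assemble_line_scans(strips)))
```

The resulting two-dimensional image is what would then be handed to the OCR software; a real device would also have to compensate for uneven pen speed, which this sketch ignores.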
However, popular handheld devices (such as mobile phones) are equipped with a two-dimensional camera module. The camera module inputs a series of consecutive two-dimensional images to the system; that is, its input mode is similar to that of a flatbed scanner. In use, the images are therefore processed page by page, which does not match people's habit of processing characters line by line. As a result, OCR technology on such devices is currently used mostly for business card recognition (BCR). If consecutive line-by-line input is wanted, additional hardware assistance is needed (such as the technique in patent CN2745288Y).