1. Field of the Invention
The present invention relates to a technology for automatically extracting a character from an image of a document with high time efficiency.
2. Description of the Related Art
FIG. 9 is a block diagram of a conventional automatic character recognition system. Conventionally, to extract a text character (including a line, hereinafter simply “a character”) included in image data acquired by an image-data acquiring unit 81 such as a scanner, a digital camera, or a facsimile, a series of operations is performed, namely image data acquisition, image display, and optical character recognition (OCR). These operations are performed by the image-data acquiring unit 81, a display unit 84, and an optical character-recognizing unit 86, respectively.
The image-data acquiring unit 81 is controlled by an image-data-acquisition control unit 82 configured by software operated on a computer. The display unit 84, such as a monitor on which the acquired image data is checked, is similarly controlled by a display control unit 83 configured by software. The optical character-recognizing unit 86 itself, and a character-recognition control unit 85 that controls the optical character-recognizing unit 86, are also configured by software. Unlike an earlier technology (for example, Japanese Patent Application Laid-open No. S52-102638) in which the character recognition takes a considerably long time, character recognition technology configured by software is becoming common, helped by the increased speed of recent calculating units (for example, Japanese Patent Application Laid-open No. H5-81466).
However, in the conventional automatic character recognition system described above, the control units 82, 83, and 85 are sequence-controlled by a higher control unit, namely an integrating unit 87. After the automatic character recognition operation starts, the integrating unit 87 acquires image data by causing the image-data-acquisition control unit 82 to operate, and uses the display control unit 83 to make the display unit 84 display the acquired image data. After the display is completed, the integrating unit 87 controls the optical character-recognizing unit 86 using the character-recognition control unit 85 and finally extracts the character.
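The sequence control described above can be illustrated with a minimal sketch. It is not taken from the conventional system itself: the class name `IntegratingUnit`, the method `run_once`, and the three callables are hypothetical stand-ins for the integrating unit 87 and the control units 82, 83, and 85, and the stub behavior in `demo` is assumed purely for illustration.

```python
from typing import Callable, List, Tuple


class IntegratingUnit:
    """Hypothetical stand-in for the integrating unit 87 (sequence controller)."""

    def __init__(
        self,
        acquire: Callable[[], bytes],       # stand-in for control unit 82
        display: Callable[[bytes], None],   # stand-in for control unit 83
        recognize: Callable[[bytes], str],  # stand-in for control unit 85
    ) -> None:
        self.acquire = acquire
        self.display = display
        self.recognize = recognize
        self.log: List[str] = []            # order in which the steps ran

    def run_once(self) -> str:
        # Each step blocks until it finishes, so only one control unit
        # is functional at any moment.
        self.log.append("acquire")
        image = self.acquire()
        self.log.append("display")
        self.display(image)
        self.log.append("recognize")
        return self.recognize(image)


def demo() -> Tuple[str, List[str]]:
    shown: List[bytes] = []                 # images "displayed" by the stub
    unit = IntegratingUnit(
        acquire=lambda: b"scanned page",    # assumed stub, not a real scanner
        display=shown.append,
        recognize=lambda img: img.decode("ascii").upper(),  # assumed stub OCR
    )
    text = unit.run_once()
    return text, unit.log
```

Because `run_once` calls the three steps strictly in order inside one application, a slow `recognize` step in this sketch holds up the next acquisition, and no step can be given priority over another.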
In this configuration, the image-data-acquisition control unit 82, the display control unit 83, the character-recognition control unit 85, and the integrating unit 87 constitute a single application. If the integrating unit 87 stops, the other control units also stop, and if the integrating unit 87 starts operating, only one of the control units is functional at a time, according to the sequence control executed by the integrating unit 87. Consequently, no matter how small the load on the computer is, priority cannot be given to the operation of the optical character-recognizing unit 86, which places the largest load on the calculating unit. This is inefficient.
Moreover, when other software is operating on the computer and the automatic character recognition operation is disrupted by an increase in the load on the calculating unit, which operation is disrupted depends on the integrating unit 87. Generally, the optical character-recognizing unit 86 is likely to become disrupted since it executes complex operations and places the largest load on the calculating unit. Even if the image data acquisition operation, which places a comparatively light load on the calculating unit, proceeds without disruption, the optical character-recognizing unit 86 is likely to become disrupted, so the efficiency of the automatic character recognition operation as a whole is considerably poor. This becomes even more problematic when an image-data acquiring unit such as a scanner is capable of acquiring image data at higher speed. In such a case, operations in the optical character-recognizing unit 86 should preferably be postponed, and precedence given to the functions of the image-data acquiring unit.