The present invention relates to a method and a document reading apparatus capable of reading character and image information recorded on documents at a higher efficiency for image processing and recognition purposes.
2. Description of Prior Art
In recent years, a number of optical character readers (OCR) have been used as means for inputting information into electronic computers. The character subsets read by this kind of OCR include not only the printed alphanumeric subset but also hand-written alphanumeric characters, hand-written KATAKANA characters, typed KANJI characters, and hand-written KANJI characters. The range of character subsets that can be read has widened as recognition techniques have developed.
The prior art optical character reading device is disclosed, for instance, in Japanese Patent Publication No. 60-20785 (1985).
As fundamentally shown in FIG. 1, in this kind of conventional OCR, a document 1 is scanned under control of a document controller 2, and the character and image information written on document 1 is read out by a photoelectric transducer 3 and stored into an image buffer 4. The character and image information stored in image buffer 4 is read out by a recognition unit 5 and subjected to character/image recognition, including segmentation, feature extraction, and the like of the characters and images. A reading controller 6 is also provided to control those units 2 to 5.
Image buffer 4 plays a significant role in efficiently coupling the document readout scanning system with the recognition processing system.
As shown in FIG. 2, document 1 is conveyed by document controller 2 to a readout scanning position 3b of photoelectric transducer 3 (for example, a line sensor 3a). There, the character and image information on document 1, optically scanned through a lens 3c, is photoelectrically transduced line by line at a predetermined resolution in the direction perpendicular to the conveying direction. In this manner, the information (character strings) 1a and 1b written on document 1 is read out. Image buffer 4 is constituted by, for example, two RAMs (random access memories) 4a and 4b connected in parallel. The information read out of document 1 in this manner is stored into respective RAMs 4a and 4b in units of, for instance, the character string of one line. After completion of the storage of character strings 1a and 1b, each corresponding to one line, image buffer 4 communicates with recognition unit 5, thereby subjecting the information of character strings 1a and 1b to the recognition processes, including segmentation, feature extraction, discrimination and identification, and the like of the characters.
From a viewpoint of processing efficiency, it is very disadvantageous that during the time interval when image buffer 4 communicates with recognition unit 5, the writing operation of the character and image information into image buffer 4 is interrupted. Namely, when scanning of document 1 is interrupted, the scanned information data is potentially damaged, or lost.
Therefore, as mentioned above, RAMs 4a and 4b constituting image buffer 4 are operated in a parallel structure, thereby allowing the writing operation by the readout of the document and the reading operation for the recognition process to be alternately executed in parallel.
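The alternating use of the two RAMs described above can be illustrated by the following minimal sketch. It is not taken from the cited publication; the class name PingPongBuffer and its methods are hypothetical, and each RAM is assumed to hold exactly one character line.

```python
class PingPongBuffer:
    """Two line buffers used alternately (hypothetical stand-ins for RAMs 4a and 4b)."""

    def __init__(self):
        self.rams = [None, None]  # None means the RAM is free
        self.write_index = 0      # RAM that receives the next scanned line
        self.read_index = 0       # RAM that recognition reads next

    def write_line(self, line):
        """Store one scanned line; return False if the target RAM is still occupied."""
        if self.rams[self.write_index] is not None:
            return False          # recognition has not yet freed this RAM
        self.rams[self.write_index] = line
        self.write_index ^= 1     # the next line goes to the other RAM
        return True

    def read_line(self):
        """Hand the oldest stored line to recognition and free its RAM."""
        line = self.rams[self.read_index]
        if line is None:
            return None           # nothing to recognize yet
        self.rams[self.read_index] = None
        self.read_index ^= 1
        return line
```

In this sketch, a write_line call that returns False corresponds to the situation in which scanning would have to be interrupted because both RAMs still hold unrecognized lines.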
On the other hand, requirements to improve the reading performance of such kinds of OCR are ever-increasing. For example, these requirements include not only an increase in the character categories to be read out, but also an increase in the degree of freedom in writing hand-written characters (namely, a degree of freedom in modification of character styles), liberalization of the document formats, realization of a high data processing speed, and the like. However, the conventional OCR as described above has the following problems.
First, it is apparent that the time required for recognition of the character and image information stored in image buffer 4 varies considerably, depending on the character categories. That is to say, printed alphanumeric characters and printed KATAKANA characters can be recognized relatively simply and at a high speed; conversely, in the case of hand-written KANJI characters, a long time is required for the recognition process, since the character pattern structure is complicated and there are many characters in the subset, including many similar characters.
Such a fact can be seen by example of the document shown in FIG. 3. In this case, document 1 contains character strings having different character subsets, namely, KATAKANA characters, HIRAGANA characters, KANJI characters, numerals, Roman characters, and a map. These characters are sequentially arranged in accordance with the scanning order perpendicular to the scanning direction and are scanned at a constant speed. As a result, the recognition time of the character and image information is necessarily prolonged as compared with the reading time. In this case, even if RAMs 4a and 4b of image buffer 4 shown in FIGS. 1 and 2 are parallel-connected, the readout operation of document 1 must be temporarily interrupted. This is because no further readout data can be stored in both RAMs 4a and 4b, resulting in a lower processing efficiency.
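The interruption described above can be made concrete with a simplified timing model. The following sketch is an illustration only: the function name scanner_stall_time is hypothetical, every line is assumed to take the same time to scan, recognition is assumed to process one line at a time in scanning order, and a buffer is assumed to be freed only when recognition of its line is complete.

```python
def scanner_stall_time(scan_time, recog_times, num_buffers=2):
    """Total time the scanner must wait for a free line buffer.

    scan_time   -- time to scan one character line (same for every line)
    recog_times -- recognition time of each line, in scanning order
    num_buffers -- number of line buffers (2 for RAMs 4a and 4b)
    """
    scan_done = []   # completion time of each line's scan
    recog_done = []  # completion time of each line's recognition
    stall = 0.0
    for i, recog in enumerate(recog_times):
        # Earliest moment the scanner is ready for line i.
        ready = scan_done[-1] if scan_done else 0.0
        start = ready
        if i >= num_buffers:
            # The buffer for line i is the one used by line i - num_buffers;
            # it is free only after that line has been recognized.
            start = max(ready, recog_done[i - num_buffers])
        stall += start - ready
        scan_done.append(start + scan_time)
        # Recognition of line i needs the scanned data and a free recognizer.
        prev = recog_done[-1] if recog_done else 0.0
        recog_done.append(max(scan_done[-1], prev) + recog)
    return stall
```

Under this model the stall is zero as long as each line is recognized faster than it is scanned, and it grows once recognition takes longer than scanning, since neither RAM can then be released in time.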
Moreover, as shown in FIG. 3, if a step 1c exists between the lines of the character strings (KATAKANA and HIRAGANA characters), these characters cannot be alternately written into the two RAMs 4a and 4b in such a manner that the character string of each line is separately written as a unit. In such a case, there is the further disadvantage that simultaneous write control is required for both image buffer RAMs 4a and 4b. Furthermore, in order to simultaneously write the character information into image buffer RAMs 4a and 4b, there is also the problem that the scan of document 1 needs to be interrupted until the recognition process for the character and image information stored in RAMs 4a and 4b is completed.
Secondly, if the character strings are formatted in the same direction as the document feed direction of document 1, as shown in FIG. 4, the foregoing readout control cannot be applied thereto. In general, the buffer memory capacity of image buffer 4 is designed such that the information of the character string written in one line can be sufficiently stored with the accuracy necessary for the recognition process. However, when document 1 is fed with a skew relative to the document feed direction, the readout area of the character and image information of one line extends beyond the image buffer size, so that not all of the information of the character string of one line can be stored.
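The effect of skew on the required buffer size can be estimated geometrically. The sketch below is an illustration under stated assumptions rather than a description of any particular OCR: a character line is modelled as a rectangle of a given length and height, the skew is an angle between 0 and 90 degrees, and the function names and the numeric values in the usage note are hypothetical.

```python
import math


def skewed_extent(line_length, line_height, skew_deg):
    """Extent of a skewed character line, measured across the unskewed line direction.

    A rectangle of size line_length x line_height rotated by skew_deg
    (0 to 90 degrees) spans length*sin(skew) + height*cos(skew) in that direction.
    """
    theta = math.radians(abs(skew_deg))
    return line_length * math.sin(theta) + line_height * math.cos(theta)


def fits_in_buffer(line_length, line_height, skew_deg, buffer_extent):
    """True if the whole skewed line still falls inside a buffer of buffer_extent."""
    return skewed_extent(line_length, line_height, skew_deg) <= buffer_extent
```

For example, with an assumed line 200 mm long and 5 mm high and a buffer covering 8 mm, a skew of 0.5 degrees still fits, whereas a skew of 2 degrees does not, because the length term grows quickly with the angle for long lines.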
To prevent such a problem, according to the conventional OCR, an amount of skew is detected in advance by the edge portion of document 1 to be conveyed. If the skew amount exceeds a predetermined value, the transportation of this document is regarded as an error and thus an instruction is given to the operator to re-enter the document into the OCR. However, the execution of such measures impedes the processing efficiency when continuously reading a large quantity of documents.
Thirdly, as a method of continuously processing a plurality of documents 1, the document convey paths are switched in accordance with the result of the recognition, and documents 1 are sorted and collected. In general, to switch over the document convey paths, documents 1 are continuously conveyed with a predetermined time interval between the successive documents to be continuously fed. This document feed time interval is not negligible, compared with the length of document 1.
In the prior art OCR, the period of time required to convey document 1 by the distance of the sum of the length of document 1 and the document feed interval may be set as the processing unit time for a single document (namely, the time longer than the unit time necessary to process only one document). In spite of such compromise, in the conventional OCR, image buffer 4 is controlled as mentioned above. Therefore, the time which can be allocated for the recognition process must be defined by the time necessary to scan document 1. Thus, the defined processing unit time cannot be effectively used, resulting in a long idle time.
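The relationship between the processing unit time and the time actually available for recognition can be expressed by simple arithmetic. The following sketch assumes a constant feed speed; the function name document_timing and the values in the usage note are hypothetical.

```python
def document_timing(doc_length, feed_gap, feed_speed):
    """Return (unit_time, scan_time, idle_time) for one document.

    unit_time -- time to convey the document length plus the feed interval
    scan_time -- time during which the document itself is scanned
    idle_time -- part of the unit time that line-by-line buffer control leaves unused
    """
    unit_time = (doc_length + feed_gap) / feed_speed
    scan_time = doc_length / feed_speed
    idle_time = feed_gap / feed_speed
    return unit_time, scan_time, idle_time
```

With an assumed document length of 200 mm, a feed interval of 100 mm, and a feed speed of 1000 mm/s, the unit time is about 0.3 s, of which only about 0.2 s coincides with scanning; the remaining 0.1 s, one third of the unit time, is not available to the recognition process under the buffer control described above.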
As described above, the conventional OCR has various problems that hinder improvement of the document reading and recognition efficiencies.
To solve such drawbacks, one solution has been proposed in which, instead of performing the line-by-line recognition control by line buffer RAMs 4a and 4b, a page buffer memory having a capacity sufficient to cover the entire document size is employed.
However, when all of the information contained in the document is written into such a page buffer memory, there is another problem that not only the slow reading scan is necessarily required, but also very complicated processes need to be executed to segment the desired character strings from the information. Accordingly, a high-speed process cannot be expected.
The above-described problems of the conventional OCR will now be summarized as follows.
First, it is obvious that the image buffer memory in this kind of OCR has the significant function as a buffer for matching the scanning unit (2, 3) with the recognition unit (5).
In the OCR employing two line buffer memories alternately operable in parallel, the time required for the information recognition is greatly affected by the degree of freedom in the writing operation, as well as by the document format and skew.
Conversely, the above problems may be solved to some extent by use of the image buffer memory having capacity sufficient to cover a document of the maximum size. However, another drawback then occurs. All of the unnecessary information written on the document must be scanned and stored while at the same time, the necessary information needs to be segmented from the entire information. As a result, the whole processing time is prolonged and a high-speed reading process cannot be expected.
Therefore, there is a need for an optical character reader that, with a relatively small-capacity buffer memory and under a constant document feed, can execute recognition of, for example, hand-written KANJI characters at a performance level that has hitherto been realized only by the conventional high-performance OCR.
The present invention is made in consideration of such circumstances and an object of the invention is to provide an apparatus for reading characters and images in which the degree of freedom in design of the document format can be improved, the fluctuation in recognition processing time for various kinds of character categories can be absorbed, and the document can be efficiently processed at a high speed.
More specifically, another object of the invention is to provide a document reading apparatus by which a plurality of documents can be continuously fed during the reading process at a substantially constant feeding speed, even if these documents contain hand-written characters and/or images that take much time for recognition.
Still another object of the invention is to provide a document reading apparatus which employs simple recognition arrangements, even if a plurality of documents are substantially constantly fed in the reading process, because the image memory operable under the scroll control can function as a buffer or damper memory.