This invention relates to a document orientation discrimination apparatus and method for discriminating the orientation of a document based upon image data obtained by optically reading the document. The invention relates also to a character recognition apparatus and method for recognizing characters in the image data.
A reading device used in character recognition processing in accordance with the prior art optically reads a manuscript using a so-called scanner to acquire image data and subjects the image data to character recognition. If the image data happen to be read in a state rotated by 90° or 180°, completely different codes are outputted as the result of character recognition. Though the acquired image data do undergo character recognition, the orientation of the characters is incorrect, as a consequence of which the results of character recognition are rendered incoherent.
Accordingly, in order that character recognition may be performed correctly when the orientation of the document is improper, a person corrects the direction in which the manuscript is read and re-enters the manuscript so that recognition processing will be performed correctly. However, it is now required that the apparatus be provided with a function for automatically discriminating the orientation of the document and rotating the image accordingly, for the following two reasons: (1) Since scanners execute processing at higher speeds and have begun to be equipped with an automatic document feeding function referred to as an autofeeder, there has been an increase in the processing of large numbers of manuscripts, and it is difficult for an individual to correct manuscript orientation one manuscript at a time. (2) In the case of a scanner for size A4 paper, there is only one way the manuscript can be placed in the apparatus.
FIG. 26, consisting of FIGS. 26A and 26B, is a diagram for describing typical techniques used to discriminate document orientation automatically. FIG. 26A illustrates a method according to which the results of area separation are used to extract a section 1000 having lines, as in a chart or table, the orientation of the section is observed (as by using the fact that the section is divided by long lines in the transverse direction) and the orientation of the document is recognized based upon the observation made. In another method, shown in FIG. 26B, projections (histograms 1001) in the longitudinal and transverse directions are detected, and orientation is judged by observing the breaks in the histograms (e.g., the shorter breaks in the histograms are adopted as being indicative of the transverse direction). Alternatively, the document is separated into areas, and the orientation of the document is discriminated based upon the transverse and longitudinal lengths of rectangular areas 1002 matched to the features of the character areas.
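By way of illustration only, the projection method of FIG. 26B may be sketched as follows. The sketch assumes a binary image held as a list of rows (1 for a black pixel, 0 for white), and it adopts one simple reading of the rule stated above: the axis whose histogram has the shorter breaks on average is taken as the transverse (line) direction. The function names and the synthetic page are hypothetical and are not taken from any prior-art apparatus.

```python
def projections(img):
    """Return (row histogram, column histogram) of black-pixel counts."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    return rows, cols


def break_lengths(profile):
    """Lengths of the interior runs of zeros (breaks) in a histogram.

    Leading and trailing margins are ignored.
    """
    runs, run, seen_ink = [], 0, False
    for value in profile:
        if value == 0:
            if seen_ink:
                run += 1
        else:
            if run:
                runs.append(run)
            run = 0
            seen_ink = True
    return runs


def text_direction(img):
    """Assumed heuristic: shorter mean break marks the line direction."""
    rows, cols = projections(img)
    row_breaks, col_breaks = break_lengths(rows), break_lengths(cols)
    if not row_breaks or not col_breaks:
        return "unknown"
    mean_row = sum(row_breaks) / len(row_breaks)
    mean_col = sum(col_breaks) / len(col_breaks)
    if mean_col < mean_row:
        return "horizontal"   # short gaps between characters lie along x
    if mean_row < mean_col:
        return "vertical"
    return "unknown"


def make_page(n_lines=3, n_chars=5, glyph=3, char_gap=1, line_gap=3):
    """Synthetic page: square 'characters' on a grid, horizontal lines."""
    width = n_chars * glyph + (n_chars - 1) * char_gap
    height = n_lines * glyph + (n_lines - 1) * line_gap
    img = [[0] * width for _ in range(height)]
    for line in range(n_lines):
        for ch in range(n_chars):
            y0, x0 = line * (glyph + line_gap), ch * (glyph + char_gap)
            for y in range(y0, y0 + glyph):
                for x in range(x0, x0 + glyph):
                    img[y][x] = 1
    return img


def rotate90(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


page = make_page()
upright = text_direction(page)
sideways = text_direction(rotate90(page))
```

On this synthetic page the breaks between characters are one pixel wide and the breaks between lines are three pixels wide, so the upright page is judged to have horizontal lines and the rotated page vertical lines. A real document with figures or irregular spacing would not separate so cleanly.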
Whether the document is oriented in the transverse direction or longitudinal direction is discriminated based upon the results of discrimination obtained by the methods described above. The image is rotated as necessary. The rotated image is then subjected to character recognition processing, and the results of recognition are obtained.
Expectations for character recognition have risen in recent years owing to the growing demand for the handling of greater quantities of documents. Character recognition units are now used in electronic filing and desktop publishing and are installed in apparatus such as copiers where large quantities of documents are processed. The character recognition apparatus makes it possible for characters contained in a document written on paper to be utilized in retrieval and to be processed by desktop publishing software.
Thus, as mentioned above, various automating techniques that do not require human intervention have been developed for character recognition apparatus. Techniques for automatically correcting document orientation are particularly important.
The conventional character recognition apparatus described above has the following drawbacks:
(1) Characters undergo recognition erroneously when the document has been entered in the wrong orientation.
(2) In a case where read image data are oriented on their side or upside down, confirmation by monitor or the like is difficult.
(3) Discrimination of document orientation is not accurate enough in general.
(4) Discrimination of orientation is not accurate enough, in particular, with regard to documents in which characters having different orientations are mixed.
These difficulties will now be described in simple terms.
(1) Occurrence of erroneous recognition due to difference in character orientation
FIG. 27, consisting of FIGS. 27A through 27D, is a diagram illustrating results of recognition in a case where the reading direction of the character "M" has been rotated. It should be noted that FIG. 27 is merely an example and that the results of erroneous recognition are not necessarily as illustrated. FIG. 27 illustrates instances in which recognition is erroneous or cannot be performed. More specifically, FIG. 27B shows "E" as the result of recognition in a case where the reading direction of the character "M" has been rotated by 270°, FIG. 27C shows "W" as the result of recognition in a case where the reading direction has been rotated by 180°, and FIG. 27D shows "Z" as the result of recognition in a case where the reading direction has been rotated by 90°. Character recognition is performed on the assumption that characters are oriented correctly, and character candidates are selected from the features obtained by recognition. If the reading direction is rotated, therefore, the results of recognition will be erroneous.
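The effect described above can be reproduced with a toy example. The following sketch assumes a hypothetical recognizer that holds only two 5 x 5 templates, "M" and "W", and selects the template with the smallest number of differing pixels; it is not the recognition method of any actual apparatus, and the bitmaps are invented for illustration.

```python
# Hypothetical 5 x 5 templates; 'X' is a black pixel, '.' is white.
TEMPLATES = {
    "M": ["X...X",
          "XX.XX",
          "X.X.X",
          "X...X",
          "X...X"],
    "W": ["X...X",
          "X...X",
          "X.X.X",
          "XX.XX",
          "X...X"],
}


def rotate180(glyph):
    """Rotate a glyph by 180 degrees (reverse the rows and each row)."""
    return [row[::-1] for row in reversed(glyph)]


def hamming(a, b):
    """Number of pixel positions at which two glyphs differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph):
    """Return the name of the nearest template, assuming upright input."""
    return min(TEMPLATES, key=lambda name: hamming(glyph, TEMPLATES[name]))


upright_result = recognize(TEMPLATES["M"])
rotated_result = recognize(rotate180(TEMPLATES["M"]))
```

Because the recognizer assumes upright input, the "M" read upside down matches the "W" template exactly and is reported as "W", in the manner of FIG. 27C.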
(2) Occurrence of difficulties when confirmation of image data is performed by monitor
FIG. 28, consisting of FIGS. 28A and 28B, is a diagram showing the manner in which image data, which have been read in by a scanner or the like, are displayed on a screen. FIG. 28A shows an example of a display in a case where a document of size A4 in the longitudinal direction has been read upon being placed longitudinally. Here the display is normal. FIG. 28B shows an example of a display in a case where a document of size A4 in the longitudinal direction has been read upon being placed on its side. When viewed by a person, the document appears as an image that has been rotated by 90°. This happens because of the relationship between the manner in which the document is written on paper (the document orientation) and the manner in which the manuscript is placed when the image thereof is entered from the scanner.
FIG. 29, consisting of FIGS. 29A through 29D, is a diagram for describing various ways in which a document is arranged on paper. FIG. 29A shows an A4 document arranged longitudinally. This is a document form often used for writing Japanese characters horizontally and for writing English documents. FIG. 29B shows an A4 document on its side. This is a document form in which the individual lines are long, a form often used when making a reduced copy of a size A3 or B4 document. FIG. 29C shows an A4 document turned on its side and divided down the center. This is a document form often used in a case where reduced-size copies are made of two size A4 documents in continuous fashion. FIG. 29D shows a vertically oriented A4 document in which characters are written vertically.
Scanners employ a variety of reading methods depending upon the type of machine. Examples of scanners are a flat-bed scanner in which manuscripts up to size A4 can be entered, and a scanner of the type in which a size A4 manuscript is slid and read in longitudinally. In such scanners the manuscript reading direction is uniquely decided. Depending upon the way the document is placed, therefore, the document orientation may be read in improperly.
There are also systems in which a manuscript is read by utilizing the scanner of a copier. Such a scanner offers a comparatively high degree of freedom in regard to how the manuscript to be read is placed. As a result, it is possible for an individual to place the manuscript in the correct orientation when entering the image. In particular, when a document having a large number of pages is read, it is possible for the manuscripts to be fed in automatically by an autofeeder and then read. However, when a document is introduced using an autofeeder, some images will be entered in the improper orientation if some pages in the document are improperly oriented or if documents having different text layouts are included.
A display having an abnormal orientation will result, as shown in FIG. 28B, for the reasons set forth above. In such a case it will be necessary to rotate the image to the correct orientation.
(3) Accuracy of document orientation discrimination
A high degree of accuracy is essential to discriminate the orientation of a document. Judgment using the lines of a table or chart contained in a document, as in the aforementioned example of the prior art, can result in mistaken discrimination of direction if a document does not have a table or chart or if the document contains mixed horizontal and vertical lines. In the case of the longitudinal and horizontal projections, direction of rotation can be detected comparatively accurately if the document has characters only and lines or paragraphs are clearly defined. In the case of a document containing figures or natural pictures, there is the possibility that orientation will be discriminated incorrectly. Furthermore, it is difficult to distinguish between 0° and 180° and between 90° and 270°, so the accuracy of orientation discrimination is poor.
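The difficulty of distinguishing 0° from 180° with projections can be seen from a small example. The sketch below, which again assumes a binary image held as a list of rows and uses a made-up page, shows that rotating an image by 180° merely reverses each histogram, so the lengths of the breaks are unchanged even though the image itself is different.

```python
def projections(img):
    """Return (row histogram, column histogram) of black-pixel counts."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    return rows, cols


def rotate180(img):
    """Rotate the image by 180 degrees."""
    return [row[::-1] for row in img[::-1]]


# Made-up asymmetric page: 1 is a black pixel, 0 is white.
page = [
    [1, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]

rows, cols = projections(page)
flipped_rows, flipped_cols = projections(rotate180(page))

# Each histogram of the rotated page is the mirror image of the original,
# so any measure based only on break lengths gives the same answer.
profiles_mirror = (flipped_rows == rows[::-1] and flipped_cols == cols[::-1])
images_differ = rotate180(page) != page
```

The same argument applies to the pair 90° and 270°, since those two images also differ from each other by a rotation of 180°.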
(4) Occurrence of erroneous recognition of orientation in document having mixture of characters of different orientations
FIG. 30, consisting of FIGS. 30A, 30B and 30C, is a diagram showing examples of documents in which one page of the manuscript contains a mixture of characters having different orientations. FIG. 30A illustrates a document having characters in the normal direction and characters in a direction different from the normal direction. Here the document contains characters describing a graph 1010. FIGS. 30B and 30C illustrate documents in which a single page of the manuscript is obtained by reducing the size of two pages of an original. One side of the document has characters arranged vertically and the other side of the document has characters arranged horizontally. Here the results of judging orientation differ depending upon which orientation of the characters in the document is used to judge the orientation of the document.
In a case where the conventional character recognition apparatus described above is used independently to recognize characters of a plurality of types in different languages, each language cannot be recognized correctly owing to differences in the characteristics of the languages. For example, if the letters of the alphabet undergo character recognition using an OCR dedicated to the Japanese language, lowercase alphabetic characters cannot be recognized since their characteristics differ so greatly from those of the Japanese language.
Accordingly, in order for a plurality of languages to be recognized by a single reading device, a recognition algorithm is provided for each language and the user employs an input unit to switch among the recognition algorithms corresponding to the languages. This allows highly precise character recognition to be performed. Further, it is required that a dictionary of each language be stored in the device even for one and the same algorithm. Whenever recognition is carried out, the user employs an input unit such as an operation panel to switch among the dictionaries of the languages corresponding to the characters that are to undergo recognition. This allows the characters of each language to be recognized. Furthermore, control for switching among the dictionaries is required.
However, the aforesaid method of applying character recognition to languages of a plurality of types while the user employs an input unit such as an operation panel to switch among the dictionaries of the various languages demands considerable labor of the user and slows down processing speed.
Further, if a read manuscript consists of a plurality of pages, character recognition is performed using an autofeeder in order to reduce the labor otherwise required with manual feeding of the pages of the manuscript. If the pages of the manuscript contain a mixture of pages in English and pages in Japanese, the user must enter a command whenever one page of the manuscript is read in. This not only detracts from the advantage of using an autofeeder but also ultimately results in lower processing speed.