According to widely known methods of text pre-recognition, a bit-mapped image is parsed into regions containing text and/or non-text content, and the text regions are further divided into objects such as strings, words, character groups, and characters.
Some known methods use preliminary document type identification to narrow the list of possible document types examined during analysis of the logical structure of the document.
According to this group of methods, document type identification is an independent step of document analysis that precedes logical structure identification. The document type and its list of properties are thus defined by the time the logical structure is determined. Or vice versa, document type identification may be an integral part of the logical structure identification process. In this case, the document type that most closely fits the analyzed image is selected.
Verification of the spatial orientation direction of a document image is addressed in a number of documents.
U.S. Pat. No. 5,031,225 (Jul. 9, 1991, Tochikawa et al.) discloses a method of verifying the spatial orientation of a document image using a preliminarily assigned character that is to be found in the document. The found character is compared against four models of that character, corresponding to the four possible orientation directions.
The most reliably matching model indicates the orientation direction of the image.
The method gives an erroneous result when text of different orientation directions is present in the document. It may also give an erroneous result if the character is not reliably recognized after the document has been converted into image form.
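The four-model comparison described above can be illustrated by the following minimal sketch. It is not taken from the cited patent: the binary glyph representation, the pixel-agreement score, and the names rotate_cw, build_models, match_score, and estimate_orientation are assumptions introduced only for illustration.

```python
from typing import Dict, List

Glyph = List[List[int]]  # 1 = dark pixel, 0 = light pixel


def rotate_cw(glyph: Glyph) -> Glyph:
    """Rotate a binary glyph by 90 degrees clockwise."""
    return [list(row) for row in zip(*glyph[::-1])]


def build_models(upright: Glyph) -> Dict[int, Glyph]:
    """Build four models of the reference character, keyed by clockwise angle."""
    models = {0: upright}
    current = upright
    for angle in (90, 180, 270):
        current = rotate_cw(current)
        models[angle] = current
    return models


def match_score(a: Glyph, b: Glyph) -> float:
    """Fraction of coinciding pixels (assumed measure); 0.0 if shapes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def estimate_orientation(found: Glyph, upright: Glyph) -> int:
    """Return the clockwise angle of the model that best matches the found character."""
    models = build_models(upright)
    return max(models, key=lambda angle: match_score(found, models[angle]))
```

Because the estimate rests on a single character, one misread or differently oriented character determines the result for the whole image, which is the weakness noted above.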
In U.S. Pat. No. 5,235,651 (Aug. 10, 1993, Nafarieh), the orientation direction of the image is estimated by setting up and testing a hypothesis at the level of elementary image units, analyzing the transitions from dark points (pixels) and regions to light ones and vice versa. If the examined hypothesis is not accepted, a new one is set up, in which the image is considered to be turned by a 90° angle.
The method does not work if text of various orientation directions is present on the form.
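The hypothesis-testing scheme based on dark/light transitions can be sketched roughly as follows. The acceptance criterion used here (more inked/blank transitions across rows than across columns, as expected for horizontal lines of text separated by blank gaps) is an assumed stand-in, since the exact criterion of the cited patent is not given in this description; all function names are likewise hypothetical.

```python
from typing import List, Optional

Image = List[List[int]]  # 1 = dark pixel, 0 = light pixel


def rotate_cw(image: Image) -> Image:
    """Rotate a binary image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def count_transitions(profile: List[bool]) -> int:
    """Count changes between inked (True) and blank (False) entries."""
    return sum(a != b for a, b in zip(profile, profile[1:]))


def accepts_horizontal_text(image: Image) -> bool:
    """Assumed criterion: text lines run horizontally if rows alternate more than columns."""
    rows = [any(row) for row in image]
    cols = [any(col) for col in zip(*image)]
    return count_transitions(rows) > count_transitions(cols)


def estimate_rotation(image: Image) -> Optional[int]:
    """Test the 0-degree hypothesis, then the 90-degree one; None if neither is accepted."""
    candidate = image
    for angle in (0, 90):
        if accepts_horizontal_text(candidate):
            return angle
        # Hypothesis rejected: consider the image turned by 90 degrees.
        candidate = rotate_cw(candidate)
    return None
```

A single global hypothesis of this kind assigns one direction to the whole image, so a form carrying text in several directions cannot be described correctly.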
In U.S. Pat. No. 5,471,549 (Nov. 28, 1995, Kurosu et al.), to determine the orientation direction of the image, characters are selected from the text one after another, and recognition of each is attempted under the assumption that the orientation direction is 0°, 90°, 180°, or 270°. The direction giving the best match is taken as the correct orientation of the document image.
As in the previous example, the method does not work if text of various orientation directions is present on the form.
In U.S. Pat. No. 5,592,572 (Jan. 7, 1997, Le), the problem is solved by dividing the image into a large number of objects of either text or non-text type. The orientation of each initial object is then estimated via character recognition, after which the objects are joined into larger ones whose orientation is estimated in turn. In the end, a single text object remains, covering the whole text field and carrying the corresponding orientation estimate.
The main shortcoming of the method is that orientation estimation is performed together with recognition of text portions, which reduces the throughput of the method.
In U.S. Pat. No. 6,137,905 (Oct. 24, 2000, Takaoka) and U.S. Pat. No. 6,148,119 (Nov. 14, 2000, Takaoka), the orientation direction is estimated by dividing the image into a plurality of regions, each having its own estimation weight coefficient. The orientation direction is then estimated via text recognition in each of these regions. The overall direction is estimated as the sum of the individual estimates taken with their weight coefficients.
The shortcoming of the method is its low throughput, which depends greatly on the recognition results.
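The weighted combination of per-region estimates can be sketched as follows. This is an illustration under assumptions, not the procedure of the cited patents: each region is assumed to yield one angle estimate, the weights are summed per angle, and the angle with the largest total is taken as the overall direction. The name combine_region_estimates and the tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (estimated angle in degrees, weight coefficient of the region)
RegionEstimate = Tuple[int, float]


def combine_region_estimates(estimates: Iterable[RegionEstimate]) -> int:
    """Sum the weight coefficients per angle and return the angle with the largest total."""
    totals: Dict[int, float] = defaultdict(float)
    for angle, weight in estimates:
        totals[angle % 360] += weight
    if not totals:
        raise ValueError("no region estimates supplied")
    return max(totals, key=lambda angle: totals[angle])
```

For example, with region estimates (0°, 0.5), (90°, 0.3), (90°, 0.3), and (180°, 0.1), the total for 90° is 0.6, which exceeds 0.5 for 0°, so 90° is selected. Every region estimate still requires a recognition pass, which is where the cost of this approach arises.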
In U.S. Pat. No. 6,169,822 (Jan. 2, 2001, Jung), a predetermined portion of the text is parsed from the image and subjected to recognition. If recognition fails, it is inferred that the image has a different orientation direction.
To achieve a reliable result with this method, a large number of text portions must be recognized, which inevitably reduces the throughput of the method.
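The recognize-then-fall-back scheme can be sketched as below. The recognizer is a hypothetical callback supplied by the caller (no real recognition engine is implied), and the order in which candidate angles are tried is an assumption. The sketch also returns the number of recognition attempts, to make the cost of the approach visible.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# Hypothetical recognizer: (text portion, assumed angle in degrees) -> success flag.
Recognizer = Callable[[Any, int], bool]


def try_orientations(
    portion: Any,
    recognize: Recognizer,
    angles: Sequence[int] = (0, 90, 180, 270),
) -> Tuple[Optional[int], int]:
    """Try each assumed angle in turn; return (first angle that succeeds, attempts made)."""
    attempts = 0
    for angle in angles:
        attempts += 1
        if recognize(portion, angle):
            return angle, attempts
        # Recognition failed: infer that the image has another orientation direction.
    return None, attempts
```

In the worst case every candidate angle costs a full recognition pass per text portion, so processing many portions for reliability multiplies the recognition work accordingly.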