The present invention relates to a method for the automatic correction of character skew in the acquisition of a text original in the form of digital scan values that are used for further processing.
In the point-by-point and line-by-line acquisition of text documents with the assistance of an electronic scanner, it is not always guaranteed that the text lines proceed exactly parallel to the scan direction. A skewed position relative to the scan direction (referred to in brief below as "skewed position") tends to occur when the scanner is designed as an above-table camera or book scanner and the original is freely displaceable on the table or on the support. Positioning with the assistance of marks is imprecise. Positioning with the assistance of stops is possible for single-sheet originals, but is hardly possible for book originals or periodicals. Scanners having an automatic draw-in apparatus can, when this apparatus is used, process only single-sheet originals.
A skewed position is also present when the original is in fact scanned with its edges parallel to the scan direction but is itself a copy of an original that was obliquely scanned.
Although in many instances a skewed position during scanning degrades the reproduction of the scanned original only aesthetically, it can be a considerable disruption when the scanning is followed by a structural or semantic analysis of the scan data. For example, typical character recognition methods tolerate only a limited skew of the original, among other things because of the problem of isolating text lines. Beyond this, a skewed position degrades, complicates, or slows, and raises the cost of, every method for the acquisition of horizontal and vertical structures, for example dark strokes in forms, underlining in a machine-written text, or white borders used as criteria for bounding text and image regions. Even a slight skew can have a disturbing effect when, for example, larger illustrations are to be identified as a unit. The method described below solves the cited problems provided that the original has pronounced horizontal and/or vertical structures. This, however, is precisely the case for that class of originals for which a skewed position is undesired.
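The difficulty of isolating text lines in a skewed original, noted above, can be illustrated with a minimal sketch. It is not the method of the invention; it assumes a simple row-projection approach to line isolation (counting dark pixels per scan row and looking for empty rows between lines) and a synthetic page of solid bars standing in for text lines. The function names `make_page`, `row_profile`, and `count_blank_rows`, as well as all dimensions, are hypothetical and chosen only for illustration.

```python
import math


def make_page(width, height, line_height, gap, skew_deg):
    """Build a synthetic binary page (1 = dark, 0 = white).

    Text lines are modeled as solid horizontal bars of `line_height` rows
    separated by `gap` white rows, then tilted by `skew_deg` degrees
    (assumption: a simple vertical shear stands in for the rotation).
    """
    slope = math.tan(math.radians(skew_deg))
    period = line_height + gap
    page = []
    for y in range(height):
        row = []
        for x in range(width):
            # Row position this pixel would have on the unskewed page.
            y0 = math.floor(y - slope * x)
            row.append(1 if (y0 % period) < line_height else 0)
        page.append(row)
    return page


def row_profile(page):
    """Number of dark pixels in each scan row."""
    return [sum(row) for row in page]


def count_blank_rows(page):
    """Rows without any dark pixel, i.e. usable separators between lines."""
    return sum(1 for dark in row_profile(page) if dark == 0)


if __name__ == "__main__":
    for skew in (0.0, 1.0, 3.0):
        page = make_page(width=200, height=60, line_height=6, gap=4,
                         skew_deg=skew)
        print(f"skew {skew:.1f} deg: {count_blank_rows(page)} blank rows")
```

In this sketch, the white rows that separate the lines on the unskewed page disappear once the vertical drift across the page width exceeds the gap between lines, so a row-based separation no longer finds any boundary. Real line-isolation methods are more elaborate, but they face the same geometric effect.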