For purposes of electronic storage, form processing, and/or electronic document transmission, paper documents are often optically scanned and then processed. Scanning of a document often results in a set of binary, e.g., black or white, pixel values representing the scanned image. In most cases where printed documents are scanned, the text will be black, or interpreted as black, with the background being white. In such cases, black pixel values are interpreted as corresponding to the foreground and white pixel values as corresponding to the background. For example, a black pixel value may be represented by a “1” pixel value and a white pixel value by a “0” pixel value.
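By way of illustration only, the binary representation described above may be sketched as follows. The function name `binarize`, the threshold value of 128, and the representation of a scan as rows of 8-bit grayscale values are assumptions made for this example and are not details of any particular scanner.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map 8-bit grayscale values to binary pixel values.

    Pixels darker than `threshold` (an assumed cutoff) become 1
    (black, foreground); all other pixels become 0 (white, background).
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


# A tiny 2x3 grayscale "scan": low values are dark ink, high values are paper.
scan = [
    [255, 250, 30],
    [240, 20, 25],
]
binary = binarize(scan)
```

With this convention, the three dark pixels in `scan` map to “1” values and the remaining pixels map to “0” values.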
As part of the scanning process, one or more sheets of paper may be scanned. Unfortunately, as part of the scanning process and/or as the result of previous copying errors, one or more sheets of paper may be, and often are, scanned upside down. Scanned images which are upside down are sometimes referred to as being inverted. The presence of one or more inverted pages in a scanned document makes the inverted images, e.g., pages of inverted text, difficult to read and can interfere with some image processing techniques commonly applied to scanned documents such as, for example, optical character recognition used to recognize text words in a scanned document. In many such instances, it is desirable to analyze the digital image of a scanned document and determine if one or more pages should be inverted, i.e., rotated 180 degrees, to place the image, e.g., page, in a right-side-up orientation. Doing so requires the ability to determine whether a text image is upside down or right side up.
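Once a page has been determined to be upside down, the re-orientation itself is a simple operation on the pixel values. The following is a minimal sketch only; the function name `rotate_180` and the representation of an image as a list of rows of 0/1 pixel values are assumptions made for illustration. It does not address the harder problem, noted above, of determining whether a page is upside down in the first place.

```python
from typing import List


def rotate_180(image: List[List[int]]) -> List[List[int]]:
    """Return a copy of a binary image rotated by 180 degrees.

    Reversing the row order flips the image top to bottom, and reversing
    each row flips it left to right; together this is a 180-degree rotation.
    """
    return [row[::-1] for row in reversed(image)]


# A tiny 2x3 "page" with a single black pixel in the top-left corner.
page = [
    [1, 0, 0],
    [0, 0, 0],
]
upright = rotate_180(page)
```

Applying `rotate_180` twice returns the original image, so the operation can be applied safely to any page identified as inverted without loss of pixel data.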
While a stack of papers, e.g., pages, forming a multipage document including text may be scanned in a single scanning operation, the stack of pages may include sheets that have different orientations. This may occur if, for example, one or more pages of the document were put in the scanner upside down. Thus, it should be appreciated that it is desirable to be able to identify the orientation of each page of a scanned document and to be able to re-orient the pages such that all pages of the document are oriented in the same manner, for example in a right-side-up manner, such that a user viewing the document can read each page without having to re-orient individual pages.
The detection, e.g., identification, of the orientation of an image or a portion of an image including lines of text in a generally reliable manner is particularly important if subsequent image processing which takes into consideration the orientation of the text in the image, e.g., optical character recognition, is to occur. It should be appreciated that, since the re-orientation process will precede optical character recognition in many applications, it is desirable that the orientation determination process not rely on the recognition of individual characters or words and be able to proceed prior to reliable recognition of the actual text words or phrases in the document being scanned. It should also be noted that it is desirable that the orientation determination process be capable of being performed independently from page to page, since the orientation of one page in a scanned document does not guarantee that the next scanned page will have the same orientation.
In view of the above discussion, it should be appreciated that there is a need for methods and apparatus for identifying the orientation of a scanned image including lines of text and/or for performing image processing operations on an image or portions of an image based on information regarding the orientation of the scanned image.