1. Field of the Invention
The present invention relates generally to the field of document scanning wherein document pages are encoded digitally for storage and reproduction. More particularly, the invention relates to a novel technique for analyzing and organizing scanned pages to detect erroneously fed or scanned pages and to place the scanned pages in a desired order with a minimal degree of operator intervention.
2. Description of the Related Art
Digital document scanners have become a widespread tool in many document handling and production tasks. At present, digital scanners for encoding text and images are utilized in office environments for the storage of documents, the transmission of documents, such as in facsimile machines, the copying of documents, such as in digital copiers, and so forth. Scanners may include hand-held devices, sheet-feed devices, and full page devices. Moreover, full page devices may include automatic sheet-feeding arrangements for drawing a stack of documents to be scanned over a scanning surface in batch processes.
Digital scanners of the various types mentioned above generally include light sources and light detectors arranged to reflect radiation from a surface to be scanned, and to receive reflected radiation. The reflected radiation is then encoded to provide data representative of discrete picture elements or pixels on the document surface. The light source and receiver elements may provide for either single-color scanning or multiple-color scanning. Moreover, various resolutions are available, each dividing the scanned image into different numbers of pixels and different spatial densities of pixels in an image matrix. Once the image has been scanned, software routines are employed for analyzing and reconstructing an image, such as on a computer monitor, printer or copier, and for transmitting the image data, such as in a facsimile transmission or data file.
Where multiple page documents are scanned in batch processes, an operator may control the accurate duplication and collation of the scanned pages by manually placing the pages on the scanner one-by-one, or by manually monitoring the scanning process. However, where automatic sheet-feeders are employed, an operator is often freed to pursue other tasks while the scanner sequentially draws in the pages from a stack and scans them in the stack order. Certain scanners, such as in copying machines, are also available with automatic collating features for two-sided copying wherein pages having useful information on both sides are flipped in sequential scanning operations and either single or double-sided reproductions are produced.
Several problems arise in automatic sheet-feeding scanners. For example, in single-side scanners, when two-sided documents are to be scanned, an operator must process the batch job on a first or recto side, then reinsert the stack for scanning on the second or verso side, and subsequently manually collate the pages to interleave the odd and even pages. Where the scanner is employed for facsimile transmissions and similar operations, the receiver must perform this collating operation. In the event of “misfeeds,” in which the scanner draws in more than one page at a time, the operator may receive a series of scanned pages which will not properly interleave due to the absence of one or more pages. The latter problem occurs not only in reproduction of double-sided documents in a scanner, but can result in the failure to fully scan even single-side documents. In either event, the operator is faced with the time-consuming task of sorting through the scanned pages to identify the missing pages, re-scanning the missing pages, and inserting the missing pages in the appropriate location in the document.
There is a need, therefore, for an improved technique for scanning and managing documents which permits misfed pages, or absent pages, to be identified easily with a minimal amount of operator intervention. There is also a need for a technique which can interleave or reorganize scanned pages, such as two-sided pages scanned in a single-side batch job, while flagging mismatching or missing pages in a sequence, also with a minimal degree of operator intervention.
The present invention provides a method and apparatus for scanning multiple page documents designed to respond to these needs. The technique may be employed on any type of scanner, including sheet-feed scanners or full page scanners, multi-function printers, copying machines, facsimile machines, and so forth. The technique permits a user to encode a series of pages, either single-sided or double-sided. Following the scanning process, data representative of the pages is analyzed to verify the order of the scanned pages, or to flag missing pages in a batch job. In the case of two-sided documents, the technique facilitates identification of misfeeds of either the recto or verso sides of the documents. The operator may then be notified of a misfeed, and may re-scan any missing pages. A similar technique is employed for single-sided documents. The technique permits automatic interleaving of scanned pages of two-sided documents, as well as insertion and re-ordering of pages in both two-sided and single-sided documents. Moreover, the technique may employ character recognition devices, such as optical character recognition, to identify page designations where these are present on one or more of the pages. The locations of the page designations may be automatically determined or may be input by an operator. Misfeed detection and interleaving may proceed based upon the recognized page designations. Where such character recognition techniques are employed, they may be used to verify that sections of batch jobs are presented in a uniform page orientation. The recognition may then prompt reorientation of certain pages in the batch job to provide consistency in the scanned data and page presentation.
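By way of illustration only, the interleaving of recto and verso scans and the flagging of a possible misfeed may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function name interleave_sides, the representation of each scanned page as a string label, and the verso_reversed parameter (for a stack whose verso pass arrives in reverse order) are all hypothetical. The sketch takes a simple disagreement in page counts between the two passes as the indication of a misfeed.

```python
from typing import List, Sequence, Tuple


def interleave_sides(
    recto: Sequence[str],
    verso: Sequence[str],
    verso_reversed: bool = False,
) -> Tuple[List[str], bool]:
    """Interleave recto and verso scans of a two-sided document.

    Returns the merged page list and a flag that is True when the two
    passes contain different numbers of pages (a possible misfeed).
    All names here are illustrative assumptions.
    """
    # Restore verso pages to front-to-back order if the stack was flipped.
    back = list(reversed(verso)) if verso_reversed else list(verso)

    # A count mismatch suggests that a page was skipped in one pass.
    mismatch = len(recto) != len(back)

    merged: List[str] = []
    # zip stops at the shorter pass, so on a mismatch the merged order
    # covers only the paired pages and should not be trusted as final.
    for front_page, back_page in zip(recto, back):
        merged.extend([front_page, back_page])
    return merged, mismatch
```

When the flag is set, the merged order is provisional, and the operator would be prompted to locate and re-scan the missing page before the interleaving is accepted.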
In addition to the basic interleaving and misfeed recognition functions, the technique may be adapted to provide some degree of tolerance in the handling of the scanned pages. For example, tolerances may be provided for pages which are considered to be first or last pages in a sequence, particularly where optical character recognition routines are employed to identify the proper page order. Similarly, anomalies in page numbering may be permitted, particularly for pages in a sequence where the number of pages between recognized sequenced pages is proper but no page designation is found or recognized.
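The tolerance described above may be illustrated by the following sketch, which assumes that character recognition has already produced, for each scanned page in order, either an integer page designation or no designation at all (represented here as None). The function name find_sequence_gaps and this data representation are hypothetical. Pages without a recognized designation are tolerated so long as the number of pages lying between two recognized designations is consistent with those designations.

```python
from typing import List, Optional, Sequence, Tuple


def find_sequence_gaps(
    designations: Sequence[Optional[int]],
) -> List[Tuple[int, int, int]]:
    """Report pages whose recognized number breaks the expected sequence.

    Each anomaly is (scan position, expected number, recognized number).
    Pages with no recognized designation (None) are skipped, which
    tolerates unnumbered first pages, last pages, and interior pages.
    """
    anomalies: List[Tuple[int, int, int]] = []
    last_index: Optional[int] = None
    last_number: Optional[int] = None

    for index, number in enumerate(designations):
        if number is None:
            continue  # no designation found: tolerated, not flagged
        if last_index is not None and last_number is not None:
            # Count the pages scanned since the last recognized number.
            expected = last_number + (index - last_index)
            if number != expected:
                anomalies.append((index, expected, number))
        last_index, last_number = index, number
    return anomalies
```

For example, a recognized sequence of 1, 2, (none), 4, 5 yields no anomalies, whereas 1, 2, 4, 5 reports that the third scanned page was expected to carry the designation 3, indicating a probable missing page.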