A conventional document conversion system scans printed documents and converts the documents into digital data. The digital data can then be stored by computer in an appropriate file. For example, printed text documents are converted and stored into text files, such as ASCII files. Likewise, printed graphic image documents are converted and stored into bitmap files such as Tagged Image File Format (TIFF) files. The converted documents can then be copied, edited, transferred, displayed and otherwise maintained as digital data.
Quality control is essential in a document conversion system. In the case of text documents, each printed character must be recognized using optical character recognition techniques. Optical character recognition is imperfect, however, and its error rate increases when text documents of lower print quality are scanned. For example, the number "1" may be converted into the letter "l". Similarly, in the case of graphic image documents, graphic conversion errors, such as raster errors, may occur during conversion. The quality of the converted image is compromised as a result. Thus, quality control is necessary for both text documents and graphic image documents to ensure that the documents are converted accurately.
Conventional quality control in document conversion is a manual process. After each printed document is scanned, converted and stored in a file, the file is printed out and compared to the original printed document by a human operator. The operator then records the number of conversion errors that occurred in converting each printed document. Unfortunately, printing out each file is very time-consuming. This becomes a severe problem when large batches of documents are converted.
Furthermore, to determine the accuracy of the conversion process, the operator must manually calculate a conversion accuracy percentage. In the case of text documents, for example, the operator must count all of the characters in the documents, count all of the erroneously converted characters, and then determine the conversion accuracy percentage based on these values. This is also very time-consuming, especially when large batches of documents are converted.
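The calculation described above can be illustrated with a short sketch. The text does not state the exact formula, so this example assumes the conversion accuracy percentage is the share of correctly converted characters out of all characters counted. The function names and the (total, errors) pair layout are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def conversion_accuracy(total_chars: int, error_chars: int) -> float:
    """Return the conversion accuracy as a percentage.

    Assumed formula (not given in the text):
        accuracy = 100 * (total - errors) / total
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    if not 0 <= error_chars <= total_chars:
        raise ValueError("error_chars must be between 0 and total_chars")
    return 100.0 * (total_chars - error_chars) / total_chars


def batch_accuracy(counts: Iterable[Tuple[int, int]]) -> float:
    """Return the accuracy over a batch of documents.

    Each item is a hypothetical (total_chars, error_chars) pair for one
    document. Counts are summed first, so longer documents weigh more.
    """
    pairs = list(counts)
    total = sum(t for t, _ in pairs)
    errors = sum(e for _, e in pairs)
    return conversion_accuracy(total, errors)


if __name__ == "__main__":
    # One document with 2000 characters, 10 of them converted incorrectly.
    print(conversion_accuracy(2000, 10))
    # A batch of two documents.
    print(batch_accuracy([(1000, 5), (3000, 15)]))
```

Summing the counts across the batch before dividing is one possible design choice; averaging per-document percentages would give a different figure when document lengths differ.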
Still further, conventional quality control employs 100% manual inspection. That is, every single scanned document is printed and the printout is visually examined by the operator. This is a very cumbersome process. Moreover, the method is fully dependent on the human operator to calculate the conversion accuracy percentage correctly. As a result, the process is quite error-prone. Thus, a more efficient method is needed for providing quality control in a document conversion system.