The present invention relates to a document processing system, and more particularly to a document processing system suitable for reading characters or a document in tabular form.
In a prior art optical character reader (OCR), the read region must be printed in a color that the OCR cannot detect (a dropout color), which raises the printing cost. Further, in using the OCR, it is necessary to indicate the character read region by its distance from an edge of the document and to designate the number of characters in the region, the character set and the check formula, which is troublesome. The positional information of the character read region, the number of characters in the region, the character set and the check formula are collectively called format information. A character recognition function is required not only in a stand-alone OCR but also in a document file and an office automation (OA) workstation, but the above problems have prevented its wide use.
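For illustration only, the format information described above can be pictured as one record per character read region. The following Python sketch is not taken from any prior art device: the names `FieldFormat`, `is_valid` and `digit_sum_mod10`, the millimeter units and the particular check formula are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class FieldFormat:
    """Format information for one character read region (hypothetical layout)."""
    left_mm: float                 # distance from the left edge of the document
    top_mm: float                  # distance from the top edge of the document
    width_mm: float                # width of the read region
    height_mm: float               # height of the read region
    num_chars: int                 # number of characters in the region
    charset: FrozenSet[str]        # permitted character set
    check: Callable[[str], bool]   # check formula applied to the recognized string


def is_valid(field: FieldFormat, text: str) -> bool:
    """Return True if the recognized text satisfies the format information."""
    return (len(text) == field.num_chars
            and all(c in field.charset for c in text)
            and field.check(text))


def digit_sum_mod10(text: str) -> bool:
    """Assumed example of a check formula: the digit sum is a multiple of 10."""
    return sum(int(c) for c in text) % 10 == 0


# Every value below has to be entered by hand for each region of each form.
account = FieldFormat(left_mm=20.0, top_mm=35.0, width_mm=60.0, height_mm=8.0,
                      num_chars=4, charset=frozenset("0123456789"),
                      check=digit_sum_mod10)
```

Under these assumptions, `is_valid(account, "1234")` accepts the string, while a string of the wrong length, a string containing a character outside the set, or a string failing the check formula is rejected. The sketch also shows why the designation is troublesome: each region of each document type requires such a record to be prepared in advance.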
Japanese Patent Unexamined Publication No. 58-207184 (published on Dec. 2, 1983) discloses a method for eliminating a fixed pattern stored in a memory from an input image, and a method for discriminating the type of a document by using the fixed pattern. In this method, however, the required memory capacity increases because the image itself is stored in the memory, and distortion of the document (warping, rotation or positional shift) cannot be exactly compensated for.
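The limitation regarding positional shift can be illustrated with a minimal sketch. It is not the method of the cited publication; it merely assumes that the fixed pattern is held as a binary bitmap of the same size as the input image and is eliminated pixel by pixel. The function name `eliminate_fixed_pattern` and the small 3 x 4 bitmaps are hypothetical.

```python
def eliminate_fixed_pattern(image, pattern):
    """Clear every pixel of `image` that is set in the stored `pattern`.

    Both arguments are lists of rows of 0/1 pixels of identical size
    (an assumption of this sketch).
    """
    return [[1 if px and not fp else 0 for px, fp in zip(img_row, pat_row)]
            for img_row, pat_row in zip(image, pattern)]


# Stored fixed pattern: a vertical ruled line in column 1.
pattern = [[0, 1, 0, 0],
           [0, 1, 0, 0],
           [0, 1, 0, 0]]

# Input image in registration: the ruled line plus one character pixel.
aligned = [[0, 1, 0, 0],
           [0, 1, 0, 1],
           [0, 1, 0, 0]]

# The same document shifted by one pixel: the ruled line is now in column 2.
shifted = [[0, 0, 1, 0],
           [0, 0, 1, 1],
           [0, 0, 1, 0]]

clean = eliminate_fixed_pattern(aligned, pattern)    # only the character pixel remains
residue = eliminate_fixed_pattern(shifted, pattern)  # the ruled line is not removed
```

In this sketch the pattern occupies as much memory as a full image, and a shift of a single pixel leaves the ruled line untouched, so the line remains mixed with the character pixels. Warping or rotation of the document would defeat the pixel-by-pixel comparison in the same way.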