This invention relates to a computer, method, and system for identifying a document.
The progress of digitization is accompanied by an increase in opportunities to use electronic applications. In electronic applications, not all subject documents are digitized, and documents made of paper or images created by scanning paper are often used.
In this specification, a document converted into an electronic form is referred to as “electronic-based document”, and not only a document made of paper but also an image created by scanning paper is referred to as “paper-based document”. The electronic-based document and the paper-based document are also referred to simply as “document” when not distinguished from each other.
For example, a procedure performed in work involving receipts and disbursements in a company includes: (1) receiving, by an applicant, a bill from a biller; (2) submitting, by the applicant, the bill to a finance department through an electronic application that instructs payment of a billing amount to the biller; and (3) paying, by the company, the billing amount to the biller when a person belonging to the finance department determines that the electronic application is proper after verifying an attribute described in the bill.
In this case, the attribute is a subject of examination, and is a character string indicating a feature of the document. For example, in the case of work involving receipts and disbursements, the attribute corresponds to a billing amount, a bank account number of a transfer destination to which the billing amount is to be transferred, or other such information.
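The verification in step (3) can be illustrated with a minimal sketch, assuming that the attributes of the bill and of the electronic application are already available as character strings or numbers. The class name `Attributes`, its field names, and the function `find_mismatches` are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Attributes:
    """Attributes subject to examination (hypothetical field names)."""
    billing_amount: int   # amount in the smallest currency unit
    account_number: str   # bank account number of the transfer destination


def find_mismatches(bill: Attributes, application: Attributes) -> list[str]:
    """Return the names of attributes that differ between the bill and the application."""
    return [
        f.name
        for f in fields(Attributes)
        if getattr(bill, f.name) != getattr(application, f.name)
    ]


# Illustrative values only: the account numbers differ in the last digit.
bill = Attributes(billing_amount=120000, account_number="1234567")
application = Attributes(billing_amount=120000, account_number="1234568")
print(find_mismatches(bill, application))  # ['account_number']
```

In this sketch, an empty list corresponds to the case in which the electronic application is determined to be proper with respect to the examined attributes.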
In the case of an electronic application handling a paper-based document, a person is required to verify the paper-based document, which raises a problem in that the efficiency of the work is low and the cost required for the work is high.
In order to solve the above-mentioned problem, there is known a method of reading the attribute from the paper-based document through use of an optical character recognition (OCR) technology. For example, the technology described in U.S. Pat. No. 8,630,949 B2 is known.
U.S. Pat. No. 8,630,949 B2 includes the description “A method of electronically presenting bills for a customer, comprising: . . . receiving an electronic bill and a paper bill for the customer . . . ; scanning the paper bill . . . to generate electronic image information; extracting first optical character recognition (OCR) data from the electronic image information; searching the first OCR data for at least one numeric identifier of a type of the scanned paper bill; identifying the type of the scanned paper bill by comparing the at least one numeric identifier; extracting second OCR data from the electronic image information using a template corresponding to the identified bill type; extracting billing information from the second OCR data; comparing the extracted billing information with known information; . . . combining the electronic bill and the extracted billing information into a customer bill presentation; and presenting the customer bill presentation”.
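The steps quoted above (identifying the bill type from a numeric identifier, extracting billing information with a template for that type, and comparing the result with known information) can be sketched as follows. This is not the implementation of U.S. Pat. No. 8,630,949 B2, only an illustration under stated assumptions: the OCR output is taken to be already available as plain text, a single text stands in for both the first and second OCR data, and the identifiers `1001` and `2002`, the regular-expression templates, and all function names are hypothetical.

```python
import re

# Hypothetical templates, keyed by a numeric identifier that marks the bill type.
TEMPLATES = {
    "1001": {
        "billing_amount": re.compile(r"Amount due:\s*([\d,]+)"),
        "account_number": re.compile(r"Account:\s*(\d+)"),
    },
    "2002": {
        "billing_amount": re.compile(r"Total:\s*([\d,]+)"),
        "account_number": re.compile(r"Pay to account\s*(\d+)"),
    },
}


def identify_bill_type(ocr_text):
    """Return the first known numeric identifier found in the OCR text, or None."""
    for token in re.findall(r"\d+", ocr_text):
        if token in TEMPLATES:
            return token
    return None


def extract_billing_info(ocr_text, bill_type):
    """Apply the template for the identified bill type; missing fields become None."""
    info = {}
    for name, pattern in TEMPLATES[bill_type].items():
        match = pattern.search(ocr_text)
        info[name] = match.group(1) if match else None
    return info


def matches_known_info(extracted, known):
    """Check that every known attribute value agrees with the extracted value."""
    return all(extracted.get(name) == value for name, value in known.items())


# Illustrative OCR text only.
ocr_text = "Form 1001\nAmount due: 120,000\nAccount: 1234567"
bill_type = identify_bill_type(ocr_text)
info = extract_billing_info(ocr_text, bill_type)
print(bill_type, info, matches_known_info(info, {"account_number": "1234567"}))
```

Keying the templates by bill type reflects the point of the quoted method: the layout of the attributes is assumed to be known in advance for each identified type.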
Through use of the technology described in U.S. Pat. No. 8,630,949 B2, it is possible to reduce the man-hours of processes performed by persons in electronic application work, and thereby to improve the efficiency of the work and reduce the cost required for the work.