The present invention relates to processing electronic documents. In particular, the present invention relates to processing electronic documents to extract information from the documents.
A large number of electronic documents are in use today throughout organizations and on the internet. These documents contain informational elements that are useful for a number of different purposes. For example, a purchase order will contain product and price information. Likewise, a fax will include a sender, a recipient and a subject. Additionally, documents can be classified according to various document types such as letters, resumes, memos, reports, recipes, fliers, magazines, etc. Informational elements associated with a document, such as classification, recipient, subject and/or product number, can be identified and/or extracted by manual examination of the document. While manual examination is effective for a small number of documents, it becomes time consuming and costly when informational elements must be extracted from a large number of documents.
One particular application for identifying informational elements in a document is identifying a recipient in a fax document. Fax machines are found throughout businesses today for transmitting and receiving documents. Businesses typically have a single fax number for a plurality of employees. To send a fax document, a transmitting fax machine scans the document to form an image and transmits the image to a receiving fax machine. The receiving fax machine prints out the document, which can then be routed to the correct recipients by a simple manual examination of the contents of the fax.
Alternatively, a growing number of incoming faxes arrive at computers equipped with fax modems or through an internet fax service. When a fax document is sent to a computer as an electronic document, the fax can be routed to the correct person over a computer network, for example by attaching the fax to an e-mail message addressed to the recipient. To route the fax document, a user examines each fax document to identify the correct recipient and then routes the document to the recipient via e-mail.
In companies that receive thousands of faxes per day, the expense and time for routing each fax to the correct recipient can be extremely high if manual examination and routing of each fax document is required. Thus, an automatic system that processes fax documents to identify the correct recipient and routes each fax document based on the identified recipient would address the problems associated with manually examining and routing fax documents. Additionally, automatically extracting information from electronic documents, and associating the documents and/or portions thereof with informational elements, would aid in, for example, classifying documents, identifying informational fields and searching documents.