The present invention relates generally to intelligent document recognition and handling. Specifically, the invention relates to using image processing techniques to automatically determine a task to be performed for an electronic document in a computer. More specifically, the invention relates to using image processing techniques to automatically determine a document type and a filing location for an electronic document in a computer.
Electronic documents are documents that are created and saved on a computer or are printed documents that have been digitized or scanned into a computer. A user who relies upon electronically stored documents has the same challenge as one who relies upon printed documents in that knowing where documents are kept is as important as having copies of the documents in the first place.
When storing a printed document in a filing cabinet, a user typically first determines the type of document or its subject matter, and then determines the proper storage location for the document. When there are several possible storage locations for that document, the user typically either picks the "best" location or makes a copy of the document for each possible location. A drawback of saving only one copy in one location is that another user who attempts to locate that document might not look in the same location. A drawback of making copies for the several possible locations is the extra cost of copying and of the additional storage space.
Electronic documents share the problems described above for printed documents: one person's decision about where to store a document may change over time and is often different from a second person's decision. If a user stores only one electronic copy of a document, that document might be "lost" to the second person. Although storage and copy costs for electronic documents are minimal, a drawback of making duplicate electronic copies for the several possible locations is that any annotations, notes, or subsequent corrections made on one copy typically are not carried over to the duplicate copies. A second person thus may not see, for example, a "credit hold" note appended to one copy of the document.
One method of helping users locate electronic documents has been to enable users to store document information in a summary sheet, or cover sheet, appended to a document. Such a feature is now common in many word processing programs. For example, a user can store his name, information about the document type, and other "keywords" related to the document in a cover sheet attached to the document. If a user subsequently needs to access a particular document, the user can search the "keywords" or other information on existing documents' cover sheets to locate the document. A drawback to this approach is that a user may choose different "keywords" over time to identify the same subject matter, that is, may use "keywords" inconsistently. Further, in a multiple user system, different users will likely choose different "keywords" to identify the same subject matter. The result in a multiple user system is that documents and files might again become "lost" to subsequent users.
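The keyword drawback described above can be illustrated with a minimal sketch. The CoverSheet structure, the search function, the file names, and the keywords below are all hypothetical and chosen only for illustration; they do not correspond to any particular word processing program.

```python
from dataclasses import dataclass, field


@dataclass
class CoverSheet:
    """Hypothetical summary sheet attached to a document."""
    author: str
    doc_type: str
    keywords: set = field(default_factory=set)


def search(sheets, keyword):
    """Return the names of documents whose cover sheet lists the keyword."""
    key = keyword.lower()
    return sorted(
        name
        for name, sheet in sheets.items()
        if key in {k.lower() for k in sheet.keywords}
    )


# Two users file the same kind of document under different keywords.
sheets = {
    "acme_march.doc": CoverSheet("alice", "invoice", {"invoice", "Acme"}),
    "acme_april.doc": CoverSheet("bob", "invoice", {"bill", "Acme"}),
}

# A search for "invoice" finds only the first document; the second is
# effectively "lost" because its author chose the keyword "bill".
found = search(sheets, "invoice")
```

In this sketch both documents are of the same type, yet an exact keyword search retrieves only the one whose author happened to use the searcher's vocabulary.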
Another well-known method developed to help users organize files is to provide a rigid directory and sub-directory structure. With a rigid directory structure, related files are stored according to a planned filing scheme. A problem with this method, however, is similar to the problem described above: different users will decide that there are different "correct" locations in which to file a document. Further, this approach requires the user to remember the entire directory structure in order to determine where to store and retrieve documents.