1. Field of the Invention
This invention relates generally to entering data into computer systems, and relates particularly to secure data entry from source document images.
2. Description of Prior Art
Data on millions of documents are entered into computer systems every year. The documents may include machine printed and/or hand-written data.
The data are entered on behalf of banks, insurance companies, financial advisors, tax preparers, legal firms, security firms, mortgage brokers, credit card issuers, wholesalers, distributors, retailers, transportation companies, utilities, health care providers, employers, schools, churches, nonprofit organizations, clubs, individuals, governmental entities and other organizations. The data are entered by domestic and offshore employees, temporary workers and outsourcing firms.
Data entry can be primarily clerical in nature, such as inputting information from magazine subscription forms. Data entry can also be an essential portion of larger technical tasks, such as preparing income tax returns, processing mortgage applications or handling insurance claims.
There are three general methods of performing data entry: conventional, outsourcing and automation.
Conventional data entry, the first method, requires workers with specific education, domain expertise, particular training, software knowledge and/or cultural understanding. Data entry workers must recognize documents, find relevant information on the documents and enter the data appropriately and accurately in particular software programs. Such manual data entry is complex, time-consuming and error-prone. As a result, the cost of data entry is often quite high; this is especially true, for example, when the data entry is performed by lawyers, accountants, physicians and other highly paid professionals as part of their work.
Conventional data entry also exposes all documents in their entirety to data entry workers. These documents may have sensitive information related to individuals' and organizations' financial, tax, health, insurance, employment, education, family, legal and/or other matters.
The second method, outsourcing, requires the same worker education, expertise, training, software knowledge and/or cultural understanding. As with conventional data entry, data entry workers must recognize documents, find relevant information on the documents and enter the data appropriately and accurately in particular software programs. As with conventional data entry, outsourcing is manual and, therefore, complex, time-consuming and error-prone. Outsourcing firms such as Accenture, EDS, IBM, Infosys, Tata and Wipro often reduce costs by offshoring data entry work to locations with low-wage data entry workers. For example, data entry of US tax and financial data is a function that has been implemented using thousands of well-educated, English-speaking workers in India and other low-wage countries.
The first step of outsourcing requires organizations to scan financial, tax, health and/or other documents and save the resulting image files. These image files can be accessed by data entry workers via several methods. One method stores the image files on the source organizations' computer systems; the data entry workers view the image files over networks (such as the Internet or private networks). Another method stores the image files on third-party computer systems; the data entry workers view the image files over networks. An alternative method transmits the image files from source organizations over networks and stores the image files for viewing by the data entry workers on the data entry organizations' computer systems.
For example, an accountant may scan the various tax forms containing client financial data and transmit the scanned image files to an outsourcing firm. An employee of the outsourcing firm reads the client financial data and enters it into an income tax software program. The resulting tax software data file is then transmitted back to the accountant.
Quality problems with offshore data entry work have been reported by many customers. Outsourced service providers address these problems by hiring better educated and/or more experienced workers, providing them extensive training, entering data two or more times and/or exhaustively checking their work for quality errors. These measures reduce the cost savings expected from offshore outsourcing.
The cost of offshore labor is rising as demand for offshore services increases. Indian employers report salary increases of 20% and greater over the past year.
Outsourcing and offshoring are accompanied by concerns over security risks associated with fraud and identity theft. These security concerns apply to employees and temporary workers as well as outsourced workers and offshore workers who have access to documents with sensitive information.
Although the transmission of scanned image files to the data entry organization may be secured by cryptographic techniques, the sensitive data and personal identifying information are in the clear, i.e., unencrypted, when read by data entry workers prior to entry in the appropriate computer systems. Data entry organizations publicly recognize the need for information security. Some data entry organizations claim to investigate and perform background checks of employees. Many data entry organizations claim to strictly limit physical access to the rooms in which the employees enter the data, and such rooms may be isolated. Additionally, employees may be subject to inspection to ensure that nothing is copied or removed, and paper, writing materials, cameras or other recording technology may be forbidden in the rooms. Such seemingly comprehensive security precautions are primarily physical in nature, and they are imperfect.
Lapses in physical security can occur. For example, Social Security numbers and bank routing numbers are only nine digits; bank account numbers are usually of similar length. Memorization of these important numbers would not be difficult and would allow a nefarious employee to have direct access to the money held in those accounts; in 2004, employees of MphasiS in Pune, India stole $426,000 from Citibank customers. The owners, managers, staff, guards and contractors of data entry organizations may misuse some or all of the unencrypted confidential information in their care. Further, breaches of physical and information system security by external parties can occur. Because data entry organizations are increasingly located in foreign countries, there is often little or no recourse for American citizens victimized in this manner.
For five consecutive years, the Top Technology Initiatives survey of the American Institute of Certified Public Accountants (AICPA) identified information security as the technology initiative expected to have the greatest effect in the upcoming year. Laws have been enacted and new legislation and regulations have been proposed to address these security concerns, particularly those related to outsourced data entry that is performed offshore.
The third general method of data entry involves partial automation, often combining optical character recognition, human inspection and workflow management software.
The first step of automation is to scan financial, tax, health and/or other documents and save the resulting image files. The scanned images are compared to a database of known documents. Images that are not identified are routed to data entry workers for conventional processing; images that are identified have data extracted using optical character recognition (OCR).
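The routing step described above can be illustrated with a minimal sketch. It is not taken from any particular prior art system: the names `ScannedImage`, `KNOWN_FORMS` and `route`, and the form identifiers in the catalog, are hypothetical, and the comparison against the database of known documents is reduced to a simple set lookup on an already-detected form identifier.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the "database of known documents".
KNOWN_FORMS = {"W-2", "1099-INT", "1098"}


@dataclass
class ScannedImage:
    file_name: str
    # Form identifier produced by an (assumed) upstream matching step;
    # None means the image matched no known document.
    detected_form: Optional[str]


def route(image: ScannedImage, known_forms=KNOWN_FORMS) -> str:
    """Return "ocr" for identified images, "manual" for unidentified ones."""
    if image.detected_form in known_forms:
        return "ocr"      # data will be extracted by optical character recognition
    return "manual"       # routed to data entry workers for conventional processing


if __name__ == "__main__":
    batch = [ScannedImage("page1.tif", "W-2"), ScannedImage("page2.tif", None)]
    for img in batch:
        print(img.file_name, "->", route(img))
```

In this sketch, every image that falls through to the "manual" route is exposed in its entirety to a data entry worker, as in conventional data entry.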
Optical character recognition is not without errors, often mistaking one percent or more of the characters. Such an error rate is often unacceptable as it would result in more than six mistakes on a typical US personal income tax return with more than 100 fields of data averaging more than six letters and/or digits each.
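The arithmetic behind the figure of six mistakes can be checked with a short sketch. The function names are hypothetical, the inputs are the lower bounds stated above (100 fields, six characters per field, a one percent character error rate), and the second function additionally assumes that character errors occur independently of one another, which the text above does not state.

```python
def expected_ocr_errors(fields: int, chars_per_field: float, char_error_rate: float) -> float:
    """Expected number of misread characters on one document."""
    return fields * chars_per_field * char_error_rate


def prob_at_least_one_error(fields: int, chars_per_field: int, char_error_rate: float) -> float:
    """Chance that a document contains at least one misread character.

    Assumption: each character is misread independently with the same probability.
    """
    total_chars = fields * chars_per_field
    return 1.0 - (1.0 - char_error_rate) ** total_chars


if __name__ == "__main__":
    # Lower-bound figures from the text: 100 fields x 6 characters x 1% error rate.
    print(expected_ocr_errors(100, 6, 0.01))
    print(prob_at_least_one_error(100, 6, 0.01))
```

Because the stated field count, field length and error rate are all lower bounds, the expected number of mistakes computed from them is itself a lower bound. Under the independence assumption, nearly every such return would contain at least one misread character.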
Human inspection is required to correct the errors. Inspection requires workers with specific education, domain expertise, particular training, software knowledge and/or cultural understanding. Inspection workers must recognize documents, find relevant information on the documents and ensure that the extracted data has been appropriately and accurately displayed in particular software programs. Typically, any changes made by inspection workers must be reviewed and approved by other, more senior, inspection workers before replacing the data extracted by optical character recognition.
Because automation requires human inspection, source documents with sensitive information are exposed in their entirety to data entry workers.
While the prior art attempts to reduce the cost of data entry through the use of low-cost labor and limited automation, none of the above methods of data entry (1) eliminates the requirements of education, domain expertise, training, software knowledge and/or cultural understanding, (2) minimizes the time spent entering and quality checking the data, (3) minimizes errors and (4) protects the privacy of the owners of the data without being dependent on the security systems of data entry organizations. What is needed, therefore, is a method of performing data entry that overcomes the above-mentioned limitations and that includes the features enumerated above.