In the arts, a digital document can often be represented as a raster, i.e., a rectangular array of samples or pixels. In the simplest non-trivial form of the layered model, a page is decomposed into three layers: a lower layer, a mask layer, and an upper layer. The mask layer indicates from which of the other two layers a pixel should be taken to produce the final image. One advantage of this model is that it can afford greater image compression by separating a page into two image types and applying to each the compression scheme best suited to it. Further, a page image can be separated into a plurality of image types, not just two. In the N-layer model, there is a lower (background) layer, a first mask layer, a first foreground layer, a second mask layer, a second foreground layer, and so on.
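The layer selection described above can be illustrated with a minimal sketch in Python. It assumes each layer is a small list of rows of grayscale values and that a mask value of 1 selects the upper layer; the names composite and composite_n are hypothetical and are not taken from any standard or library.

```python
from typing import Iterable, List, Tuple

Raster = List[List[int]]  # rows of grayscale pixel values (assumed representation)


def composite(lower: Raster, mask: Raster, upper: Raster) -> Raster:
    """Take each pixel from `upper` where the mask is 1, otherwise from `lower`."""
    return [
        [u if m else l for l, m, u in zip(lower_row, mask_row, upper_row)]
        for lower_row, mask_row, upper_row in zip(lower, mask, upper)
    ]


def composite_n(background: Raster, pairs: Iterable[Tuple[Raster, Raster]]) -> Raster:
    """N-layer form: apply each (mask, foreground) pair in order over the background."""
    page = background
    for mask, foreground in pairs:
        page = composite(page, mask, foreground)
    return page


# A 2x3 page: white background, black foreground, mask marks two pixels.
lower = [[255, 255, 255], [255, 255, 255]]
mask = [[0, 1, 0], [1, 0, 0]]
upper = [[0, 0, 0], [0, 0, 0]]
page = composite(lower, mask, upper)
```

In this sketch the three-layer case is simply the N-layer case with a single mask and foreground pair.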
Digital document images can be represented in varying degrees of complexity depending on what is known about them. One of the simplest and easiest to interchange is a raster representation. As an array of pixel values, its simplicity is nearly that of a paper document. Another representation is portable document format. At the other extreme, a document can be represented in a markup language or page description language which details each symbol in the document and how it should appear when printed or displayed. These latter forms are typically produced by a word processor or electronic publishing system. The raster form is typical for a scanned document, i.e., a paper document ‘read’ by an image capture device. Converting raster documents to a markup version can be a difficult technical problem and is often the subject of academic and industrial research in document recognition. Except in specialized applications, a ‘recognized’ document is likely to contain many errors. The mixed raster content model offers a practical intermediate form, i.e., more structure than a raster but less than a marked-up document. As a collection of rasters, it also enables simple and ubiquitous document image interchange. The present invention is directed towards a secure document workflow system in which the mixed raster content model is exploited to perform a function for which it was not designed but with which it is consistent, thereby extending the utility of this model.
Consider the following example. A credit card processing center wishes to scan and store credit card receipts as a record of its customers' purchases. The system enables one of its trusted operators to view a receipt image online, perhaps to verify a charge. Printed facsimiles of each customer's receipts can also be mailed to that customer. For security reasons, it is preferred that a customer's credit card number not be printed on a facsimile document. A customer would know his or her own number(s), so there would be no reason for an intermediary party to view them. The system stores receipt images in a mixed raster content format wherein the lower layer contains all the information except for the credit card number field, and the mask and upper layers contain the credit card number field. The layer containing the credit card number field is preferably encrypted with a key available only to a trusted operator or the customer, or both. The encrypted layer is unintelligible to a printing system. More generally, the upper and lower layers may be selectively encrypted.
Consider a further workflow scenario wherein it is desired that different parts of a scanned document be viewable by different parties. For example, an online resume service may encrypt certain information and make it available only to paying customers: a scanned resume and cover letter may have document regions indicating race and gender encrypted by one key and salary requirements encrypted by another key. The actual encryption method depends on the application, and any method known in the prior art may be used. These requirements are met by an N-layer mixed raster content model in which each layer has a different key available only to the appropriate person in the hiring process.
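The per-layer keying in these scenarios can be sketched as follows in Python. Since the encryption method is left open, the sketch substitutes a toy keystream built from SHA-256 purely so that it runs with the standard library; it is not a secure cipher, and a real system would use a vetted encryption method. The layer names and the functions keystream_xor, encrypt_layers, and view are hypothetical.

```python
import hashlib
from typing import Dict, Optional, Set


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for a cipher (NOT secure): XOR data with SHA-256(key || offset) blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def encrypt_layers(layers: Dict[str, bytes], keys: Dict[str, bytes]) -> Dict[str, bytes]:
    """Encrypt each layer that has its own key; layers without a key stay in the clear."""
    return {
        name: keystream_xor(keys[name], pixels) if name in keys else pixels
        for name, pixels in layers.items()
    }


def view(stored: Dict[str, bytes], protected: Set[str],
         held_keys: Dict[str, bytes]) -> Dict[str, Optional[bytes]]:
    """Return what one party can read: clear layers, plus protected layers whose key it holds."""
    result: Dict[str, Optional[bytes]] = {}
    for name, data in stored.items():
        if name not in protected:
            result[name] = data
        elif name in held_keys:
            result[name] = keystream_xor(held_keys[name], data)  # XOR is its own inverse
        else:
            result[name] = None  # unintelligible without the key
    return result


# Hypothetical resume with two protected foreground layers, each under its own key.
layers = {"background": b"resume body", "demographics": b"field A", "salary": b"field B"}
keys = {"demographics": b"key-one", "salary": b"key-two"}
stored = encrypt_layers(layers, keys)
recruiter = view(stored, set(keys), {"salary": keys["salary"]})
```

Here the recruiter holds only the salary key, so the demographics layer remains unreadable while the background stays visible to everyone, including a printing system that holds no keys.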
In both of the scenarios discussed above, documents are obtained as rasters, yet full and error-free conversion to a markup language is expensive. In the credit-card case, very few of the documents would ever be viewed, so it makes little sense to convert them.
Determining which areas to encrypt may be done automatically by knowing which regions on a document contain sensitive information. Regions may also be determined by a scanner operator or regions may be determined through document recognition: e.g., identifying credit card numbers or social security numbers through optical character recognition in combination with a lexicographical analyzer.
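As one illustration of recognition-based region selection, the Python sketch below flags text lines that look like credit card numbers. It assumes an optical character recognition step has already produced (text, bounding box) pairs; that data shape, the pattern CARD_PATTERN, and the function names are assumptions made for this example. The Luhn checksum is used here as a simple plausibility filter and stands in for the analyzer mentioned above.

```python
import re
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels; assumed OCR output shape

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"^\d(?:[ -]?\d){12,18}$")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right and sum the results."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def sensitive_regions(ocr_lines: List[Tuple[str, Box]]) -> List[Box]:
    """Return bounding boxes of OCR lines that look like valid card numbers."""
    regions = []
    for text, box in ocr_lines:
        candidate = text.strip()
        if CARD_PATTERN.match(candidate):
            digits = re.sub(r"\D", "", candidate)
            if luhn_valid(digits):
                regions.append(box)
    return regions


# Hypothetical OCR output for a receipt; the second line is a well-known test card number.
ocr_lines = [
    ("TOTAL 42.50", (10, 10, 120, 30)),
    ("4111 1111 1111 1111", (10, 40, 220, 60)),
    ("4111 1111 1111 1112", (10, 70, 220, 90)),
]
to_encrypt = sensitive_regions(ocr_lines)
```

The returned boxes would then define the mask and upper layer for the field to be encrypted; recognition errors remain possible, so an operator review step may still be appropriate.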
In a document workflow system, there is a need to provide a plurality of security levels and also a need to use existing image encoding methods to reduce cost and increase system flexibility. Further, security must be provided simply for scanned documents that exist as images. Word processing systems' passwords do not provide a plurality of access levels. In word processing systems these documents exist as symbols and are not scanned.
In other cases, secure faxing and printing systems treat the entire document and do not allow for fields or regions to be encrypted separately.
U.S. Pat. No. 4,912,761 entitled “Security Facsimile Systems” discloses a method to ensure secure fax transmission by splitting a page into two separations, neither one of which is readable alone. Upon reception, the separations are printed on separate sheets, one on paper and the other on a transparency. The designated recipient overlays the two sheets to see the document. This invention underscores the need for secure transmission. However, it allows anyone to pick up the two sheets and combine them; there is no encryption. Each sheet alone is unreadable, so the method cannot offer selective viewing to different readers with different levels of security. Both separations are needed simultaneously, and the page is read as a whole or not at all.
A document would be stored in electronic memory in a FAX machine until a key is entered into the machine (this can also be done by having a secure storage bin or mail box and a physical key to unlock the box). What is also needed in the art is a way to allow different users to see only portions of a document. This would be important in a fulfillment workflow where customers fax in forms with credit card numbers entered. Only those operators needing to process the credit authorization would be allowed to view or print the credit card information. Other, less trusted operators could still commence order processing.
U.S. Pat. No. 5,372,387 entitled “Security Device for Document Protection” discloses a chemical means to obtain some of the functionality to which the present invention is directed. Secure information is printed on areas of a document, which are then coated with an opaque substance that becomes clear when heated. A special reader is then used to disclose the secure information. Also, chemicals with different heat sensitivities may be used to afford different levels of security. One drawback is that this approach does not secure the electronic documents necessary in modern work flows. Another is that its levels of security are less flexible, i.e., those with the hotter reader can read more. Further, anyone can conceivably obtain a reader, or security is limited by the availability of readers. This is not as secure as encryption, which is mathematically unlimited.
U.S. Pat. No. 5,812,989 entitled “Image Statement Preparation by Work Flow Management Using Statement Cycles and Statement Interactive Cycles” discloses a system in which bank checks are processed to provide account statements that include payments against an account and images of corresponding checks drawing on that account. In a high-volume work flow environment where alphanumeric account information and check images are obtained at different stations, it is possible for information to be out of synchronization. Operators can view suspected erroneous statements and make corrections. Operators have different levels of authorization depending on their skill level: some problems are easy to fix, some require more effort or have less tolerance for error. Profiles of operators are stored in a database, so that upon logging into the system, an operator can see only statements authorized for an operator's skill level. This system demonstrates the need for selective security in a document workflow system. However, security is offered document-by-document, not region-by-region within a document.
The present invention also covers an application in which regions in a scanned document are available only in a given time interval. There are time-sensitive work flows in which fields expire or become relevant only after a certain date or time has passed. For example, consider a work flow in which activity must be accounted for but is summarized monthly. A document for next month may arrive now, but should not appear on this month's summary. By encrypting fields with a time-dependent key, the system ensures that an operator cannot see fields that would cause confusion. Also, some information becomes stale, such as expired credit card numbers, and could be readily viewed by all without harm.
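How a time-dependent key is realized is not specified here. One possible arrangement, sketched below in Python under stated assumptions, is a trusted key service that derives a separate key for each monthly period from a master secret and releases a period's key only once that period has begun. The master secret, the "YYYY-MM" period label, and the functions period_label, period_key, and release_key are all hypothetical.

```python
import hashlib
import hmac
from datetime import date
from typing import Optional


def period_label(day: date) -> str:
    """Label a date by its month, e.g. '2024-06'; such labels sort chronologically as text."""
    return f"{day.year:04d}-{day.month:02d}"


def period_key(master_secret: bytes, label: str) -> bytes:
    """Derive a per-period key from the master secret with HMAC-SHA256."""
    return hmac.new(master_secret, label.encode("ascii"), hashlib.sha256).digest()


def release_key(master_secret: bytes, field_period: str, today: date) -> Optional[bytes]:
    """Release the key for `field_period` only once that period has begun."""
    if period_label(today) < field_period:
        return None  # field belongs to a future period; keep it unreadable
    return period_key(master_secret, field_period)


# A field encrypted for June is withheld in May and released from June onward.
master = b"example master secret"  # placeholder value for illustration
withheld = release_key(master, "2024-06", date(2024, 5, 20))
released = release_key(master, "2024-06", date(2024, 6, 1))
```

Under this arrangement, a document for next month can be scanned and stored now, while its fields remain unintelligible to operators until the key service releases the corresponding key.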
Encrypted regions could be of any shape and be used to shield any image content such as faces, adult content, license plate numbers, logos, or any other identifying or offensive image portions.
It is the aim of the present invention to overcome the above identified prior art problems.