Businesses are required to comply with Sarbanes-Oxley and several other regulations. These include, for example, HIPAA, OSHA, SB 1386, NASD 2711, and Gramm-Leach-Bliley.
The traditional method of proving compliance with a business procedure is to keep a “paper trail.” For example, to establish that an expense was “real,” a receipt is stored. To justify a check to a vendor, there may be a signed purchase request, quotes from other vendors, a purchase order, a signed packing slip, and an invoice. These paper records are typically sorted in some fashion, stapled, placed in folders, and stored in filing cabinets. If the organizational method is good, the paper documents can later be retrieved. Paper records can be filed in only one way, e.g., by check number or by vendor, but not both. Thus, retrieving them by another method requires access to an index, e.g., a paper or electronic record that connects check numbers with payees.
Paper has the real advantage of being hard to duplicate. Even with a high-quality printer, it is difficult to duplicate, for example, the paper stock and preprinted logos of an invoice. Likewise, it is difficult to duplicate a handwritten signature on a packing slip. However, anyone with a modest amount of skill with an image processing program like Adobe's Photoshop™ can modify an electronic copy of a document. It is easy, for example, to change the amount on an invoice or to copy a scan of a signature from one document to another. Thus, tools that verify the authenticity of an electronic document are valuable in that they maintain or improve on the “trail” available with paper.
The past prevalence of paper means that almost all business processes are verified by some image. Real paper may have been replaced by an electronic PDF; however, it is still the visual record that is of interest. Even in the case where the official document is an Excel spreadsheet, the compliance “controls” are the values computed at various cells within the worksheet, which are labeled so they may be interpreted visually.
Electronic files (e.g., Word, Excel, PDF, etc.) may contain formulas and other executable content that alter the data and the presentation, e.g., different amounts can appear at different times in a spreadsheet. There may be different presentations for different devices or different users. Capturing, preserving, and authenticating the actual image presented to the human user (whether on paper or displayed on a monitor) is the only way to know what information the human had.
Many document management systems have been proposed and implemented in the past. These document management systems include systems that store documents and handle the coordination of requests with responses. However, these systems do not cut across organizational boundaries and do not perform the synchronization that is necessary.
Portals, content management systems, and wikis handle bitmap images, allow search on tags, and sometimes allow search on recognized file types (e.g., PowerPoint slides, graphics, text only).
Version control systems like ClearCase, SourceSafe, CVS, Subversion, and Git detect changes in a family of documents and keep track of the order of modification. Git uses hashes to identify changed files and directories. Some version control systems are integrated with a “workflow,” for example, to run a set of regression tests on the changed source code. Such systems do not have a visual representation or a notion of control points.
Intrusion detection systems like Tripwire use cryptographic hashes to determine whether any of a set of files on a computer system has been changed.
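The general approach can be sketched as follows: record a baseline of cryptographic digests for a set of files, then later recompute the digests and compare. This is an illustrative sketch only and does not reflect the actual implementation of any particular product. The choice of SHA-256 and the function names `snapshot` and `audit` are assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def audit(baseline: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two snapshots and report added, removed, and modified files."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "modified": sorted(p for p in shared if baseline[p] != current[p]),
    }


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "ledger.csv").write_text("check,payee\n1001,Acme\n")
        baseline = snapshot(root)                      # trusted state
        (root / "ledger.csv").write_text("check,payee\n1001,Other\n")
        print(audit(baseline, snapshot(root)))         # reports the edit
```

Such a check is only as trustworthy as the stored baseline: if the baseline can be rewritten along with the files, a change goes undetected.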
A Web log is an online document management tool used to record information. Web logs use a client-server framework to permit the addition or subtraction of content from one or more client locations to a server that hosts the web log. Because one server hosts each web log, web logs are typically anchored to a particular HTTP location.
U.S. patent application Ser. No. 10/887,998, entitled “Synchronizing distributed work through document logs,” filed Jul. 9, 2004 by Wolff, Gregory J., et al., (Publication No. 20060010095) discloses synchronizing distributed work through the use of document logs. As disclosed, metadata entries are added to a set that is associated with a digital object, such as a document. The metadata entries are accessed using unique identifiers that reference the metadata entries. In one embodiment, each unique identifier is based on the contents of the metadata.
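One way to picture content-based identifiers for metadata entries is the minimal sketch below. It is not taken from the cited application: the use of SHA-256, the JSON serialization, and the names `entry_id` and `DocumentLog` are all assumptions introduced for illustration.

```python
import hashlib
import json
from typing import Dict, List


def entry_id(entry: dict) -> str:
    """Derive an identifier from the entry's contents (assumed: SHA-256 of
    a canonical JSON form, so key order does not affect the result)."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DocumentLog:
    """Hypothetical store that associates metadata entries with a digital object."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}          # identifier -> entry
        self._by_object: Dict[str, List[str]] = {}   # object -> identifiers

    def add(self, object_id: str, entry: dict) -> str:
        """Add an entry to the object's set and return its identifier."""
        eid = entry_id(entry)
        self._entries[eid] = entry
        ids = self._by_object.setdefault(object_id, [])
        if eid not in ids:                           # identical entries are stored once
            ids.append(eid)
        return eid

    def get(self, eid: str) -> dict:
        """Look up an entry by its content-based identifier."""
        return self._entries[eid]

    def entries_for(self, object_id: str) -> List[str]:
        """Return the identifiers of all entries attached to an object."""
        return list(self._by_object.get(object_id, []))


if __name__ == "__main__":
    log = DocumentLog()
    eid = log.add("invoice-scan", {"author": "clerk", "comment": "approved"})
    print(eid, log.get(eid))
```

A consequence of deriving the identifier from the contents is that two parties who independently record the same entry arrive at the same identifier, and any alteration of an entry produces a different identifier.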