1. Technical Field
The claimed subject matter relates generally to a method and system for document management and, more specifically, to de-duplication and modification detection of files collected during document production in a legal setting.
2. Description of the Related Art
The use of computers in business and personal life has enabled people to be more productive. Of course, this increase in productivity also implies that more documents are generated and stored. A large percentage of generated documents exist either in part or solely in the form of electronically stored data and, as storage capacity has continued to increase and become cheaper, fewer documents are ever discarded or deleted.
In the United States, parties to legal proceedings are often given wide latitude to request and examine documents in the possession of other parties. Judicial rules relating to this process, known as document discovery and production, often specify that electronically stored documents be provided in the original format and include any associated metadata. Typically, document discovery and production is both labor-intensive and time-consuming, particularly in light of the large volume of electronically stored materials. A party that is required to meet such a request must locate all possible documents, filter out documents that are not responsive to a specific request or are privileged, and provide access to the filtered materials.
Current methods for the production of electronically stored documents have been developed “ad hoc” and therefore suffer from serious shortcomings. For example, computer hard drives are often mirrored, or “disk copied,” onto alternative hard drives; documents on the alternative hard drive are converted to physical formats such as printed paper; and then personnel review each of the papers to ensure that relevant documents are produced, non-responsive documents are excluded and privileged documents are protected. One drawback of this approach is that many documents that are not relevant are printed, duplicated and reviewed, thus increasing the time and expense of document production as well as the chance of errors.