Electronic discovery (also known as e-discovery or eDiscovery) refers to the process of identifying, or discovering, data and/or electronic documents in a custodian's data stores that contain information relevant to a legal or administrative proceeding and that the custodian has a reason (often a legal obligation) to make available to third parties, such as attorneys, courts, or service bureaus. Such electronic information is subject to local rules and agreed-upon processes, and is often reviewed for privilege and relevance before being turned over to opposing counsel.
In an electronic discovery workflow, potentially responsive documents are first searched for and identified, typically with one or more commercially available e-discovery software tools, for further analysis and review. These may include emails, electronic texts, spreadsheets and other types of data in a custodian's stores that contain information that the custodian has a reason, such as a legal obligation, to provide or produce to another party in litigation or a similar context. The identified documents are then placed on legal hold to prevent them from being destroyed. Once the potentially responsive documents are preserved, collection can begin. Collection refers to the transfer of data from a company to its legal counsel. Some companies may have electronic discovery software tools in place so that legal holds may be placed and collection may begin right away if necessary. Ordinarily, once one or more collections satisfying the search criteria have been generated, the documents and data are reviewed by humans to determine the extent, if any, to which they contain the information sought.
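The search, hold, and collection steps described above can be illustrated with a minimal sketch. It does not model any particular commercial tool: the `Document` class, the `search`, `place_legal_hold`, and `collect` functions, the keyword-matching rule, and the sample data are all hypothetical and chosen only to make the ordering of the steps concrete.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical record for one item in a custodian's data store."""
    doc_id: str
    custodian: str
    text: str
    on_hold: bool = False


def search(store, keywords):
    """Return documents whose text contains any keyword (case-insensitive).

    Assumption: simple substring matching stands in for whatever
    search criteria a real tool would apply.
    """
    lowered = [k.lower() for k in keywords]
    return [d for d in store if any(k in d.text.lower() for k in lowered)]


def place_legal_hold(docs):
    """Flag each document so that it is preserved rather than destroyed."""
    for d in docs:
        d.on_hold = True


def collect(docs):
    """Gather only preserved documents for transfer to legal counsel."""
    return [d for d in docs if d.on_hold]


# Made-up data store for illustration only.
store = [
    Document("d1", "alice", "Q3 merger pricing discussion"),
    Document("d2", "bob", "Lunch menu for Friday"),
    Document("d3", "alice", "Re: Merger timeline draft"),
]

hits = search(store, ["merger"])      # 1. identify potentially responsive documents
place_legal_hold(hits)                # 2. preserve them
collection = collect(hits)            # 3. collect for human review
collected_ids = [d.doc_id for d in collection]
```

In this sketch, human review would follow `collect`; nothing here decides whether a collected document actually contains the information sought.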
Such an electronic discovery software tool allows a user to identify potentially responsive documents by searching data stores for data matching criteria believed to be associated with documents or data containing potentially relevant or responsive information, and adding the matching data to a data or document collection. The user may then use the electronic discovery software tool to create a disk image of the collection and hand it over to the attorney(s) for review. As can be appreciated, such an electronic discovery process can be expensive, in that tens of thousands of documents, emails, etc., or more may need to be found and copied. In some cases, such a disk image may comprise hundreds of data disks and, by virtue of its size, present difficulties for “burning” as well as for delivery. Where the disk image is very large, the electronic discovery process may become even more inefficient and/or expensive, because determining whether the documents and data in such a large collection are nonresponsive, such that another search and disk image are required, may involve many hours of follow-on human review.
The term “disk image” is used here given the historical need to create physical disks, CDs, DVDs or tapes. In practice, this term may also encompass one or more large data sets or ZIP images that can be transmitted electronically.
Some electronic discovery software tools may allow a user to export a sample of documents in a collection and “test” it (e.g., by sending the sample documents to the attorney(s) for review) to see whether the documents in the sample are deemed relevant. If so, the entire collection may be produced. However, such methods necessitate a tradeoff between accuracy and performance. Furthermore, because they generally examine an entire data set at once, such methods are relatively slow, which can incentivize the user to value speed over accuracy. That, in turn, may result in having to go back and assemble a new disk image, again raising costs.
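The sample-and-test approach can be sketched as follows. This is an illustration under stated assumptions, not a description of how any existing tool works: the function names, the random sampling strategy, the stand-in reviewer, and the 20% decision threshold are all hypothetical.

```python
import random


def export_sample(collection, sample_size, seed=0):
    """Draw a random sample (without replacement) from a collection.

    Assumption: uniform random sampling; a real tool may sample differently.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(collection))
    return rng.sample(collection, k)


def estimated_relevance(sample, is_relevant):
    """Fraction of sampled documents the reviewer deems relevant."""
    if not sample:
        return 0.0
    return sum(1 for doc in sample if is_relevant(doc)) / len(sample)


# Made-up collection of document identifiers.
collection = [f"doc-{i}" for i in range(1000)]


def reviewer_says_relevant(doc_id):
    """Stand-in for attorney review; the rule is arbitrary and illustrative."""
    return int(doc_id.split("-")[1]) % 4 == 0


sample = export_sample(collection, 50)
rate = estimated_relevance(sample, reviewer_says_relevant)

# Hypothetical decision rule: produce the whole collection only if the
# sampled relevance rate clears an assumed threshold.
RELEVANCE_THRESHOLD = 0.2
produce_entire_collection = rate >= RELEVANCE_THRESHOLD
```

The tradeoff noted above shows up in `sample_size`: a smaller sample is quicker to review but gives a noisier estimate of how relevant the full collection is.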