1. Field of the Invention
The invention relates in general to the field of electronic data discovery (e-discovery). More particularly, the invention relates to methods and systems for storing electronic content in e-discovery management systems.
2. Background Discussion
Electronic discovery, also referred to as e-discovery or EDiscovery, concerns the discovery of information in electronic form in civil litigation, as well as in tax, government investigation, and criminal proceedings. In this context, electronic form is the representation of information as binary numbers. Electronic information is different from paper information because of its intangible form, volume, transience, and persistence. Also, electronic information is usually accompanied by metadata, which is rarely present in paper information. Electronic discovery poses new challenges and opportunities for attorneys, their clients, technical advisors, and the courts, as electronic information is collected, reviewed, and produced. Electronic discovery is the subject of amendments to the Federal Rules of Civil Procedure which became effective Dec. 1, 2006. In particular, and not by way of limitation, Rules 16 and 26 are of interest to electronic discovery.
Examples of the types of data included in e-discovery include e-mail, instant messaging chats, Microsoft Office files, accounting databases, CAD/CAM files, Web sites, and any other electronically-stored information which could be relevant evidence in a lawsuit. Also included in e-discovery is raw data which forensic investigators can review for hidden evidence. The original file format is known as the native format. Litigators may review material from e-discovery in any one or more of several formats, for example, as printed paper, as native files, or as TIFF images.
The revisions to the Federal Rules formally address e-discovery and, in the process, have made it a nearly certain element of litigation. For corporations, the rules place a very early focus on existing retention practices and the preservation and discovery of information. In response to the changed climate in the e-discovery arena, corporations are 1) enhancing their processes for issuing legal holds and tracking collections, 2) looking for ways to reduce the costs of collecting, processing and reviewing electronic data, and 3) looking upstream to reduce the volume of unneeded data through better retention policies that are routinely enforced. The new field of e-discovery management has emerged to assist companies that are overwhelmed by the requirements imposed by the new rules and the spate of legal and regulatory activity regarding e-discovery.
Currently, e-discovery management applications (EMA) rely on a variety of approaches to store electronic data for e-discovery, as shown in FIGS. 1A-C:
A. EMAs 101A store content as binary objects 102A in a database 103. Transaction information as well as file collections are typically stored in the same relational database 103 located on a database server;
B. EMAs 101B store content as content objects 102B in a content management system 104. EMAs can use a content management system (such as EMC DOCUMENTUM, EMC CORPORATION, Hopkinton, Mass.) to store unstructured content; and
C. EMAs 101C can use a local or networked file system 105 to store content as files 102C in a file system and a database to store file metadata.
Such conventional methods provide convenience and functionality, such as allowing the data to be updated, allowing it to be checked in and checked out, and so on. However, data stored for the purpose of e-discovery typically has the character of being immutable and unstructured: the data is going to be stored permanently, or at least for a very long time; it is not going to be changed, updated, or checked in or out very often; and it is typically unnecessary to organize or structure the data in a database or content base. In view of the immutable, unstructured nature of e-discovery data, such conventional storage approaches, in spite of their convenience and functionality, involve a number of disadvantages:
High hardware cost: Databases, content management systems, and local file systems are usually hosted on arrays of hard disks. The high hardware expense may be justified for transactional data, but it is exorbitant in the case of the immutable, unstructured content typically used in e-discovery;
High maintenance cost: In all of the above scenarios, maintenance requires a skilled administrator. In the case of a database, the administrator must be trained in database technology; in the case of a content management system (which usually resides on top of a database), the administrator must also be skilled in content management systems. These maintenance costs may amount to hundreds of thousands of dollars in salary and thousands in training costs. As above, such expense may be justified for transactional data but is needless in the case of immutable, unstructured content;
Extra IT (information technology) planning and coordination: Necessary disk space must be projected and purchased upfront, requiring close involvement of IT personnel, e.g. coordination between parties such as the Chief Legal Officer and the Chief Information Officer;
High capital investment: To ensure available disk space, the company has to buy more disk space than it needs at any particular time; and
Inefficiencies in cost accounting: It would be beneficial to treat storage as a cost related to a particular litigation matter as opposed to a capital expense.
Thus, there exists a need to provide a way of storing collected content in e-discovery applications that eliminates unnecessary expense and managerial and administrative overhead, achieving cost savings and simplifying operations. From an EMA vendor standpoint, it would be desirable for companies to be able to redirect a portion of their storage budgets away from purchase of storage hardware and software to purchasing low-cost storage from the EMA vendor.