1. Field of the Invention
Embodiments of the present invention generally relate to data protection and archival systems and, more particularly, to a method and apparatus for configuring e-discovery data items for data leakage prevention.
2. Description of the Related Art
In a computing environment for an organization, a significant amount of data is stored in data storage systems (e.g., a repository). The data may be confidential and/or privileged to the organization. The amount of data grows rapidly with the size of the organization, which leads to inefficient management of the data, for example, difficulty in discovering the data during proceedings such as litigation, legal compliance and the like. As a result, the data is stored in the repository for future use. Subsequently, the data may be subject to legal review during a litigation or case. Such data, however, is vulnerable to leakage. Further, a rise in the number of computing points (e.g., computers and servers) and easier modes of communication (e.g., Instant Messenger (IM), Universal Serial Bus (USB), cell phones) results in accidental or even intentional data leakage within or outside the organization.
Current Data Leakage Prevention (DLP) software is configured with pre-defined rules to detect and/or prevent unauthorized actions, including transmission of the data within or outside the organization. The rules in the DLP software are framed on the basis of what the organization perceives as confidential or privileged data for that organization and thus, the rules may differ between organizations. In addition, the DLP software helps identify privileged data such as the organization's intellectual property, personally identifiable information (e.g., social security numbers and credit card numbers), health records and the like.
Consequently, confidential and/or privileged data that is not defined by the rules of the DLP software remains at risk of being leaked even when the DLP software is utilized. For example, data under legal hold (during litigation) may be considered confidential data. As an example, when a data item is reviewed in the context of a court case, the data item may be identified as attorney-client communication and hence, marked or selected as “privileged” by e-discovery software. The data item may be selected manually by legal reviewers or automatically by a classification engine in the e-discovery software (e.g., SYMANTEC Discovery Accelerator). The data item, however, is not automatically configured for data leakage prevention. Furthermore, data that is to be produced in a court of law (e.g., affidavits, motions and/or the like) is not prevented from being leaked to unwanted parties.
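By way of illustration only, the gap described above may be sketched as follows. The sketch assumes a simplified rule set in which each pre-defined rule is a regular expression; the rule names, the patterns, and the functions `matched_rules` and `allow_transmission` are hypothetical and do not represent any particular DLP product.

```python
import re

# Hypothetical pre-defined rules: each maps a rule name to a pattern for
# content that an organization might treat as confidential.
RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def matched_rules(text):
    """Return the sorted names of the rules whose pattern occurs in text."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(text))


def allow_transmission(text):
    """Permit the transmission only when no pre-defined rule matches."""
    return not matched_rules(text)


# A data item covered by a pre-defined rule is blocked.
payroll_note = "Employee SSN: 123-45-6789"

# A data item marked "privileged" during legal review contains nothing the
# rules recognize, so the rule set alone lets it through.
privileged_note = "Counsel's advice on settlement strategy for the pending case."

print(matched_rules(payroll_note), allow_transmission(payroll_note))
print(matched_rules(privileged_note), allow_transmission(privileged_note))
```

In this sketch, the privileged data item is permitted to leave the organization because the selection made by the e-discovery software never reaches the rule set.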
Therefore, there is a need in the art for a method and apparatus for preventing data leakage of e-discovery data items.