In a typical enterprise environment, the amount of data that is maintained and processed is enormous and rapidly increasing. Information technology (IT) departments have to deal with many millions or even billions of files, in dozens of formats. Moreover, the number of files tends to grow at a significant rate (e.g., double-digit yearly growth).
With such data size and growth, IT departments need to consider a number of complex scenarios, including scenarios related to compliance, security, and storage. These scenarios are relevant for unstructured data (e.g., files), semi-structured data (e.g., files with property repositories), and structured data (e.g., databases). Often, this data is not actively managed and is kept in unstructured form in file shares.
To manage access to resources (objects) such as files, present security models rely on access control policies on the objects that grant access to legitimate users while denying access to unauthorized users. However, in addition to securing access based on business policy via an access control list (ACL) on the resource containing the data, enterprises are also looking to secure data based on content sensitivity.
By way of example, consider a file with a security policy that grants read access to several hundred users in a security group. If at some time the file contents are inadvertently updated such that the file exposes customer record data, a company may no longer want to provide such access to the entire security group. However, there is no automatic mechanism for detecting the content change and then revising the security policy.
Changed content in a file may have other implications for how a company would like data to be handled. For example, when a change in content adds sensitive data, a company may want that change to alter how the data may be distributed, such as to prevent a file containing the data from being attached to an email, or copied onto a portable storage device (e.g., a USB device) in clear text.
Preventing access and/or distribution as a result of changed content is not possible with existing security models. The result is unintended information leakage and insider data breaches, a significant issue facing a number of enterprises and the like, including those in regulated industries and in the public sector.