A modern organization typically maintains a data storage system to store and deliver records concerning various significant business aspects of the organization. Stored records may include data on customers (or patients), pricing, contracts, deliveries, supplies, employees, manufacturing, etc. A data storage system of an organization may utilize relational databases, client/server applications built on top of relational databases (e.g., Oracle, Microsoft, Siebel, SAP), object-oriented databases, object-relational databases, document stores and file systems that store table-formatted data (e.g., CSV files, Microsoft Excel spreadsheet files), password systems, single-sign-on systems, etc.
A data storage system of an organization typically runs on a computer connected to a local area network (LAN). This computer may be made accessible to the Internet via a firewall, router, or other packet switching device. Although the connectivity of a storage system to the network provides for more efficient utilization of information maintained by the storage system, it also poses security problems due to the highly sensitive nature of this information. In particular, because access to the contents of the data storage system is essential to the job function of many employees in the organization, there are many possible points of theft or accidental distribution of this information. Theft of information represents a significant business risk in terms of both the value of the intellectual property and the legal liabilities related to regulatory compliance. In order to prevent malicious and unintentional data breaches, commercial and government regulations often impose restrictions on how confidential data may be stored, the format of confidential data, who can access that confidential data, and whether confidential data may be transmitted (e.g., by email). In order to comply with these regulations, companies create policies that govern how confidential data is stored in the various applications, in what format it is stored, and who can access it, and that prevent transmission of confidential data.
Some conventional systems employ a blanket Data Loss Prevention (DLP) policy to accurately detect policy violations in transfers of the protected information between parties. The DLP policy includes conditions that define the information to be protected (e.g., confidential information) and an action that needs to be taken when any event matches these conditions. The protected information can leave an organization through web operations, such as Hypertext Transfer Protocol (HTTP) or HTTPS requests, or through file transfers, such as those using File Transfer Protocol (FTP). The protected information can also leave an organization through other network protocols, such as Simple Mail Transfer Protocol (SMTP), Transmission Control Protocol (TCP), etc. While the Internet has provided a platform for information sharing and communication, it has also opened paths for confidential data to leave an organization in day-to-day activities on websites, such as through webmail applications, blogs, and discussion groups. Loss of sensitive data through websites can be controlled by blocking the outgoing web request that triggers the DLP violation and sending a notification back to the user as the response to this web request. This scheme may work well for web requests sent to non-interactive websites, where the web requests are generated as a result of end-user action. This conventional scheme, however, may be unfavorable for web requests sent to highly-interactive websites (e.g., Web 2.0 websites). For example, Web 2.0 websites allow users to do more than just retrieve information; they can also provide “Network as platform” computing, allowing users to run web-based applications entirely through a web browser.
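The condition-and-action structure of such a DLP policy, and the conventional block-and-notify response, can be illustrated with a minimal Python sketch. All names here (DlpPolicy, WebRequest, Response, enforce), the social-security-number pattern used as the condition, and the 403 status code are assumptions made for illustration only; they do not come from any particular DLP product.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class DlpPolicy:
    """Hypothetical policy: a condition (pattern) plus an action."""
    name: str
    pattern: re.Pattern  # condition defining the protected information
    action: str          # action taken when the condition matches


@dataclass
class WebRequest:
    """Hypothetical outgoing web request, reduced to the fields used here."""
    method: str
    url: str
    body: str


@dataclass
class Response:
    """Hypothetical response returned to the user in place of the real one."""
    status: int
    body: str


def enforce(policy: DlpPolicy, request: WebRequest) -> Optional[Response]:
    """Return a blocking notification if the request violates the policy.

    Returns None when the request may be forwarded unchanged.
    """
    if policy.pattern.search(request.body) is None:
        return None  # condition not met: no violation
    if policy.action == "block":
        # Conventional scheme: drop the request and answer it with a
        # generic notification (the status code is an assumption).
        return Response(403, f"Request blocked by DLP policy '{policy.name}'.")
    return None  # other actions are outside this sketch


# Example condition (an assumption): anything shaped like a US SSN.
SSN_POLICY = DlpPolicy("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "block")

if __name__ == "__main__":
    post = WebRequest("POST", "http://example.com/blog", "my ssn is 123-45-6789")
    print(enforce(SSN_POLICY, post))
```

In this sketch the notification is a plain-text message. A web-based application that issued the request in the background and expects a structured reply would receive this generic text instead, which is the situation in which the conventional scheme becomes problematic.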
Although sensitive data leaving an organization through websites can be blocked by conventional DLP solutions, highly-interactive interfaces, as seen in highly-interactive websites, do not handle this block operation effectively. The block can lead to a poor user experience by making the web-based interactive application unstable or even causing it to crash. It is also hard to decipher application errors caused by the block operation. Creating customized responses for each Web 2.0 website for DLP purposes would be an arduous task, and such responses can easily become invalid once the website changes its underlying technology.