Field
The subject matter discussed herein relates generally to methods and systems of data loss prevention while also preserving privacy, and more particularly, to a protocol between an external host and an enterprise to perform data loss prevention in a manner that protects private information.
Related Art
In the related art, enterprises must employ techniques to prevent leakage, loss or oversharing of sensitive data. Related approaches include data loss prevention or detection (“DLP”) systems, in which enterprises implement algorithms to detect and prevent data loss. The related art algorithms include straight or context-aware pattern matching approaches. For example, Enterprise A may wish to protect the privacy of data, such as account names, numbers, or other private information. Thus, for any data leaving the enterprise (e.g., email, download, export), Enterprise A runs software containing the algorithm to attempt to determine whether data loss has occurred (e.g., sending of account numbers, names, etc.).
However, the related art approaches have various problems and disadvantages. For example, related art pattern matching algorithms are not capable of determining which data is considered sensitive by the entity being protected (e.g., the enterprise itself, or a user at the enterprise). Thus, the related art approaches result in missed detections and a high level of noise. In the above example, there is a risk that Enterprise A may fail to detect a candidate for data loss, or may characterize outgoing data as a data loss when it is not actually a data loss. In one situation, a member of Enterprise A may have private information (e.g., personal account information not related to Enterprise A) that appears similar to the information owned by Enterprise A. Due to the similarity, Enterprise A characterizes the private information of the member as a data leakage when it is not a data leakage.
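The noise and missed-detection problems of straight pattern matching can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption made for illustration and is not taken from any particular DLP product: the 16-digit pattern, the function names, the sample account numbers, and the sample email text.

```python
import re

# Hypothetical detection pattern: a 16-digit number, optionally grouped
# in fours by spaces or hyphens. Not taken from any real DLP product.
ACCOUNT_PATTERN = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")

# Hypothetical account numbers that Enterprise A actually owns.
ENTERPRISE_ACCOUNTS = {"4000123412341234"}


def find_candidate_leaks(outgoing_text):
    """Return every substring of the outgoing text that matches the pattern."""
    return ACCOUNT_PATTERN.findall(outgoing_text)


def normalize(number):
    """Strip separators so matches can be compared with the owned accounts."""
    return re.sub(r"\D", "", number)


# An outgoing email containing a member's personal number and an enterprise number.
email = (
    "My personal card is 5555-6666-7777-8888; "
    "the client account is 4000 1234 1234 1234."
)

flagged = find_candidate_leaks(email)

# Noise: matches that look like account numbers but are not owned by Enterprise A.
false_positives = [m for m in flagged if normalize(m) not in ENTERPRISE_ACCOUNTS]

# Missed detection: the same enterprise number written with an unexpected separator.
missed = find_candidate_leaks("Account ref 4000.1234.1234.1234")

print(flagged)
print(false_positives)
print(missed)
```

In this sketch the pattern flags both numbers in the email because it cannot tell whose data it is matching, so the member's personal number is reported as a leak. The enterprise number written with dots is not flagged at all. Separating the two cases requires knowledge of the actual sensitive data (here, `ENTERPRISE_ACCOUNTS`), which a pattern alone does not carry.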
Other related art approaches involve external hosts, such as DLP vendors or hosting companies, including cloud service providers, which receive a policy or rules from the enterprise or data owners that specify the sensitive data that is a potential leak. For example, Enterprise A may contract with External Host B to check the information of Enterprise A for data leakage, and to report the result to Enterprise A.
However, these related art approaches also have problems and disadvantages. For example, the enterprise or data owner must provide the sensitive data to a third party, which results in a risk of disclosing the sensitive data to the DLP external host (e.g., loss prevention system vendor, servicing or hosting company). In the above example, External Host B has the sensitive data of Enterprise A. Thus, the sensitive data of Enterprise A is disclosed to a third party outside of the firewall of Enterprise A, and Enterprise A cannot control what happens to the sensitive data at External Host B. Further, if detection patterns of Enterprise A are disclosed to the external DLP host instead of the actual sensitive data, the result may also be a high level of noise (e.g., private information incorrectly characterized as data leakage).