In the electronic information age, people may share, access, and disseminate high volumes of information. The ease with which information may be disseminated electronically is empowering. At the same time, the workforce has become increasingly mobile, and the ubiquity of high-speed Internet access, smart mobile devices, and portable storage means that “the office” may be anywhere. As a consequence, it has become more difficult than ever for organizations to prevent the loss of sensitive data. Organizations are therefore increasingly looking to data loss prevention (“DLP”) solutions to protect their sensitive data.
Traditional DLP systems may intercept data at a variety of points in a computing system in an effort to detect and regulate the flow of, and access to, sensitive data. Some traditional DLP systems may allow administrators to define keywords and/or regular expressions to identify potentially sensitive documents. Additionally or alternatively, some traditional DLP systems may employ classifiers generated with machine learning techniques. For example, these DLP systems may use training documents supplied by administrators to generate classifiers, and then apply the classifiers to documents to make DLP classifications. Unfortunately, these machine-learning-based classifiers may function without administrators understanding the basis for many of their classifications. Furthermore, these classifiers may yield an unacceptable rate of false positives. Accordingly, the instant disclosure identifies and addresses a need for systems and methods for transparent data loss prevention classifications.
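By way of illustration only, the keyword and regular-expression approach described above may be sketched as follows. This is a minimal sketch and not an implementation taken from the instant disclosure: the names `Rule`, `Verdict`, `RULES`, and `classify`, as well as the two example patterns, are hypothetical and chosen solely for illustration. The sketch returns the names of the rules that matched, which shows one way a rule-based classification may expose its basis to an administrator.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    """An administrator-defined rule (hypothetical structure)."""
    name: str
    pattern: "re.Pattern[str]"


@dataclass
class Verdict:
    """Classification result plus the rules that produced it."""
    sensitive: bool
    matched_rules: List[str]


# Example rules; both patterns are assumptions made for this sketch.
RULES = [
    Rule("keyword:confidential", re.compile(r"\bconfidential\b", re.IGNORECASE)),
    Rule("regex:ssn-like", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def classify(text: str, rules: List[Rule] = RULES) -> Verdict:
    """Flag text as sensitive if any rule matches, and report which ones."""
    matched = [rule.name for rule in rules if rule.pattern.search(text)]
    return Verdict(sensitive=bool(matched), matched_rules=matched)


if __name__ == "__main__":
    verdict = classify("This memo is CONFIDENTIAL.")
    print(verdict.sensitive, verdict.matched_rules)
```

A classifier generated with machine learning techniques would typically replace the explicit rule list with a trained model, which is where the basis for a given classification may become harder for an administrator to inspect.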