Data Loss Prevention (DLP) is an area of computer and information security in which DLP systems identify, monitor, and protect data in use (e.g., endpoint actions), data in motion (e.g., network actions), and data at rest (e.g., data storage). Typically, a DLP system creates fingerprints of sensitive information that requires protection, and then uses the fingerprints to detect the presence of that sensitive information in files, messages, and the like. Sensitive information may be stored in a structured form, such as a database or a spreadsheet, and may include, for example, customer, employee, patient, or pricing data. Sensitive information may also include unstructured data, such as design plans, source code, CAD drawings, and financial reports.
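By way of illustration only, fingerprint-based detection of the kind described above might be sketched as follows. The sketch assumes a deliberately simple scheme in which a fingerprint is a SHA-256 hash of the whitespace-normalized, lowercased text of a whole document; the function names, the normalization step, and the sample strings are hypothetical and are not drawn from any particular DLP product, which would typically use more elaborate fingerprinting.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the normalized text.

    Normalization (lowercasing, collapsing whitespace) is an assumption
    made for this sketch, not a description of any specific DLP system.
    """
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_profile(protected_docs):
    """Create a protected-document profile: the set of known fingerprints."""
    return {fingerprint(doc) for doc in protected_docs}


def contains_sensitive(text: str, profile) -> bool:
    """Report whether the text matches a fingerprint in the profile."""
    return fingerprint(text) in profile


# Hypothetical sample data for illustration.
PROFILE = build_profile(["Q3 pricing: widget A costs $12.50 per unit."])

# Differs from the protected text only in spacing and letter case.
exact_copy = "Q3  pricing: Widget A costs $12.50 per unit."

# Differs from the protected text in wording and in the price.
edited_copy = "Q3 pricing: widget A now costs $12.75 per unit."

if __name__ == "__main__":
    print(contains_sensitive(exact_copy, PROFILE))
    print(contains_sensitive(edited_copy, PROFILE))
```

Under these assumptions, the first document is detected because it normalizes to the same text as the protected document, whereas the second document, although it carries similar pricing data, produces a different hash and is not detected.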
In order to effectively prevent the loss of sensitive information, it is important to identify whether newly introduced documents contain sensitive information that needs to be protected. However, current DLP solutions are unable to effectively classify newly introduced documents that do not exactly match an existing protected document profile.