Data Loss Prevention (DLP) is an area of computer and information security in which DLP systems identify, monitor, and protect data in use (e.g., endpoint actions), data in motion (e.g., network actions), and data at rest (e.g., data storage). Typically, a DLP system creates fingerprints of sensitive information that requires protection, and then uses the fingerprints to detect the presence of that sensitive information in various files, messages and the like. Sensitive information may be stored in a structured form such as a database, a spreadsheet, etc., and may include, for example, customer, employee, patient or pricing data. In addition, sensitive information may include unstructured data such as design plans, source code, CAD drawings, financial reports, etc.
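By way of illustration only, the fingerprint-and-detect approach described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any particular DLP system: it assumes fingerprints are SHA-256 hashes of overlapping four-word sequences, and the names shingles, fingerprint, contains_sensitive, k and threshold are hypothetical and chosen for this example.

```python
import hashlib
import re


def shingles(text, k=4):
    """Split text into overlapping k-word sequences (assumed granularity)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Text shorter than k words yields at most one shingle.
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def fingerprint(text, k=4):
    """Return a set of hashes, so the protected text itself need not be stored."""
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in shingles(text, k)}


def contains_sensitive(candidate, protected_fingerprints, k=4, threshold=1):
    """Flag the candidate if enough of its shingle hashes match protected ones."""
    matches = fingerprint(candidate, k) & protected_fingerprints
    return len(matches) >= threshold


# Hypothetical protected record and two messages to check.
protected = fingerprint("Customer Jane Doe account number 12345 credit limit 5000")
leaky = "FYI: customer jane doe account number 12345 is overdue"
benign = "Lunch at noon on Friday in the cafeteria"

print(contains_sensitive(leaky, protected))
print(contains_sensitive(benign, protected))
```

In this sketch, hashing word sequences rather than whole documents allows a partial copy of protected content to be detected, and the threshold parameter trades sensitivity against false positives.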
To effectively prevent the loss of sensitive information, it is important to scan newly introduced and newly modified documents to determine whether they contain sensitive information that needs to be protected. However, current DLP solutions are unable to effectively identify which documents have already been scanned and which documents have been introduced or modified since the last scan took place. As a result, each scan performed by a current DLP solution must re-scan documents that have already been scanned, which is inefficient.