In the electronic information age, people may share, access, and disseminate information in seemingly unlimited volume. The ability to disseminate information in electronic format is enormously empowering. At the same time, the workforce has become increasingly mobile, and the ubiquity of high-speed Internet access, smart mobile devices, and portable storage means that “the office” may be anywhere. As a consequence, it has become more difficult than ever for organizations to prevent the loss of sensitive data. Organizations are therefore increasingly looking to Data Loss Prevention (“DLP”) solutions to protect their sensitive data.
A typical DLP system may include a data-protocol parser, a textual-content extractor, a content-matching engine, and a rules-enforcement engine. Data analyzed by a DLP system may be processed by each of these components in turn to determine whether an enforcement action, such as blocking transmission of a file, quarantining a file, or creating a security violation, should occur. The two most computationally expensive stages in DLP may be content extraction and content matching. These stages may tax numerous resources, causing application timeouts, higher load on network processors, and central-processing-unit spikes on local systems. Because of the cost of content extraction and matching, efficient and thorough DLP may not be possible using traditional DLP systems in some situations.
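The four stages described above can be pictured as a simple pipeline. The following Python sketch is purely illustrative and is not drawn from any particular DLP product: the function names (`parse_protocol`, `extract_text`, `match_content`, `enforce`, `inspect`), the toy header/body envelope, and the single example pattern are all assumptions made for the sake of the example.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Verdict:
    action: str               # "allow" or "block" in this sketch
    matched_rules: List[str]  # names of the patterns that matched


def parse_protocol(raw: bytes) -> bytes:
    """Stage 1 (hypothetical): strip a toy header/body envelope.

    Assumes headers and body are separated by a blank line; if no
    separator is present, the input is treated as the payload itself.
    """
    _head, sep, body = raw.partition(b"\r\n\r\n")
    return body if sep else raw


def extract_text(payload: bytes) -> str:
    """Stage 2 (hypothetical): turn the payload into text.

    A real extractor may have to unpack many file formats, which is
    one reason this stage may be expensive; here it is a plain decode.
    """
    return payload.decode("utf-8", errors="replace")


# Illustrative pattern only: digits shaped like a U.S. Social Security number.
PATTERNS = {"ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")}


def match_content(text: str) -> List[str]:
    """Stage 3 (hypothetical): report which patterns occur in the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


def enforce(matches: List[str]) -> Verdict:
    """Stage 4 (hypothetical): block when any pattern matched."""
    return Verdict("block" if matches else "allow", matches)


def inspect(raw: bytes) -> Verdict:
    """Run the data through all four stages in order."""
    return enforce(match_content(extract_text(parse_protocol(raw))))
```

In this sketch, every piece of data passes through extraction and matching before any enforcement decision is made, which mirrors why those two stages may dominate the cost of inspection.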