1. Field of the Invention
This invention pertains in general to computer security, and more specifically to detecting confidential data that has been transformed to evade detection.
2. Description of the Related Art
The environments in which employees of a company work have changed dramatically over the years. Employees now commonly have mobile devices, including laptop computers and other devices, that make it possible for them to work remotely and communicate with one another from almost anywhere. Company data is constantly being accessed and shared by employees all over the world. Enhanced mobility and improved communication channels have revolutionized the way companies function.
Yet, with these mobility and communication enhancements come increased risks that sensitive or proprietary information can be unintentionally or maliciously transmitted outside of the company. Authorized users can accidentally send out confidential information or otherwise compromise sensitive data when communicating with other users. Similarly, malicious users inside or outside the company can intentionally transmit company proprietary data outside of the company network for unauthorized use. In either case, the company suffers a loss due to the potential exposure of its confidential data, including a possible loss of intellectual property rights, a risk of lawsuits due to release of a client's private data, a threat of malicious usage of the data against the company or its clients, and many other troubling possibilities.
Given the immensity of the problems associated with loss of confidential data, it is essential that companies prevent these types of losses. Data loss prevention (DLP) products are one mechanism for curbing data loss. DLP products identify confidential information in a number of ways (e.g., deep content analysis using dictionaries, keywords, or regular expressions; partial document fingerprinting; etc.). Network DLP products, or gateway-based solutions, generally run at a company's Internet connection and analyze network traffic for transmissions of confidential data. Host-based DLP products run on end-user workstations or company servers and manage information flow between users, including controlling email and other communications.
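By way of illustration only, the keyword and regular-expression style of content analysis mentioned above might be sketched as follows. The keyword list, the SSN-like pattern, and the function name scan_text are hypothetical and chosen for the example; they are not taken from any particular DLP product.

```python
import re

# Hypothetical policy: these keywords and patterns are illustrative assumptions.
KEYWORDS = ("confidential", "proprietary")
PATTERNS = {
    # Matches strings shaped like a U.S. Social Security number, e.g. 123-45-6789.
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text):
    """Return the sorted names of all policy rules that match the text."""
    hits = set()
    lowered = text.lower()
    # Keyword check: case-insensitive substring match.
    for word in KEYWORDS:
        if word in lowered:
            hits.add("keyword:" + word)
    # Pattern check: regular-expression search over the original text.
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            hits.add("pattern:" + name)
    return sorted(hits)
```

A network or host-based product would presumably apply such a check to text extracted from outgoing traffic or files; a non-empty result would mark the content as potentially confidential.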
While generally effective for data loss prevention, DLP products typically cannot detect confidential data that has been transformed to evade detection. DLP products analyze text and can crack well-known file formats to extract textual information from files. However, transformations of data that are unknown or not easily reversed can make files inaccessible to DLP products by encrypting or otherwise protecting all or part of the files' content from being read or interpreted, thereby evading the desired analysis. Password-based file encryption, for example, or even a simple XOR encryption, can make a file inaccessible to DLP applications. This leaves substantial holes in standard DLP solutions: a user can transform and thus hide confidential data, and then easily transmit it outside of the company without DLP detection.
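The evasion described above can be illustrated with a minimal sketch, assuming a content check based on a single SSN-like regular expression and a single-byte XOR key. The pattern, the key value 0x5A, and the function names are hypothetical choices for the example.

```python
import re

# Hypothetical content rule: a byte pattern shaped like 123-45-6789.
SSN_LIKE = re.compile(rb"\d{3}-\d{2}-\d{4}")


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def contains_sensitive(data: bytes) -> bool:
    """Return True if the content rule matches the raw bytes."""
    return SSN_LIKE.search(data) is not None


original = b"Client SSN: 123-45-6789"
transformed = xor_bytes(original, 0x5A)   # the sender hides the data
recovered = xor_bytes(transformed, 0x5A)  # the recipient reverses the transform
```

In this sketch the check matches the original bytes but not the transformed bytes, even though the recipient can trivially recover the original content with the same key.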
Therefore, there is a need in the art for a solution that controls transmission of transformed data outside of a company to prevent release of confidential data.