Information within organizations and entities is often classified as sensitive either for business reasons or for legal reasons. This information may reside within text files, databases, images, pictures, etc. In addition to the potential threat of an unscrupulous party illegally accessing the organization from the outside via an electronic network, and then removing or disrupting the information, there exists the risk of intentional or inadvertent transmission of the sensitive information from inside the organization to the outside. For example, a disgruntled employee might send a sensitive data file to which he or she has access to an outside party via e-mail, thus causing harm to the organization.
In addition to simple business reasons for not wanting sensitive information to be released, i.e., the desire to keep trade secrets secret, many new government regulations mandate controls over information (requiring that the sensitive information not be released outside the company), and companies must comply or face significant penalties. For example, HIPAA regulates health information, Basel II regulates financial information, Sarbanes-Oxley regulates corporate governance, and a large number of states have passed data privacy laws requiring organizations to notify consumers if their information is released. Companies are even subject to regular information technology audits, which they can fail if they do not employ suitable controls and standards.
Technology companies have reacted to this environment with a host of data loss prevention (DLP) products. These products are typically hardware/software platforms that monitor for sensitive information and prevent it from being leaked outside the company. DLP is also known as data leak prevention, information leak prevention, etc. Gateway-based DLP products are typically installed at the company's Internet network connection and analyze outgoing network traffic for unauthorized transmission of sensitive information. These products typically generate a unique signature of the sensitive information while it is stored within the company, and then search for that signature in information passing out over the network boundary. Host-based DLP products typically run on end-user workstations within the organization. These products can address internal as well as external release of information and can also control information flow between groups of users within an organization. These products can also monitor electronic mail and instant messaging communications and block them before they are sent.
Detecting and preventing the leaking of sensitive images can be especially problematic. FIG. 1 illustrates a prior art technique. In this example, image 10 is a sensitive image that the company wishes to keep within the company. To that end, the company employs a data loss prevention product 20 that has generated a unique signature for image 10 while the image is stored within the company. For example, any suitable hash function, such as the MD5 algorithm, may be used to generate a unique signature. When a user attempts to send 62 the image to an untrusted entity 30 outside of the company boundary, the DLP product 20 automatically generates the signature of the image to be sent and compares it against a list of sensitive image signatures. A check 64 reveals that the signatures match and the image is blocked 66 from being sent.
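The signature check of FIG. 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation of any particular DLP product: the names `signature`, `SENSITIVE_SIGNATURES`, and `allow_outbound` are hypothetical, and the byte string standing in for image 10 is placeholder data.

```python
import hashlib


def signature(data: bytes) -> str:
    """Return the MD5 digest of the raw file bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


# Placeholder bytes standing in for sensitive image 10 (hypothetical data).
image_10 = b"\x89PNG placeholder bytes for sensitive image 10"

# Signatures registered while the image is stored within the company.
SENSITIVE_SIGNATURES = {signature(image_10)}


def allow_outbound(data: bytes) -> bool:
    """Return False (block) when outgoing data matches a registered signature."""
    return signature(data) not in SENSITIVE_SIGNATURES


print(allow_outbound(image_10))            # False: signatures match, image is blocked
print(allow_outbound(b"unrelated bytes"))  # True: no match, traffic passes
```

Because the comparison is an exact match on a digest of the file bytes, the check succeeds only when the outgoing file is byte-for-byte identical to the registered one.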
An unscrupulous user, however, may use an image editor 40 to make minor changes 11, 12 to image 10, producing a modified image 10′. Or, the user may simply convert the image to another image format. The user then attempts to leak the modified image by sending it 72 to an untrusted party 30 outside the company boundary. When the DLP product 20 checks 74 the signature of the modified image against the list of sensitive image signatures, there is no match, because the modifications give image 10′ a different signature than original image 10. The image is then passed 76 through the company boundary to the untrusted party.
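The weakness can be demonstrated with a short Python sketch. It assumes, purely for illustration, that the signature is an MD5 digest of the raw file bytes; the byte strings standing in for image 10 and image 10′ are placeholders, and the names `registered` and `is_blocked` are hypothetical.

```python
import hashlib

# Placeholder bytes standing in for original image 10 (hypothetical data).
original = bytes(range(256)) * 4

# Flip a single bit, standing in for the minor changes 11, 12 in image 10'.
edited = bytearray(original)
edited[100] ^= 0x01
modified = bytes(edited)

# Signature list built from the original image only.
registered = {hashlib.md5(original).hexdigest()}


def is_blocked(data: bytes) -> bool:
    """Return True when the data's MD5 digest is on the registered list."""
    return hashlib.md5(data).hexdigest() in registered


print(is_blocked(original))  # True: the unmodified image is caught
print(is_blocked(modified))  # False: a one-bit change evades the check
```

A hash function of this kind is designed so that any change to the input, however small, produces an unrelated digest, which is why a visually identical image passes through unnoticed.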
What is desired is an improved technique for preventing the loss of sensitive images.