A modern organization typically maintains a data storage system to store and deliver sensitive information concerning various significant business aspects of the organization. Sensitive information may include data on customers (or patients), contracts, deliveries, supplies, employees, manufacturing, or the like. In addition, sensitive information may include intellectual property (IP) of an organization such as software code developed by employees of the organization, documents describing inventions conceived by employees of the organization, etc.
Data loss prevention (DLP) technologies apply configurable rules to identify objects, such as files, that contain sensitive data and should not be found outside of a particular enterprise or specific set of host computers or storage devices. Even when these technologies are deployed, it is possible for sensitive objects to "leak". Occasionally, leakage is deliberate and malicious, but often it is accidental. For example, in today's global marketplace environment, a user of a computing system transmits data, knowingly or unknowingly, to a growing number of entities outside a computer network of an organization or enterprise. Previously, the number of such entities was very limited, and they resided within a very safe environment. For example, each person in an enterprise would have just a single desktop computer, with a limited number of software applications installed on the computer, each with predictable behavior. More recently, communications between entities have become more complex and may be difficult for a human to monitor. For example, the mobile applications market is expected to exceed 20 billion dollars in upcoming years. It has become more common for users to install mobile applications on their mobile devices, such as handheld devices, mobile smart phones, tablets, netbooks, etc. Not all of these mobile applications (commonly referred to as apps) are developed by reliable entities. These apps may need to send data, for example, stock portfolio details, credit card details, health details, or other sensitive information, to a server computing system to provide certain functionality. Also, users are continuously exchanging data with each other via social networking sites such as Facebook, MySpace, etc. In addition, various backup and security products installed on user machines may continuously send user data to a backup or security server.
In other situations, when a user experiences a software crash, such as a browser crash or an application crash, the application may ask the user to report the crash to the software provider, which requires the user to transmit logs to an external entity. Although the software provider may promise to preserve anonymity, this information may still be exposed to humans for viewing. For example, if a browser crashed while transmitting credit card data, a bank balance, etc., sending the log may expose the credit card and bank balance information.
Existing security techniques fail to provide efficient solutions that can protect organizations in the situations described above. Existing DLP technologies do not have a way to categorize the types of data that are appropriate for the particular destination entities receiving the data. For example, classifying entities as malicious or non-malicious is insufficient because all of these entities are presumably performing a useful service, as permitted by the user, and legitimately receive some data. These entities, however, are supposed to receive only the types of data needed to perform that service. For example, a music application should only access and send music-related information, such as a playlist of music, to its server computing system. If another type of information, e.g., tax information, is saved in the same folder, the music application should not transmit that information to the server computing system. Existing security techniques do not provide an efficient solution to prevent an application from transmitting data of other types to destination entities that should not be receiving it. Furthermore, these complex communications of different data types may occur between many different entities, and existing DLP technologies do not distinguish between the types of data being sent to these entities.
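The distinction drawn above, between judging an entity as simply malicious or non-malicious and judging whether a given type of data belongs at a given destination, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the destination names, the data categories, the `Transfer` type, and the `is_permitted` helper are hypothetical and do not describe any existing DLP product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """A hypothetical outbound transfer: where data goes and what kind it is."""
    destination: str
    data_category: str


# Hypothetical policy: each destination entity is mapped to the data
# categories it needs in order to perform its service.
ALLOWED_CATEGORIES = {
    "music-app-server": {"music"},
    "tax-app-server": {"tax", "financial"},
}


def is_permitted(transfer, policy=ALLOWED_CATEGORIES):
    """Return True only if the destination is expected to receive this category.

    Destinations absent from the policy are allowed no categories at all.
    """
    allowed = policy.get(transfer.destination, set())
    return transfer.data_category in allowed


# The music application may send a playlist to its own server...
print(is_permitted(Transfer("music-app-server", "music")))
# ...but not tax information stored in the same folder.
print(is_permitted(Transfer("music-app-server", "tax")))
```

In this sketch the music application's server is a trusted, non-malicious destination, yet the tax transfer is still refused, because the decision depends on the pairing of data type and destination rather than on the reputation of the destination alone.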