1. Technical Field
This application relates to determining whether certain information is being shared in a computer network.
2. Background Information
File sharing is the practice of distributing or providing access to digitally stored information, such as computer programs, multimedia (audio, images and video), documents, or electronic books. Sharing mechanisms may include centralized servers, World Wide Web-based hyperlinked documents, or the use of file sharing networks. File sharing networks may be implemented in a variety of ways, such as using peer-to-peer technologies, BitTorrent technologies, file hosting services and the like.
File sharing continues to rank as one of the most popular Internet applications. The ability to pool resources from thousands or millions of users makes file sharing extremely attractive for a number of applications. However, such convenience and rapid accessibility to information are not without risk. In particular, users who accidentally or unwittingly share private files can find personal and other sensitive information rapidly downloaded by other users all over the world.
Most businesses collect and store sensitive information about their employees and customers, such as Social Security numbers, credit card and account information, and medical and other personal data. Many of them have a legal obligation to protect this information against inadvertent disclosure. If such information falls into the wrong hands, it can lead to fraud and identity theft. People who use P2P file sharing software can end up inadvertently sharing files. They may accidentally choose to share drives and folders that contain sensitive information, or they could save a private file to a shared drive or folder by mistake, making that file available to others. In addition, viruses and other malware can change the access to drives and folders designated for sharing, also putting private files at risk. As a result, users who intend to share only their music files may instead make other sensitive information, such as tax records, private medical records, work documents and so on, available in general circulation on file sharing networks.
The risks are very high for businesses as well as end users. For example, the United States Federal Trade Commission (FTC) has recently announced settlements against multiple companies that had illegally exposed sensitive personal information of their customers by allowing it to be shared on peer-to-peer (P2P) networks. These enforcement actions point out the serious implications of inadequate or nonexistent data privacy and security policies.
There are audit services for hire that can locate sensitive data in an organization and determine what sort of access can be gained to it via file sharing networks. Government and military end users may apply in-depth standards for classifying the sensitivity of data, such as “secret”, “top-secret” and so on. These classifications detail who can have access to the information and what level of security assurance should be implemented to protect against inadvertent disclosure.
Several problems occur when attempting to locate private files that include sensitive information on file sharing networks. The owner or custodian of the information wants to know not only whether the file as a whole is being shared, but also whether pieces of the file are being shared. For example, a long list of credit card numbers may be compromised even if only a small number of the credit card numbers are exposed. In addition, sensitive information may be rearranged or combined with other information to obfuscate it. Furthermore, the sensitive content may be split among multiple files. Finally, the private file may contain classified or other highly sensitive information, and yet the custodian of the information may wish to use commercial services to locate the information without disclosing it in its entirety.