An important trend in securing an information technology infrastructure relates to preventing theft, disclosure, leakage, or other unauthorized propagation associated with sensitive data and information. For example, underground markets have already shown an ability to monetize data and documents improperly leaked or propagated in a manner that violates organizational policy or contravenes organizational boundaries, which has led to governments pursuing efforts to impose fines and other penalties on organizations that leak sensitive data. Consequently, an organization that inadvertently allows unauthorized access to sensitive data may be penalized on multiple fronts because fines may be imposed to penalize the data leakage, and moreover, the organization must further deal with risks that the leaked data may be improperly used to attack or otherwise compromise the organization. The problems associated with data leakage and document propagation can be expected to increase substantially in the near future because many (or most) lawmakers, regulators, security managers, and other “powers that be” have yet to realize the pace and extent to which sensitive data has become exposed and distributed.
For example, many emerging information technology services use cloud-based technologies to enable users to share files with others and transfer work between different computing environments, which can provide users with various benefits (e.g., overcoming restrictions on the size of files that can be attached to individual emails, addressing problems that arise when an inbox grows too large because many emails have large file attachments, expanding access to files beyond internal file sharing services that otherwise limit access to users that are connected locally or via virtual private networks, and making files available in different computing environments despite the fact that many information technology departments do not offer easy ways to share files via public FTP servers and other traditional methods due to security concerns). Despite the potential benefits that cloud-based services may offer, the security and usage associated therewith often violate corporate policies and security best practices. As such, organizations must assess how cloud-based technologies align with their security policies and compliance mandates and monitor usage associated with these technologies to ensure compliance and limit data exposure without undermining the benefits that these technologies offer. However, existing network security systems tend to have limitations in their ability to detect whether software to interact with cloud-based services has been installed on client computers (e.g., because the client software may not be actively uploading or downloading data when the client computers are scanned). Furthermore, existing network security systems typically cannot properly implement monitoring, encryption, and other security measures at a level that can appropriately detect and protect sensitive data from being insecurely transmitted to a cloud-based service.
For example, many cloud-based services communicate data over trusted SSL sessions, but the network security community has recently discovered several attacks that circumvented SSL security and compromised SSL certificate authorities, whereby data transmitted to or from a cloud-based service may be susceptible to improper leakage even if SSL has been properly implemented.
Moreover, the problems that relate to data leaking and documents propagating in a manner that violates policy are not unique to cloud-based services or other threats that may be external to a managed information technology infrastructure. Indeed, many data leakage and document propagation problems arise because authorized employees improperly engage in certain restricted activities, outsiders infiltrate the infrastructure to perform apparently authorized activities, or information technology resources have exploitable vulnerabilities, among other things. For example, many employees like to access their music collections at the workplace, which may raise liabilities such as potential fines or penalties due to users improperly sharing copyrighted content on the network, or network degradation because file sharing activity consumes available bandwidth to download content. In another example, many organizations may have sensitive corporate and customer data inadvertently or maliciously disclosed because the sensitive data was “too available” to employees who did not actually require access. However, existing network security systems typically cannot establish a comprehensive inventory to identify particular servers, computers, or other resources that typically host sensitive corporate and customer documents, nor can existing network security systems detect whether network traffic may include sensitive corporate and customer documents in transit, which interferes with the ability to know where sensitive content may be hosted and thereby prevent, detect, and remediate data leakage and document propagation incidents.
In particular, almost every resource within a particular network will typically generate various events to describe activity associated with the device, yet correlating events that relate to many devices distributed across a network tends to be very difficult because the events may have different formats, describe different activities, repeat certain events multiple times, or have large volumes that can be difficult to analyze in a useful manner. Furthermore, managing changes and access controls presents important challenges because certain activity patterns may reflect security breaches, compliance issues, or other risks that sensitive data and documents are being leaked or improperly propagated.
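The correlation difficulty described above can be illustrated with a minimal sketch, which assumes two hypothetical event formats (a syslog-style line and a JSON record) and hypothetical field names that are not drawn from any particular product. The sketch maps both formats onto one common tuple, discards repeated events, and groups the remainder by user so that activity spread across several devices can be reviewed together. It further assumes that timestamps are ISO 8601 strings in a single time zone, so that sorting them as text yields chronological order.

```python
import json
import re
from collections import defaultdict

# Hypothetical syslog-style format, assumed for illustration only:
#   "<timestamp> <host> <action> user=<user>"
SYSLOG_RE = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) (?P<action>\w+) user=(?P<user>\S+)$"
)


def normalize(raw):
    """Map a raw event to a common (timestamp, host, action, user) tuple.

    Returns None when the event matches neither assumed format.
    """
    raw = raw.strip()
    if raw.startswith("{"):
        # Hypothetical JSON format with keys time/device/event/account.
        record = json.loads(raw)
        return (record["time"], record["device"], record["event"], record["account"])
    match = SYSLOG_RE.match(raw)
    if match is None:
        return None
    return (match["ts"], match["host"], match["action"], match["user"])


def correlate(raw_events):
    """Normalize events, drop exact repeats, and group them by user."""
    seen = set()
    by_user = defaultdict(list)
    for raw in raw_events:
        event = normalize(raw)
        if event is None or event in seen:
            continue  # skip unparseable lines and repeated events
        seen.add(event)
        ts, host, action, user = event
        by_user[user].append((ts, host, action))
    for events in by_user.values():
        events.sort()  # text sort; chronological under the ISO 8601 assumption
    return dict(by_user)
```

For instance, passing `correlate` a list that contains the same syslog-style line twice and one JSON record for the same account would yield a single entry for that user holding two events. A practical system would need to handle many more formats, clock skew between devices, and volumes that do not fit in memory, none of which this sketch attempts.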
Accordingly, network security practitioners and managers are continuously presented with the difficult task of controlling certain risky activities that can be performed on a network without restricting those activities to the extent that potentially valuable business opportunities may be disrupted. In the network security context, probabilities are rarely simple, which tends to require network security practitioners to weigh the estimated likelihood that vulnerabilities may be exploited against the estimated business benefits that the activities exposing those vulnerabilities may offer. In other words, properly managing a network involves a delicate balance between ensuring that users have the freedom to perform activities that will benefit the business and employing measures that can properly prevent, detect, and mitigate the risks that may arise if data or documents leak or otherwise propagate across organizational boundaries in a manner that violates policy. However, existing network security systems tend to fall short in managing these problems due to the complexity involved in suitably classifying all the resources that are hosted on or interact with an information technology infrastructure, identifying where certain files are located in the infrastructure, and detecting atypical deviations that relate to certain files appearing in suspicious places or moving from one location to another.