With the widespread use and growth of networking with computers and communication systems, diverse issues relating to privacy, data security, fiduciary and other concerns have led to the establishment of various laws, rules, regulations and standards for various industries. Encouraging and enforcing compliance with these requirements has become a significant endeavor. Compliance networking has thus become a lively, well established field. Compliance networking generally refers to methods implemented or actions taken at the network level to help ensure compliance with the aforementioned laws, rules, regulations, standards, etc.
For instance, confidentiality is an important, perhaps crucial concern to medical patients and social services clients. Thus, health care and social related entities such as commercial, non-profit and governmental hospitals, clinics, professional offices, pharmacies, welfare offices, etc. now typically operate with strict compliance standards in place to protect their patients' and clients' privacy interests. Special attention has been given for networks to assist in meeting such compliance standards.
Similarly, commercial businesses and financial institutions such as banks, credit unions, government revenue offices, etc. now typically operate with strict compliance standards in place to protect their own and their clients' privacy and financial interests. Further, technical, legal, military and other entities now typically operate with strict compliance standards in place to protect the security of their data, code, etc. As these examples illustrate, regulatory compliance has become a significant issue across a broad spectrum of modern activities. Inasmuch as networks have become nearly ubiquitous, compliance networking has also become important in various industries.
Driven by standards and associated regulations, compliance networking equipment (hereinafter compliance equipment) is being used increasingly in an attempt to detect leakage of sensitive information. In the examples above alone, numerous kinds of information are monitored, including intellectual property (IP) such as source code and designs, confidential information such as patient records, social security, credit card and bank account numbers, and classified military data, among others. Compliance equipment is also useful in monitoring for improper information transmittals, which may include improperly accessed IP (e.g., software, music and movie downloads, etc.), objectionable matter (e.g., racist material, pornography, gambling, etc.), spam and other improper email and the like.
Compliance equipment typically monitors information traffic at gateway network access devices such as routers and switches that reside near the edge of a network. In this conventional configuration, the compliance equipment thus monitors traffic flowing out to and in from the Internet or another network. Compliance equipment thus detects information leakage in outgoing network traffic and records and reports the source of that leakage.
In monitoring the traffic, the compliance equipment examines the constituent packets of the traffic and effectively tries to reconstruct what that traffic comprises. In some instances (e.g., installations, situations, configurations, etc.), compliance equipment may effectively perform this function passively, e.g., without necessarily stopping or significantly impeding the information flow. For example, while the compliance equipment may record and report the leakage source, it does not necessarily stop the information from flowing out to the Internet or elsewhere.
However, in other instances, compliance equipment may intercept and capture information traffic deemed to violate a compliance standard. Thus, compliance equipment may actively deter release of violative or other non-compliant traffic. For example, in addition to recording and reporting a leakage source, compliance equipment can actively deter release of non-compliant traffic, e.g., effectively impeding or blocking the traffic from flowing out to and/or in from the network.
Compliance equipment is typically placed either in series with network information traffic, such as between two routers, switches, etc., or in an effectively off-line, tap and/or substantially parallel configuration relative thereto, wherein it essentially taps the network traffic to listen thereto (e.g., snoop on, eavesdrop upon, etc.). A variety of kinds of compliance equipment are currently used, each approaching compliance networking issues from a unique perspective and performing a specialized, distinguishable (e.g., differentiable) function related thereto.
Compliance equipment includes three kinds of surveillant systems: detection only devices, forensic devices and prevention devices. Detection only devices examine virtually all network traffic flowing through a gateway and record policy violations that they observe, typically in real time. Forensic devices endeavor to capture everything passing through, typically for off line (e.g., other than real time) scrutiny. Prevention devices block the flow of traffic that violates a compliance policy that they have been programmed to enforce.
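The behavioral difference among the three kinds of devices described above can be sketched as follows. This is a minimal illustration only: the names `Mode`, `ComplianceDevice` and `handle`, and the idea of passing in a precomputed `violates_policy` flag, are assumptions made for this example and do not describe any particular product.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    DETECT = "detection-only"
    FORENSIC = "forensic"
    PREVENT = "prevention"


@dataclass
class ComplianceDevice:
    """Hypothetical model of one compliance device at a gateway."""
    mode: Mode
    violations: list = field(default_factory=list)  # recorded policy violations
    archive: list = field(default_factory=list)     # traffic captured for off-line review

    def handle(self, payload: str, violates_policy: bool) -> bool:
        """Return True if the payload may continue past the gateway."""
        if self.mode is Mode.FORENSIC:
            # Forensic devices capture everything and never impede the flow.
            self.archive.append(payload)
            return True
        if violates_policy:
            # Detection-only and prevention devices both record the violation.
            self.violations.append(payload)
            if self.mode is Mode.PREVENT:
                return False  # only prevention devices block the traffic
        return True
```

In this sketch a detection-only device records a violation but lets the traffic through, a forensic device archives every payload regardless of policy, and a prevention device both records and blocks.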
While their perspectives and functions may vary, all three kinds of compliance equipment share some commonalities. For instance, each kind (e.g., type) of device is positioned effectively at the edge of a network, such as a business entity's or government agency's firewall, a department's or command's edge router, etc. Typically, the compliance device is practically (e.g., physically) located proximate to premises (e.g., offices, facilities, etc.) of an entity's information technology (IT) or like department. So deployed, however, the compliance device is accessible (e.g., internally) to the people therein. This internal exposure can itself pose issues relating to compliance networking, such as where a compliance policy forbids IT personnel from having such proximity and access, e.g., to confidential personal information not releasable outside of a human resources or legal department.
The various types of compliance equipment also all take in virtually all of the traffic that passes through the gateway device, firewall, etc. with which it is associated. Thus to effectively monitor this traffic, their networking interfaces must match the peak bandwidth of the gateway's or firewall's flow through. High traffic volumes can thus raise issues relating to scalability, for instance where compliance equipment is used for surveilling a very large and/or active network.
Currently available compliance equipment has typical traffic handling capacities on the order of 100-400 megabits per second. However, large modern corporate, financial, government, academic, scientific and other networks may reach peak traffic levels on the order of gigabits per second. To effectively handle such high gateway bandwidths, efficiency in performing compliance related processing and other functions can be a significant factor. Efficiency can be especially significant where an active, high bandwidth gateway is monitored with relatively modest compliance equipment.
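As a rough illustration of the capacity gap described above, the following minimal sketch computes what fraction of a gateway's peak traffic a device could inspect if it had to take in everything. The function name and the sample figures (a gateway peaking at 1,000 units per second and a device handling 400) are assumptions chosen for illustration; the only requirement is that both rates be expressed in the same unit.

```python
def inspectable_fraction(gateway_peak: float, device_capacity: float) -> float:
    """Fraction of peak gateway traffic that fits within the device's capacity.

    Both arguments must use the same unit of rate (for example, megabits per second).
    """
    if gateway_peak <= 0 or device_capacity <= 0:
        raise ValueError("rates must be positive")
    # A device at least as fast as the gateway can inspect all of the traffic.
    return min(1.0, device_capacity / gateway_peak)


# Hypothetical figures: gateway peak of 1000, device capacity of 400.
fraction = inspectable_fraction(1000.0, 400.0)
```

With these assumed figures the fraction comes to 0.4, meaning that at peak load more than half of the traffic would go unexamined unless the device is made more efficient or the traffic is reduced before inspection.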
To achieve performance efficiency, compliance equipment is typically programmed to classify network traffic and to handle its various classifications according to some discriminating scheme. A filtering process can focus the efficient use of compliance equipment bandwidth and processing resources. Thus, certain kinds of traffic are effectively ignored and heightened scrutiny is applied, e.g., in some efficient (e.g., controllable, reserved, economical, etc.) fashion, to other particular kinds. Filter devices used with compliance equipment are typically programmed to function according to one or more of several parameters.
For instance, filtering may be performed on the basis of protocol, size and/or destination related information such as Internet Protocol (IP) addresses. Thus, traffic conforming to a certain programmed protocol, such as Simple Mail Transfer Protocol (SMTP), or traffic of a certain size characteristic, such as all files below one kilobyte (1 kB), is ignored. Similarly, traffic addressed to a particular range or list of IP subnets, addresses, etc., such as those associated with a competitor, a foreign entity, a suspect designation or destination, etc. is examined more closely.
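A filter of the kind just described can be sketched as below. This is a simplified illustration under stated assumptions: the `Flow` record, the `classify` function, the three result labels, the 1,000-byte reading of "one kilobyte" and the watched subnet (203.0.113.0/24, a range reserved for documentation) are all hypothetical. Checking the destination before the protocol and size rules is likewise a design choice made here, not something the description prescribes.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical summary of one unit of traffic seen at the gateway."""
    protocol: str
    size_bytes: int
    dst_ip: str


IGNORED_PROTOCOLS = {"SMTP"}   # assumed: protocols skipped by policy
MIN_SIZE_BYTES = 1000          # assumed: ignore anything below 1 kB
WATCHED_SUBNETS = [ipaddress.ip_network("203.0.113.0/24")]  # assumed watch list


def classify(flow: Flow) -> str:
    """Return 'scrutinize', 'ignore' or 'default' for a flow."""
    dst = ipaddress.ip_address(flow.dst_ip)
    # Destination rule first, so watched addresses are never silently ignored.
    if any(dst in net for net in WATCHED_SUBNETS):
        return "scrutinize"
    if flow.protocol.upper() in IGNORED_PROTOCOLS:
        return "ignore"
    if flow.size_bytes < MIN_SIZE_BYTES:
        return "ignore"
    return "default"
```

For example, under these assumptions `classify(Flow("HTTP", 50_000, "203.0.113.7"))` yields `"scrutinize"`, while a small file or an SMTP message bound for an unwatched address is ignored.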
Given the breadth of the spectrum of modern activities illustrated by the examples above and the sheer volume of network traffic, the number of classifications into which network traffic may fall is large. The variety of information that may be “interesting,” e.g., worthy of compliance based scrutiny, is large as well. Conventional compliance equipment can be optimized to scan a large volume of various types of traffic, but may then be constrained to detect (e.g., denote for scrutiny, etc.) relatively few kinds of information. Conversely, conventional compliance equipment can be optimized to detect a larger variety of information types, but may then be constrained in the volume and variety of traffic it can handle.
This dichotomy in optimizing compliance based traffic surveillance reflects a granularity issue with which conventional compliance surveillance must contend. To program compliance equipment on the basis of a large number of classifications could be a dauntingly complicated proposition, such as where traffic volumes are great. Typically, the parameters by which filtering is performed are few. However, such coarse granularity can unfortunately result in somewhat inflexible compliance equipment functionality in some instances.
To provide breadth and depth of coverage and some flexibility in the types of documents and other information, conventional compliance networking equipment typically relies on examination of content of network traffic based on keywords, regular expressions, phrases, terms, syntax and/or semantics. While compliance networking equipment based on syntactic/semantic detection can achieve remarkable breadth and depth of coverage, it may be prone to false positive detections.
False positives can occur because syntactic/semantic detection is based on parameter sets such as some number of keywords, regular expressions, binary signatures and the like. The number of false positives can be reduced, and accuracy thereby improved, by increasing the number of detection factors. However, increasing the number of detection factors can be computationally costly (e.g., in CPU processing, memory usage, etc.). With large amounts of information, such as an enterprise's document repositories, such computational costs can be significant.
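The trade-off between detection factors and false positives can be illustrated with a small sketch. It takes one possible reading of "increasing the number of detection factors", namely requiring more independent patterns to match before content is flagged. The pattern set, the names `matched_factors` and `is_flagged`, and the sample strings are assumptions made for illustration.

```python
import re

# Hypothetical detection factors: two regular expressions and one keyword.
PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_like": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    "keyword_confidential": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def matched_factors(text: str) -> set:
    """Names of all patterns found in the text; every pattern is evaluated."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


def is_flagged(text: str, min_factors: int) -> bool:
    """Flag the text only when at least min_factors patterns match."""
    return len(matched_factors(text)) >= min_factors


benign = "Part number 123-45-6789 is back in stock."
sensitive = "Confidential: patient SSN 123-45-6789"
```

With a single required factor, the benign part number is flagged because it merely looks like a social security number. Requiring two factors clears it while still flagging the sensitive string, at the cost of running every pattern over every document.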
Syntactic/semantic detection is performed at the application layer. To provide protection to a document, conventional equipment scans the document to detect syntactically and/or semantically interesting parameters therein. The document is reassembled and analyzed on the fly, e.g., as it is being sent as an email attachment. In order to detect content, all of the packets of the traffic are collected and the session effectively reassembled. The content is then reconstructed and decoded to understand context. The computational cost required to achieve this, e.g., with large traffic volume, is significant. In addition to the computational costs, where large amounts of information are to be protected, this can be inefficient and hinder traffic flow.
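The collect, reassemble and decode steps described above can be sketched in simplified form. The example below assumes each packet is a `(session_id, sequence_number, payload)` tuple with small integer sequence numbers, which is a deliberate simplification of real transport protocols; the function names are hypothetical.

```python
from collections import defaultdict


def reassemble(packets):
    """Group packets by session and join their payloads in sequence order.

    packets: iterable of (session_id, sequence_number, payload_bytes) tuples,
    possibly out of order and possibly containing retransmitted duplicates.
    """
    sessions = defaultdict(dict)
    for session_id, seq, payload in packets:
        # Keep the first copy seen for each sequence number.
        sessions[session_id].setdefault(seq, payload)
    return {
        sid: b"".join(chunks[seq] for seq in sorted(chunks))
        for sid, chunks in sessions.items()
    }


def decode_sessions(packets):
    """Reassemble every session, then decode its bytes to text for inspection."""
    return {
        sid: data.decode("utf-8", errors="replace")
        for sid, data in reassemble(packets).items()
    }
```

Even in this toy form, every packet of every session must be buffered before any content can be examined, which suggests why the memory and processing cost grows with traffic volume.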