A corporate organization regularly employs the Internet to communicate with customers and vendors, to conduct research, and to perform various other tasks. The organization also creates and maintains confidential and sensitive information as part of the usual course of business. Even when the organization has policies in place to protect the transmission of sensitive or confidential information, there is no efficient way to monitor for compliance with these policies. Thus far it has proven difficult to prevent misuse or theft of valuable sensitive or proprietary information. Such information includes financial data, personal information, confidential documents, intellectual property, and customer lists.
Theft of proprietary information is one of the most costly security problems facing companies today. A recent study estimated that losses of proprietary information and intellectual property cost U.S. corporations in excess of $50 billion per year. New government regulations impose penalties for violating the privacy of customers' medical, financial, and personal information. Theft of financial data, customer lists, and intellectual property can reduce revenues, increase legal costs, and erode long-term competitive advantages.
One attempt to address this problem is the use of Access Control Lists (ACLs) to enable or disable access to a document based on user identification or privilege level. However, a user may be granted access to a document and then inappropriately e-mail the document to a non-privileged user outside the organization. In addition, the organization typically has business processes and infrastructure already in place, so any solution must have minimal impact on them.
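The ACL limitation described above can be illustrated with a minimal sketch. The document name, user names, and the functions `can_read` and `email_document` are hypothetical and chosen only for illustration; they do not represent any particular access control product.

```python
# Hypothetical ACL: document name -> set of users permitted to read it.
ACL = {"roadmap.doc": {"alice", "bob"}}


def can_read(user, document, acl=ACL):
    """Return True if the ACL grants this user access to the document."""
    return user in acl.get(document, set())


def email_document(sender, recipient, document, acl=ACL):
    """Simulate e-mailing a document.

    Only the sender is checked against the ACL. The recipient is never
    checked, which is the gap described in the text.
    """
    if not can_read(sender, document, acl):
        raise PermissionError(f"{sender} may not read {document}")
    return f"{document} sent from {sender} to {recipient}"


# The outside party has no access of their own...
print(can_read("mallory@outside.example", "roadmap.doc"))
# ...yet a privileged user can still forward the document to them.
print(email_document("alice", "mallory@outside.example", "roadmap.doc"))
```

In this sketch the access decision is made once, at read time, and nothing governs what happens to the document afterward.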
Additional complexities arise when identifying sensitive material. For example, it is difficult to individually mark each one of potentially thousands of documents as safe for external release or prohibited from transmission. Furthermore, such properties may change over time, as in the case of a datasheet. Initially the information in the datasheet is closely guarded and proprietary, but it may later be publicly released. After the public release, external transmission of the document is allowable.
Tracking information movement by filename or Uniform Resource Locator (URL) is also of limited use, as users may copy sensitive information into a location with a different identifier. It is the contents of these documents that must be protected. Even if only a portion of the sensitive information is released, the organization could be exposed to significant consequences.
Existing content-based approaches include keyword or key-phrase matching. However, this often results in false positives, i.e., identifying information as sensitive when in reality it is public. Blanket solutions that completely block external transmission of sensitive material to all destinations may be overly restrictive, as the organization may have remote locations accessible only via the Internet.
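The false-positive problem with keyword matching can be sketched as follows. The keyword list, the sample texts, and the function `flag_by_keyword` are assumptions made for illustration only; an actual keyword filter may differ in its matching rules.

```python
import re

# Hypothetical keyword list; a real deployment would choose its own terms.
SENSITIVE_KEYWORDS = {"confidential", "proprietary", "datasheet"}


def flag_by_keyword(text, keywords=SENSITIVE_KEYWORDS):
    """Return the sorted keywords that appear as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & set(keywords))


internal_draft = "Proprietary datasheet draft: do not distribute."
public_release = "Download the datasheet for our new product from our website."

print(flag_by_keyword(internal_draft))  # ['datasheet', 'proprietary']
print(flag_by_keyword(public_release))  # ['datasheet'], a false positive
```

Both texts are flagged because the matcher sees only the word "datasheet" and has no way to know that the second text has already been publicly released.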
Existing access control systems define who can see sensitive information, but they cannot control where the information goes once access is granted. Most organizations have little visibility into the actual transmission of sensitive information, and whether that information leaves internal networks for the outside world.
What is needed is an invention that addresses these shortcomings of the current art.