U.S. Pat. No. 7,124,438, “Systems and methods for anomaly detection in patterns of monitored communications”, Paul Judge et al., Issue date: Oct. 17, 2006.
U.S. Pat. No. 6,507,866, “E-mail usage pattern detection”, Ronald Barchi, Issue date: Jan. 14, 2003.
U.S. Pat. No. 6,735,701, “Network policy management and effectiveness system”, Andrea M. Jacobson, Issue date: May 11, 2004.
U.S. patent application Ser. No. 11/347,463, “Method and a System for Outbound Content Security in Computer Networks”, Leonid Goldstein, Publication date: Aug. 23, 2007.
U.S. patent application Ser. No. 10/892,615, “Method and Apparatus for Creating an Information Security Policy Based on a Pre-configured Template”, Chris Jones et al., Publication date: Apr. 21, 2005.
U.S. patent application Ser. No. 11/485,537, “Methods and System for Information Leak Prevention”, Lidror Troyansky et al., Publication date: Feb. 1, 2007.
U.S. patent application No. PCT/US2006/005317, “Methods and Apparatus for Handling Messages containing Pre-selected data”, Vontu Inc., Publication date: Aug. 24, 2006.
U.S. patent application Ser. No. 11/173,941, “Message Profiling Systems and Methods”, Paul Judge et al., Publication date: Jan. 19, 2006.
U.S. patent application Ser. No. 11/284,666, “Adaptive System for Content Monitoring”, Ramanathan Jagadeesan et al., Publication date: Jun. 7, 2007.
U.S. patent application Ser. No. 10/780,252, “Method and Apparatus to detect Unauthorized Information Disclosure via Content Anomaly Detection”, Pratyush Moghe, Publication date: Apr. 28, 2005.
U.S. patent application Ser. No. 11/761,839, “Techniques for Creating Computer Generated Notes”, Mark Bobick and Carl Wimmer, Publication date: Jan. 24, 2008.
U.S. patent application Ser. No. 11/781,419, “Knowledge Discovery Agent System and Method”, Timothy W. Estes, Publication date: Jan. 17, 2008.
U.S. patent application Ser. No. 11/656,017, “Method and computer program product for converting ontologies into concept semantic networks”, Sherif A. Elfayoumy and Rengaswamy Mohan, Publication date: Aug. 16, 2007.
The prior art treats information leakage as a content inspection and detection problem. These techniques examine the content of e-mails and try to determine whether any sensitive information is being leaked. The prior art has also looked at pattern anomaly detection, but that too was approached from the content scanning perspective, using pre-defined regular expressions or keywords, pre-determined policies, or information derived from the number and frequency of mails between senders and recipients. Thus, the contents of outbound e-mails were read, and information about these contents was then used to identify information leakage. For example, if the mail content contained specific keywords, a leakage was detected. Alternatively, if a mail contained words of a kind not usually used by the sender, that mail was flagged as an anomaly.
In some cases, the sender and recipient information was used together with the time of sending and the frequency of mails. However, all of these techniques rely on word lists and key phrases, either pre-defined or found using frequency analysis.
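By way of illustration only, the content scanning approach described above can be sketched as follows. This is a minimal sketch and not taken from any of the cited references: the keyword list, the regular expression, the function names, and the minimum word length are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Hypothetical word list and pattern, chosen only for illustration.
SENSITIVE_KEYWORDS = {"confidential", "salary", "merger"}
SENSITIVE_PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]  # SSN-like number


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_leak(body):
    """Flag a mail whose body contains a listed keyword or matches a pattern."""
    words = set(tokenize(body))
    if words & SENSITIVE_KEYWORDS:
        return True
    return any(p.search(body) for p in SENSITIVE_PATTERNS)


def unusual_words(body, sender_history, min_len=5):
    """Return words in the body that never appear in the sender's earlier mails."""
    seen = Counter(w for mail in sender_history for w in tokenize(mail))
    return sorted({w for w in tokenize(body) if len(w) >= min_len and seen[w] == 0})
```

Both checks depend entirely on a word list that is either fixed in advance or derived from word frequencies in earlier mails, which is the reliance noted above.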
None of these techniques presents the user with a well-defined and friendly way of sifting through a set of candidate words to reach a desired level of accuracy. No existing invention utilizes user feedback in mail analysis to offer the user several alternative word schemes, each of which can achieve a chosen level of accuracy on a set of e-mails.