Many organizations develop confidential, trade secret, proprietary, and other information that is important to their successful operation. In many cases, it is very important for an organization to ensure that this information is not disclosed outside the organization. If such information is disclosed outside the organization, it may lose its value or its disclosure may result in substantial harm to the organization. For example, a manufacturing company may develop a list of features to be incorporated in the next version of a product. If a competitor is able to ascertain the list of features before the next version is released, then the competitor may be able to use the information to its competitive advantage. As another example, an organization may need to take an internal disciplinary action against an employee who has violated some rule of the organization. If the violation were to become public, it might present a public relations problem for the organization. To prevent improper disclosure of their confidential information, many organizations implement extensive measures. For example, some companies conduct training sessions with their employees to ensure that they understand the importance of maintaining the confidentiality of trade secrets, that they know to mark all documentation that contains trade secrets as confidential, and so on.
Although electronic communications have allowed employees of organizations to communicate effectively and productively, electronic communications have also allowed confidential information to be easily and rapidly disseminated outside organizations. For example, if a leader of a design team sends an electronic mail message itemizing the new features of the next version of a product to the members of the team, then any member of the team can forward the message to other employees of the company or even to the employees of a competitor. Such distribution of confidential information to an employee of a competitor could be inadvertent or intentional. For example, an employee may want to forward the electronic mail message itemizing the new features to several members of the company's marketing team. When forwarding the electronic mail message, the employee may enter the partial names of the intended recipients. However, if an intended recipient has a name similar to that of an employee of a competitor, the electronic mail program may resolve the partial name to the electronic mail address of the competitor's employee. Even though a disclosure may be inadvertent, the company can, nevertheless, be seriously harmed. It may be even more problematic when an employee intentionally forwards the electronic mail message with the confidential information to someone who is not authorized to receive such information. In such a case, the employee may try to mask the confidential nature of the information by, for example, removing notifications of confidentiality (e.g., “This document contains confidential, proprietary, and trade secret information of The Acme Company.”) from the electronic mail message. Moreover, unauthorized disclosure of confidential information is not limited to electronic mail messages; unauthorized disclosures can occur through other forms of electronic communication. For example, employees can disclose confidential information via Internet news and discussion groups, instant messaging systems, attachments to electronic mail messages, press releases, electronic presentations, published articles, and so on.
Some electronic mail systems have features that allow for the filtering of electronic mail messages to ensure that they do not contain inappropriate content. For example, such a system may scan outgoing messages for indications of confidential information such as the words “proprietary,” “confidential,” or “trade secret.” If such words are found in a message, then the system may prohibit the sending of the message. However, not all electronic mail messages that contain confidential information have such words. For example, employees on a design team may frequently send electronic mail messages to one another to get informal feedback on new ideas. In such cases, the electronic mail messages would not typically contain notices of confidentiality. In addition, an employee who intentionally wants to send confidential information to a competitor can easily avoid detection by such systems by removing such words from the message before forwarding it.
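The keyword-based filtering described above, and the ease with which it is avoided, can be illustrated with a minimal sketch. The function name `is_blocked`, the pattern `MARKER_PATTERN`, and the sample messages are hypothetical and chosen only for illustration; the three marker words are the ones named in the text, and an actual filtering system may work quite differently.

```python
import re

# Marker words taken from the text; a real filter's word list is an assumption.
MARKER_PATTERN = re.compile(
    r"\b(proprietary|confidential|trade\s+secret)\b",
    re.IGNORECASE,
)


def is_blocked(message_body: str) -> bool:
    """Return True if the outgoing message contains any marker word."""
    return MARKER_PATTERN.search(message_body) is not None


# A message that carries an explicit confidentiality notice.
original = (
    "Feature list for next version: offline sync.\n"
    "This document contains confidential, proprietary, and trade secret "
    "information of The Acme Company."
)

# The same content after the sender deletes the notice line.
stripped = original.split("\n")[0]

print(is_blocked(original))   # True: the notice triggers the filter
print(is_blocked(stripped))   # False: same feature list, no marker words
```

In this sketch the second message carries the same feature list as the first, yet passes the filter once the notice line is deleted, which is the weakness the preceding paragraph describes.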
It would be desirable to have a system that would be able to reliably detect the presence of confidential information in electronic mail messages and, more generally, in any outgoing communication (e.g., a publication, a news group posting, or an electronic mail attachment). In the case of an electronic mail message, such a system should be able to detect when an employee simply forwards an original electronic mail message without any modification, when the employee cuts and pastes portions of the original electronic mail message into a new electronic mail message, when the employee forwards portions of the original electronic mail message with additional comments, when the employee modifies the content of the original electronic mail message, and so on. Moreover, because of the volume of electronic mail messages that an organization may generate, it would be desirable for such a system to be able to rapidly detect such confidential information in electronic mail messages without significantly delaying delivery and without requiring a significant investment in additional hardware and software to support such detection.