Modern businesses and industries rely heavily on the creation, storage, and transportation of digital documents and other kinds of digital files as a primary means of communication, information storage, and documentation. In many cases, the digital documents and files contain proprietary and/or confidential material. In other cases, digital items may contain sensitive, offensive, or provocative material. It is therefore imperative to filter the traffic of digital information effectively.
One of the most prevalent means of digital content filtering and screening is keyword-based or key-phrase-based filtering: the traffic of digital items is scanned in order to determine whether pre-defined keywords, key-phrases, numbers, etc. exist in the scanned item. If the item contains one or more of the previously stored keywords or key-phrases, a pre-defined policy is applied with respect to the distribution of the item (e.g., the transmission of the item is blocked). One of the main problems with this method is the management overhead: keywords may be supplied by various entities and departments within the organization (e.g., legal, financial, human resources, top-tier management, etc.). When the distribution of an item is blocked because of such a match, the administrator then needs to check the details of the event. Since keyword filtering may result in a high rate of false alarms, it places a significant management burden on the administrator, which may render the whole method impractical.
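The keyword scan described above can be illustrated with a minimal sketch. It is not taken from any particular product: the names `Verdict` and `scan_item`, the "block"/"allow" actions, and the whole-word, case-insensitive matching rule are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Hypothetical result of scanning one digital item."""
    action: str      # "block" or "allow" (assumed policy actions)
    matches: tuple   # stored keywords or key-phrases found in the item


def scan_item(text, keywords, action_on_match="block"):
    """Scan one item and return the policy verdict for its distribution."""
    found = []
    for kw in keywords:
        # Whole-word, case-insensitive match; re.escape keeps the keyword
        # literal, so phrases and numbers are matched as written.
        pattern = r"(?<!\w)" + re.escape(kw) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(kw)
    if found:
        # At least one stored keyword is present: apply the pre-defined policy.
        return Verdict(action_on_match, tuple(found))
    return Verdict("allow", ())


# Hypothetical keyword list, as might be supplied by several departments.
KEYWORDS = ["confidential", "merger agreement"]

hit = scan_item("Draft of the Merger Agreement attached.", KEYWORDS)
miss = scan_item("Lunch at noon?", KEYWORDS)
# A harmless message that still trips the filter: a false alarm.
false_alarm = scan_item("Nothing confidential here, just the menu.", KEYWORDS)
```

In this sketch the third message is blocked even though it discloses nothing, which is the kind of false alarm that the administrator would then have to review by hand.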
Prior art solutions rely on cumbersome manual procedures to overcome the problem, e.g., negotiating with the originator of the keyword about possibly removing the keyword or relaxing the policy.
There is thus a recognized need for, and it would be highly advantageous to have, a method and system that allow efficient management of keyword filtering and that overcome the drawbacks of current methods as described above.