The information and knowledge assets created and accumulated by organizations and businesses are of extreme value in the modern economic environment. As such, managing and keeping the information and the knowledge inside the organization, and restricting its distribution outside, is of paramount importance for almost any organization, government entity or business, and provides significant leverage over its value. Most of the information in modern organizations and businesses is represented in a digital format that can be easily distributed via digital communication networks. However, the promptness, convenience and availability of information offered by these digital networks are accompanied by a constant hazard of information leaks due to innocent mistakes, carelessness and malicious attempts to deliver non-public or otherwise confidential information to unauthorized entities. Information losses can cause anything from minor embarrassment to severe financial damage by enabling fraud and by causing loss of business secrets and the consequent loss of competitive advantage. In addition, such loss may expose the organization to legal sanctions and liabilities (e.g., under the US Gramm-Leach-Bliley Act, the US Sarbanes-Oxley Act, the US HIPAA privacy and security regulations, and Directive 95/46/EC of the European Parliament). In order to exploit the value of information and commercial knowledge to as large an extent as possible, whilst mitigating risks that stem from unauthorized dissemination of information, the information distribution needs to be carefully and skillfully managed.
Managing information distribution includes several aspects, such as:
Making the information explicitly available to authorized persons so that they can utilize the information in order to create value for the organization.
Assuring that the information remains intact, i.e., that the integrity of the information is conserved.
Restricting the information distribution to authorized persons only, i.e., maintaining the confidentiality of the information.
Tracking the information along its lifecycle, in order to obtain a clear understanding of the information flow and to allow for adequate information retention practice.
Information assets in organizations and businesses evolve dynamically following their creation. During the evolution process, information and knowledge are created, added and destroyed; formats change; names change; and so on. The process may have one, several or numerous contributors. Managing the information distribution along its lifecycle is therefore an involved task. In some cases, the information is relevant only within a limited time window, and the value of the information sharply decreases after some time. For example, information that is relevant for predicting the price of a certain commodity at a certain time becomes steadily less valuable as that time approaches. In other cases, the information represents accumulated knowledge. In this case, the merit of the information may even increase with time. This state of affairs further complicates the information-management task.
Methods that attempt to track digital information and manage information distribution exist. Some of these methods utilize file metadata, which may not be robust against changes in the file format. Other methods utilize keyword-based classification, which tends to be either over-exclusive or over-inclusive. Other methods restrict information usage and distribution to particular kinds of applications, commonly referred to as Digital Rights Management (DRM) applications. DRM applications have the disadvantage that they hamper normal workflow and require large to massive levels of investment. Still other methods consider the binary signature of the file, but this has the disadvantage of depending critically on the precise representation of the data.
The above methods thus do not provide an adequate solution to the problem faced by modern businesses. The large number of formats in which the same information can be represented, the large number of applications that can use the same information in different ways, the large number of kinds of storage in which the information can be kept, and the large number of types of information distribution channels tend to render any given method ineffective over a business environment taken as a whole. File metadata is often altered when the format of the file or the storage medium of the data is changed. A binary digital signature has zero tolerance for any change in the signed data, and keyword-based or key-phrase-based tracking covers only a very limited aspect of the problem.
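The zero-tolerance behavior of binary signatures can be illustrated with a minimal sketch. It assumes SHA-256 as the signature function and uses a made-up sample sentence; the helper name binary_signature and the sample text are hypothetical and are not taken from any of the methods discussed above.

```python
import hashlib


def binary_signature(data: bytes) -> str:
    """Return a hex digest of the raw bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical sample content; the same information in three representations.
text = "Quarterly revenue forecast: 4.2 million"
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16")           # same text, different encoding
with_space = (text + " ").encode("utf-8")  # one trailing space added

sig_utf8 = binary_signature(as_utf8)
sig_utf16 = binary_signature(as_utf16)
sig_space = binary_signature(with_space)

# The signature is reproducible only for byte-identical input.
print(sig_utf8 == binary_signature(text.encode("utf-8")))  # True
# Any change of representation yields an unrelated signature.
print(sig_utf8 == sig_utf16)  # False
print(sig_utf8 == sig_space)  # False
```

In this sketch the information content is unchanged in all three cases, yet only the byte-identical copy can be recognized by its signature.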
Methods for screening and filtering of digital content also exist and are widely used, for example, in order to allow censorship of offensive material (e.g., pornography). These methods lack the resolution needed for effective policy definition and enforcement, and tend to be over-exclusive or over-inclusive.
Methods that utilize sophisticated searching algorithms in databases and over the Internet also exist. These methods are optimized for information retrieval and for providing answers to specific queries, and, in general, cannot provide either for effective tracking of specific information items or for effective policy enforcement.
Another issue that further complicates the monitoring process is the so-called template document. In many cases, documents are derived from template-type source documents, for example standard contracts. In these cases, the ability to monitor and track the various documents that are derived from the same or a similar source template cannot be based on any naïve notion of resemblance between the documents. On the one hand, two documents that are derived from the same or a similar template may be very similar; on the other hand, the differentiating details, such as the names of the parties to the contract, may be of considerable importance. Tracking different derivatives of a template document is not adequately addressed by current methods.
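The template problem can be illustrated with a minimal sketch, assuming a naïve character-level resemblance measure (here Python's difflib.SequenceMatcher, chosen only for illustration). The template wording, the party names and the fee are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical standard-contract template with the details left open.
template = (
    "This agreement is made between {a} and {b} for the supply of "
    "services at a fee of {fee} per month."
)

# Two contracts derived from the same template, differing in one party.
doc_a = template.format(a="Acme Ltd", b="Bolt Inc", fee="1000")
doc_b = template.format(a="Acme Ltd", b="Crane LLC", fee="1000")

# Naive resemblance: ratio of matching characters, in the range [0, 1].
resemblance = SequenceMatcher(None, doc_a, doc_b).ratio()

# The details that actually distinguish the two documents (word level).
differing_words = sorted(set(doc_a.split()) ^ set(doc_b.split()))

print(f"resemblance: {resemblance:.2f}")
print("differing words:", differing_words)
```

A resemblance score alone would treat the two contracts as near duplicates, while the few differing words, which identify the other party, are precisely the details that matter.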
There is thus a recognized need for, and it would be highly advantageous to have, a method and system that allow information tracking and information distribution management along the information life cycle, which overcomes the drawbacks of current methods as described above.