Documents are generated at a far greater rate today than ever before. The proliferation of documents is due primarily to increased access to information technology and is unlikely to cease.
This proliferation of documents may become a large burden on the average consumer of information. One of the main challenges is gaining access to the information that is actually needed. Therefore, significant efforts are being made in the fields of information retrieval and information extraction.
The goal of information retrieval is to make document databases searchable. Information extraction focuses on identifying predetermined types of information as new documents are entered into a document collection. One application of information retrieval and extraction is document matching. For example, a new entry by a forum user can be matched to existing entries that may contain relevant information for that user.
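The forum example can be illustrated with a minimal sketch. It assumes a simple bag-of-words representation and cosine similarity over word counts, which the text above does not prescribe; the function names `tokenize`, `cosine_similarity`, and `match_entries`, and the `top_k` parameter, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    # Counter returns 0 for missing terms, so unshared words contribute nothing.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_entries(new_entry, existing_entries, top_k=3):
    """Return up to top_k (index, score) pairs for the existing entries
    most similar to new_entry, best match first. Entries that share no
    words with new_entry are omitted. (Assumed scoring scheme.)"""
    query = Counter(tokenize(new_entry))
    scored = [
        (cosine_similarity(query, Counter(tokenize(entry))), index)
        for index, entry in enumerate(existing_entries)
    ]
    # Sort by descending score; ties are broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(index, score) for score, index in scored[:top_k] if score > 0]
```

Calling `match_entries("forgot my router password", entries)` on a list of existing forum entries would rank entries that share more words with the new entry higher. A practical system would likely add term weighting, stemming, or other refinements that this sketch leaves out.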