1. Field of the Invention
The present invention concerns a method for retrieving documents via an electronic data file, as well as a system for realizing this method.
In the first place, the invention is meant to be used in data systems in which electronic documents or parts of these documents are stored so as to be able to retrieve the required documents by means of specific search keys later on. In a more general way, however, the invention can be applied in any system whatsoever which includes electronic documents.
In particular, the invention aims at a method for indexing electronic documents in a special manner so as to be able to retrieve them later on by means of specific search keys.
2. Description of Related Art
Different methods for indexing and subsequently retrieving data from electronic documents are already known. Examples thereof are described among others in the patents U.S. Pat. No. 5,007,019, U.S. Pat. No. 5,371,807, U.S. Pat. No. 5,418,951, U.S. Pat. No. 5,555,408, WO 95/14973 and WO 96/23265.
In general, one could say that there are three methods for translating textual data into indexes. The first method is an automatic, ‘non-intelligent’ indexing method. According to this method, words are automatically retrieved from a text by means of an evaluation system and they are integrated in an index.
The second method is the manual ‘intelligent’ indexing method. This method usually makes use of predetermined coordinates. The person who indexes the documents assigns one or several labels to each document, on the basis of which the document can be retrieved later on.
The third method is the automatic ‘intelligent’ indexing method. Here, the documentation officer of the second method is replaced by an automatic system.
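By way of illustration only, the first, automatic ‘non-intelligent’ method can be sketched as follows. The sketch assumes that the "evaluation system" is a plain word tokenizer and that the index is a mapping from each word to the documents containing it; the function name build_word_index and the sample documents are hypothetical and are not taken from any of the cited patents.

```python
import re
from collections import defaultdict


def build_word_index(documents):
    """Map each lowercase word to the set of document ids that contain it.

    Assumption: a word is any run of letters or digits; no linguistic
    knowledge is applied, which is what makes the method 'non-intelligent'.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


# Hypothetical sample documents, for illustration only.
documents = {
    "doc1": "The pump drives the cooling circuit.",
    "doc2": "The cooling fan is optional.",
}
index = build_word_index(documents)
print(sorted(index["cooling"]))  # ['doc1', 'doc2']
```

Every word is indexed here, including words such as "the" which carry no meaning for retrieval, so that a sketch of this kind tends towards the exhaustive end of the range discussed hereafter.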
It is clear that the quality of a method, in terms of retrieving the right documents quickly later on, depends on the keys which are used for indexing. We mainly distinguish two basic keys.
A first basic key concerns the ‘exhaustibility’, by which is meant the extent to which the content of a specific document is entirely stored by means of the index. A second basic key is the specificity, which is determined in view of the precision with which searched documents can be retrieved.
It is clear that, in order to be able to retrieve the right documents in no time, a method is required which offers an ideal balance between the possibility of retrieving the documents on the one hand, and the precision with which they can be retrieved on the other hand. Exhaustive indexing creates situations in which a search for documents related to a specific subject returns a large number of documents, much of whose information is worthless. In this case, the retrieved documentation is said to contain a lot of ‘noise’. A high precision implies that only useful information is indexed, by assigning very precise labels to it.
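The balance described above can be made concrete with a small calculation. The following is a minimal sketch which assumes that precision is the fraction of retrieved documents that are useful and that ‘noise’ is the remaining fraction; these definitions, the function name precision_and_noise and the document ids are assumptions made for illustration and are not defined in this text.

```python
def precision_and_noise(retrieved, relevant):
    """Return (precision, noise) for one search result.

    Assumptions: precision = useful retrieved / all retrieved,
    noise = 1 - precision. An empty result yields (0.0, 0.0).
    """
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0, 0.0
    useful = retrieved & set(relevant)
    precision = len(useful) / len(retrieved)
    return precision, 1.0 - precision


# Hypothetical search: four documents returned, two of them useful.
precision, noise = precision_and_noise(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d4", "d9"],
)
print(precision, noise)  # 0.5 0.5
```

In this example the useful document "d9" is not retrieved at all, which shows the other side of the balance: a very precise index may miss documents that an exhaustive index would have returned.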
Further, it is known that one can index by what is called ‘single-term indexing’, whereby indexes are assigned to single terms, i.e. words, or by what is called ‘term relationship indexing’, whereby indexes are assigned which allow for relationships between different concepts.
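The difference between the two kinds of indexing can be sketched as follows. As a deliberately simple assumption, a "relationship" is taken here to mean that two terms occur in the same sentence; actual term relationship indexing may rely on richer relations. The names tokenize and build_indexes and the sample documents are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def tokenize(sentence):
    """Split a sentence into lowercase words (runs of letters or digits)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def build_indexes(documents):
    """Build a single-term index and a term-pair index.

    Assumption: two terms are 'related' when they share a sentence.
    Pairs are stored as alphabetically ordered tuples.
    """
    single_term = defaultdict(set)
    term_pairs = defaultdict(set)
    for doc_id, text in documents.items():
        for sentence in re.split(r"[.!?]", text):
            words = sorted(set(tokenize(sentence)))
            for word in words:
                single_term[word].add(doc_id)
            for pair in combinations(words, 2):
                term_pairs[pair].add(doc_id)
    return single_term, term_pairs


# Hypothetical sample documents, for illustration only.
documents = {
    "doc1": "The pump failed. The valve was replaced.",
    "doc2": "The pump valve failed.",
}
single_term, term_pairs = build_indexes(documents)
print(sorted(single_term["valve"]))           # ['doc1', 'doc2']
print(sorted(term_pairs[("failed", "valve")]))  # ['doc2']
```

A search on the single term "valve" returns both documents, whereas a search on the related pair "failed" and "valve" returns only the document in which the two terms actually occur together.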
The systems for electronic document management which are known until now are disadvantageous in that they are all mainly based on statistical formulas and do not make use of indexing methods based on knowledge, or in that, when use is made of indexing methods based on knowledge, the methods used thereby are not very efficient.
The present invention aims at a method for managing documents, in particular for storing and/or retrieving documents, which enables the end-user to obtain relevant information in a very efficient manner, by which is meant that the right documents can be retrieved with very great precision and without any considerable ‘noise’.
To this end, the invention first of all provides for a method for retrieving documents via an electronic data file, characterized in that keys are used for the retrieval which find one or several relations between the textual data of the documents concerned.