Most current enterprise search engines are based on an inverted index architecture. During full-text indexing, the inverted index stores a mapping from each token to the positions at which that token occurs in a document or a document set. The term “token” as used herein includes at least one character in the document or the document set, for example, a letter, a word, a phrase or the like. At search time, all the documents containing the queried token are retrieved.
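By way of illustration only, the mapping described above can be sketched as follows. This is a minimal sketch and not the implementation of any particular search engine; it assumes that tokens are lowercase, whitespace-separated words, and the names `build_inverted_index`, `search`, and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each token to {doc_id: [positions of the token in that document]}.

    Assumption: a token is a lowercase, whitespace-separated word.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, token in enumerate(text.lower().split()):
            index[token].setdefault(doc_id, []).append(position)
    return index


def search(index, token):
    """Return the sorted ids of all documents that contain the token."""
    return sorted(index.get(token.lower(), {}))


# Hypothetical sample document set.
docs = {
    "d1": "enterprise search engine",
    "d2": "search index for enterprise search",
}
index = build_inverted_index(docs)
matches = search(index, "search")
```

In this sketch, a lookup for a token returns every document in which the token appears, together with the stored positions, without regard to where in the document the token occurs.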
However, the biggest problems with this kind of search are its efficiency and precision. It is to be understood that, for each token, there may be a large number of documents containing the token. Conversely, within one document, a token may appear several times. A conventional full-text search engine assigns the same weight to all regions of a document. This greatly reduces search efficiency and accuracy, since a token appearing in an important component of a document (for example, a title, an abstract, and/or keyword(s)) usually represents the content of the document, yet such an occurrence is treated no differently from an occurrence anywhere else in the document.
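The effect of uniform weighting can be illustrated with a short sketch. It assumes, purely for illustration, that a document is represented as a mapping from region names to text and that relevance is a simple occurrence count; the function `uniform_score` and the two sample documents are hypothetical.

```python
def uniform_score(doc, token):
    """Count occurrences of the token, weighting every region equally.

    Assumption: `doc` maps a region name (e.g. "title", "body") to its text,
    and tokens are lowercase, whitespace-separated words.
    """
    token = token.lower()
    return sum(text.lower().split().count(token) for text in doc.values())


# Hypothetical documents: in doc_a the token appears in the title,
# in doc_b it appears only in passing in the body.
doc_a = {"title": "Inverted index design", "body": "This note covers storage layouts."}
doc_b = {"title": "Meeting notes", "body": "Someone mentioned an index briefly."}

score_a = uniform_score(doc_a, "index")
score_b = uniform_score(doc_b, "index")
```

Under these assumptions both documents receive the same score, although the token appears in the title of the first document and only in the body of the second.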