1. Field of Art
The present invention generally relates to the field of search engine technologies with application, for example, to the area of data leakage prevention.
2. Description of the Related Art
Conventional Search Engines
In general, an enterprise search engine is a software system that searches for documents relevant to a given query statement. The enterprise search engine typically consists of a crawler, an indexer, a searcher, and a query engine. The crawler gathers documents from pre-assigned locations and stores them in document repositories. The indexer fetches documents from the document repositories, creates indices from the documents, and stores the indices in an index database. The searcher searches the index database and returns a list of relevant documents (referenced as “hits”) in response to a specific query. The query engine parses a query expression provided by a user and sends query commands to the searcher for processing.
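By way of illustration only, the four components described above may be sketched as follows. This is a minimal, hypothetical sketch and not the implementation of any particular search engine: the function names (crawl, build_index, parse_query, search), the in-memory dictionaries standing in for the document repository and index database, and the assumption that a query matches a document only when every query term appears in it are all choices made for this example.

```python
import re
from collections import defaultdict


def crawl(sources):
    """Crawler: gather documents from pre-assigned locations into a repository.

    Each source is assumed (for this sketch) to be a dict of doc_id -> text.
    """
    repository = {}
    for source in sources:
        repository.update(source)
    return repository


def build_index(repository):
    """Indexer: build an inverted index mapping each term to its documents."""
    index = defaultdict(set)
    for doc_id, text in repository.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return index


def parse_query(expression):
    """Query engine: parse a user's query expression into lowercase terms."""
    return re.findall(r"\w+", expression.lower())


def search(index, terms):
    """Searcher: return the sorted ids of documents containing every term."""
    if not terms:
        return []
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return sorted(hits)


# Hypothetical document sources, used only to exercise the sketch.
sources = [
    {"doc1": "Quarterly sales report", "doc2": "Sales forecast draft"},
    {"doc3": "Engineering design notes"},
]
repository = crawl(sources)
index = build_index(repository)
hits = search(index, parse_query("sales report"))
print(hits)  # ['doc1']
```

In this sketch the index is consulted rather than the documents themselves, which reflects the division of labor between the indexer and the searcher described above.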
Consider, for example, the conventional search system 100 that is depicted in FIG. 1. The conventional search system 100 may fetch documents from one or more document sources 105(a-n) and store them in a document repository 110. The documents from the document sources 105(a-n) are indexed by a search engine 120, and the indexed documents 122 are stored in an index database 124.
Subsequently, a user 150 seeking information may use a query composer 130 to compose a query to search documents 126 in the search engine 120. The search engine 120 may then conduct the search against the indexed documents 122 in the index database 124. When one or more matches (i.e., “hits”) corresponding to the query are found, the search engine 120 returns the matching indexed documents as search results 135 that are presented to the user 150.