A search engine is a computer program or a set of programs used to index information and search for indexed information. One of the tasks performed by a search engine is the computation of the relevance of objects that are searchable by the search engine. Relevance is a complex topic, and includes many variables and methods for assigning relevance scores. Within this relevance computation, one of the key elements is assessing contribution to relevance based on properties whose importance may decline over time.
Consider an example. A news article is posted on a social web site. One hundred (100) readers “like” the news article, and three (3) other readers “dislike” the article. Using feedback mechanisms on the web site, they make their likes and dislikes known. The 100 “likes” and 3 “dislikes” now become useful information for search relevance. If another reader searches for information, the search engine can consider that there is a net difference of 97 positive reactions to this article, and adjust the relevance of the article to rank higher in the search results.
The web site is topical, however. The importance of the likes and dislikes needs to be discounted over time. If no additional reactions are recorded, then a month later the net 97 likes should be given less consideration in scoring the search relevance. However, if readers continue to express strong “likes” for the article, then continued high relevance is appropriate.
Another example is frequency of use. In a document management system, a document that is accessed many times is more likely to be relevant in search results than a document that has rarely been accessed, or that has been accessed only rarely in recent weeks. From a search relevance perspective, each access should increase the relevance of the document, but that relevance should gradually drop over time if the document is no longer accessed. The recent frequency of document access becomes an important factor for the search engine attached to the document management system in determining the relevance of the document.
Search engines today are capable of incorporating numeric scoring modifiers into their relevance computations. The challenges that arise are related to keeping these modifiers current, given that they are expected to change over time.
As an example, assume that a modifier should be devalued by 1 every day until it reaches a value of 0. One way of implementing this process is for a controlling application to keep track of the modifier, reducing it by 1 each day, and issuing a transaction to the search engine every day for every object that needs to have the modifier changed. A transaction is an operation or set of operations. Each transaction in this case is atomic by nature, which means that either all of the operations in the transaction occur, or none of them occur. An application can perform multiple operations and calculations in a single transaction. One example of a controlling application may be document management software. With this approach, each such controlling application may issue a very large number of transactions to a search engine, impacting the performance of the search engine. This approach also places a part of the burden of computing modifier values on each controlling application.
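The transaction load of this approach can be illustrated with a minimal Python sketch. The names `FakeSearchEngine`, `update_modifier`, and `run_daily_decay` are hypothetical and do not correspond to any particular search engine or controlling application; the sketch only assumes that each modifier change costs one transaction.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSearchEngine:
    """Hypothetical stand-in for a search engine; it only counts transactions."""
    modifiers: dict = field(default_factory=dict)
    transaction_count: int = 0

    def update_modifier(self, object_id, value):
        # Assumption: each modifier change is issued as one atomic transaction.
        self.modifiers[object_id] = value
        self.transaction_count += 1


def run_daily_decay(engine, tracked):
    """Reduce every non-zero tracked modifier by 1 and push each change."""
    for object_id, value in tracked.items():
        if value > 0:
            tracked[object_id] = value - 1
            engine.update_modifier(object_id, tracked[object_id])


# The controlling application keeps its own copy of every modifier.
engine = FakeSearchEngine()
tracked = {"article-1": 97, "article-2": 3}

# Simulate five daily runs.
for _ in range(5):
    run_daily_decay(engine, tracked)
```

In this sketch, two objects already generate eight transactions over five days (five for the first object, three for the second before it reaches 0). The count grows with the number of objects multiplied by the number of days each modifier remains above 0.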
Another approach would be to implement a “last changed” time and date associated with each modifier in the search index. The relevance computation could then devalue the modifier based on the difference between the last changed time and the current time. A problem with this approach is that storing the additional time information increases the memory requirements in the search engine, and the additional computation steps during the relevance ranking process can reduce the speed of evaluating search queries. If the controlling application wants to update the modifier, it may not have access to the current “effective” value of the modifier that is hidden within the search engine's relevance computation code.
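A minimal Python sketch of this “last changed” approach follows. The source does not specify a decay function, so the sketch assumes the same linear devaluation of 1 per day used in the earlier example; `StoredModifier` and `effective_value` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StoredModifier:
    value: int               # value recorded at the time of the last change
    last_changed: datetime   # the extra per-modifier storage this approach needs


def effective_value(mod, now, decay_per_day=1):
    """Devalue the stored value by elapsed whole days, never going below 0.

    Assumption: linear decay of `decay_per_day` per whole day elapsed.
    """
    elapsed_days = max(0, (now - mod.last_changed).days)
    return max(0, mod.value - decay_per_day * elapsed_days)


stored = StoredModifier(value=97, last_changed=datetime(2024, 1, 1))

# Computed at query time, once for every candidate object being ranked.
after_30_days = effective_value(stored, datetime(2024, 1, 31))
after_200_days = effective_value(stored, stored.last_changed + timedelta(days=200))
```

The sketch shows both costs described above: every modifier carries an additional timestamp, and the subtraction runs during ranking. It also shows why the controlling application cannot easily learn the effective value: only `value` and `last_changed` are stored, while the devalued result exists only inside the relevance computation.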
Given the deficiencies in conventional search engines, there is room for innovations and improvements.