It has become commonplace to use computer systems to search large collections of content. For example, a user may submit a search request to a search facility and, in response, receive a search result set corresponding to related content in a content collection. Such search result sets can be relatively large, and the user may review only a portion of the search result set. If the content most relevant to the search request is not discoverable by the user, the user can become frustrated and even abandon the search. In a commercial context, such frustration can have significant financial consequences for the service provider. One technique that attempts to minimize the user's frustration is to display the higher ranked search results first so that the search results of interest are likely to be discovered by the user.
It is not uncommon for search result ranking to take into account historical behavioral data with respect to corresponding content. For example, historical behavioral data may include previous searches for the content and actions taken with respect to the content, such as accessing the content and engaging in transactions (e.g., financial transactions) with respect to the content. Ranking techniques that account for historical behavioral data (e.g., that use history-dependent ranking components) can be sophisticated. For example, the behavioral data may be decayed so that more recent behavioral data is weighted more heavily than older behavioral data. Various data filtering and windowing techniques, such as moving averages, may also be employed. However, content recently added to the content collection (“recent content”) may be disadvantaged with respect to search result rank relative to content already existing in the content collection (“established content”). For example, some conventional search facilities determine search result rankings based in part on a decayed moving average of sales. In this example, recent content associated with the same rate of sales as established content can rank lower in the search results relative to the established content even though the user may be just as interested in the recent content. As a further example, some conventional search facilities determine search result rankings based in part on decayed counts of clicks of links to content presented in previous search results (“click-throughs”). In this further example, recent content may be disadvantaged (e.g., rank lower) because it has had less of a chance to accumulate click-throughs.
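The recent content disadvantage described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the history-dependent ranking component is an exponentially decayed sum of daily sales. The function name `decayed_score`, the decay factor of 0.9, and the sales figures are hypothetical and are not taken from any particular search facility.

```python
def decayed_score(daily_sales, decay=0.9):
    """Exponentially decayed sum of daily sales (illustrative assumption).

    daily_sales is ordered oldest first; each elapsed day multiplies the
    accumulated score by `decay`, so recent sales are weighted more heavily.
    """
    score = 0.0
    for sales in daily_sales:
        score = score * decay + sales
    return score


# Both items sell at the same rate (10 units per day), but the established
# item has 100 days of history while the recent item has only 5.
established = decayed_score([10] * 100)
recent = decayed_score([10] * 5)

# The recent item scores lower despite the identical rate of sales.
print(recent < established)
```

Under these assumptions the recent item receives a lower score only because it has had less time to accumulate history, not because it is selling less well.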
Some conventional search facilities attempt to compensate for the recent content disadvantage with respect to historical behavioral data by “seeding” some history-dependent ranking component values. For example, a search facility administrator may manually estimate what the history-dependent ranking component value will be once the recent content becomes established. However, manually estimating such ranking component seed values can be labor intensive, subjective and/or inaccurate. Some conventional search facilities attempt to address these issues by using content category averages as ranking component seed values. However, this ranking component seed value estimation can be inaccurate and/or unsatisfactory. Rank and/or ranking component value overestimation can be as detrimental to the user search experience as underestimation. Furthermore, the response of ranking component values over time to seed values can be unsatisfactory, and even unsuitable.
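As a hedged illustration of seeding, the following sketch starts a decayed score from a seed value, such as a content category average, rather than from zero. The exponentially decayed sum, the function name `seeded_score`, and all numeric values are assumptions made for illustration only.

```python
def seeded_score(daily_sales, seed, decay=0.9):
    """Decayed sum of daily sales that starts from a seed value.

    The seed (e.g., a hypothetical category average) stands in for the
    history the recent content has not yet accumulated. Its influence
    shrinks by a factor of `decay` with each elapsed day.
    """
    score = seed
    for sales in daily_sales:
        score = score * decay + sales
    return score


# Hypothetical item that actually sells 5 units per day. With decay 0.9 its
# long-run score settles near 5 / (1 - 0.9) = 50.
category_average_seed = 100.0  # assumed seed, twice the item's long-run score

after_3_days = seeded_score([5] * 3, category_average_seed)
after_60_days = seeded_score([5] * 60, category_average_seed)

# Early on the seed dominates and the item is overestimated; the score
# approaches the long-run value only as the seed decays away.
print(after_3_days > after_60_days > 50.0)
```

In this sketch an inaccurate seed continues to distort the score for many days, which is one way the response of a ranking component value to its seed value can be unsatisfactory.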
The same numbers are used throughout the disclosure and figures to reference like components and features. Such repetition of numbers is for simplicity of explanation and understanding, and should not be viewed as a limitation on the various embodiments.