Many conventional click modeling techniques attempt to interpret users' click data in order to predict their clicking behavior. Click prediction models used for this purpose are typically built from search engine click-through logs that cover large numbers of users. Such models are often used for applications such as web search ranking, ad click-through rate (CTR) prediction, personalized click-based recommendation systems, etc.
One of the main foci of click prediction models is query-based click prediction, which attempts to compute the probability that a given document in a search-result page is clicked on after a user enters some query. The intent of such techniques is generally to learn user-perceived relevance for query-document pairs. In most click prediction models, dwell time is used as an intrinsic relevance signal. Other factors that are often considered include position bias (where links in higher or more prominent positions on a page are generally more likely to be clicked on, even if such links tend to be less relevant to the user), revisiting behaviors, post-click behaviors, freshness for news searches, etc.
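As an illustration of how position bias can be separated from relevance, the following minimal sketch uses a simple position-based assumption: the probability of a click is the probability that the user examines a given rank multiplied by the relevance of the document to the query. The examination probabilities, the log format, and the function names are hypothetical and chosen only for illustration; they are not taken from any particular click model discussed above.

```python
from collections import defaultdict

# Hypothetical per-rank examination probabilities (assumed for illustration).
EXAMINATION_PROB = {1: 1.0, 2: 0.6, 3: 0.4}


def estimate_relevance(log):
    """Estimate relevance for each (query, doc) pair from a click log.

    `log` is an iterable of (query, doc, rank, clicked) tuples. Each
    impression is weighted by how likely the rank was to be examined, so a
    click at a low rank counts for more than a click at rank 1.
    """
    clicks = defaultdict(float)
    exposure = defaultdict(float)
    for query, doc, rank, clicked in log:
        key = (query, doc)
        clicks[key] += 1.0 if clicked else 0.0
        exposure[key] += EXAMINATION_PROB[rank]
    # Cap at 1.0 because the ratio is interpreted as a probability.
    return {key: min(1.0, clicks[key] / exposure[key]) for key in exposure}


def click_probability(relevance, query, doc, rank):
    """P(click) = P(examined at rank) * P(relevant | query, doc)."""
    return EXAMINATION_PROB[rank] * relevance.get((query, doc), 0.0)
```

Under these assumptions, a document shown at rank 3 with a raw CTR of 0.25 can receive a higher relevance estimate than a document shown at rank 1 with a raw CTR of 0.5, since the lower position is examined less often.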
Click prediction models are frequently used for online advertising and sponsored searches (i.e., search results with recommended ads). In such cases, ads are typically ranked according to their likelihood of being relevant to the user and their likelihood of generating high revenue, both of which are highly correlated with the probability of a user click. Consequently, accurate click prediction is highly correlated with the success of sponsored searches and ad placements. Modeling approaches used for such purposes include ad click prediction using historical CTR, using ads related to the content of documents being viewed by the user, exploiting mutual influence between ads, relational click prediction for multiple ads on a page, modeling positional bias in sponsored searches, using multimedia features in ad click prediction, etc. Related approaches use personalized click models that combine user CTR and browsing metadata (e.g., demographics) to improve personalized and sponsored search. Unfortunately, since such methods tend to rely on ad query and query-augmented metadata, the techniques used in ad click prediction and sponsored search are not directly applicable to determining what a user will be interested in relative to arbitrary documents.
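One simple way to combine click likelihood and revenue when ranking ads is to order them by expected revenue per impression, i.e., predicted CTR multiplied by the amount paid per click. The sketch below assumes this scoring rule; the `Ad` class, its fields, and `rank_ads` are hypothetical names used only for illustration, and real systems typically apply additional quality and auction adjustments.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    ad_id: str
    predicted_ctr: float  # model-estimated probability of a click
    bid: float            # assumed revenue per click


def rank_ads(ads):
    """Sort ads by expected revenue per impression (predicted_ctr * bid)."""
    return sorted(ads, key=lambda ad: ad.predicted_ctr * ad.bid, reverse=True)
```

With this rule, an ad with a low predicted CTR but a high bid can outrank an ad that is clicked more often, which is why errors in the CTR estimate translate directly into ranking and revenue errors.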
Similarly, contextual advertising places ads within the content of generic third-party web pages. This type of placement is usually performed by a commercial intermediary (an ad network) that attempts to optimize ad selection in a manner that both increases ad revenue and improves user experience. Various studies have shown that the success of such techniques is closely related to accurate click prediction. Ad representation with word/phrase vectors has also been shown to work well. Various extensions to such techniques include the use of models that combine click feedback, forecasting ad impressions, etc. Unfortunately, these types of models tend to rely on matching page content with the subject of the ad, and are thus unable to predict browsing transitions from arbitrary pages.
The majority of the click prediction techniques summarized above employ probabilistic models based on historical CTR. A related approach employs statistical machine translation to learn semantic translations of queries to document titles. Similarly, another related approach uses probabilistic models for discovering entity classes from query logs to identify latent intents in entity-centric searches. Unfortunately, such models apply to individual page/query pairs and associated metadata and are thus unable to predict browsing transitions from arbitrary pages.
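As a minimal sketch of what a probabilistic estimate based on historical CTR can look like, the following function smooths the observed click rate toward a prior, so that page/query pairs with few impressions do not receive extreme estimates. The function name, the prior CTR of 0.05, and the prior strength of 20 pseudo-impressions are assumptions made for illustration only.

```python
def smoothed_ctr(clicks, impressions, prior_ctr=0.05, prior_strength=20.0):
    """Historical CTR estimate shrunk toward a prior.

    The prior acts like `prior_strength` pseudo-impressions with click rate
    `prior_ctr`. With no data the estimate equals the prior; with many
    impressions it approaches clicks / impressions.
    """
    return (clicks + prior_ctr * prior_strength) / (impressions + prior_strength)
```

Because the estimate is keyed on counts for a specific page/query pair, it offers nothing for pairs absent from the logs beyond the prior itself, which reflects the limitation noted above.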