Predicting advertisement click-through rates (CTR) is a massive-scale learning problem. Traditionally, click-through rates have been predicted through continuous supervision based on a Follow The (Proximally) Regularized Leader (FTRL-Proximal) online learning algorithm with per-coordinate learning rates.
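For illustration only, the following minimal Python sketch shows one common formulation of an FTRL-Proximal update with per-coordinate learning rates for logistic regression. It assumes sparse binary features identified by string names. The class name `FTRLProximal`, the feature names, and the hyperparameter defaults (`alpha`, `beta`, `l1`, `l2`) are hypothetical choices made for this sketch and do not describe any particular production system.

```python
import math
from collections import defaultdict


class FTRLProximal:
    """Illustrative FTRL-Proximal learner for logistic regression.

    Assumes every active feature has value 1 (sparse binary features).
    """

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # per-coordinate adjusted gradient sums
        self.n = defaultdict(float)  # per-coordinate sums of squared gradients

    def _weight(self, i):
        # Closed-form weight; L1 regularization keeps small coordinates at zero.
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        # The per-coordinate learning rate is alpha / (beta + sqrt(n_i)).
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, features):
        score = sum(self._weight(i) for i in set(features))
        score = max(min(score, 35.0), -35.0)  # guard against overflow in exp
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, clicked):
        """Apply one online update for a single impression (clicked is 0 or 1)."""
        p = self.predict(features)
        g = p - clicked  # log-loss gradient for each active binary feature
        for i in set(features):
            w = self._weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p


# Usage with made-up impressions: "ad=A" is always clicked, "ad=B" never is.
model = FTRLProximal()
for _ in range(50):
    model.update(["ad=A"], 1)
    model.update(["ad=B"], 0)
p_a = model.predict(["ad=A"])
p_b = model.predict(["ad=B"])
p_new = model.predict(["ad=never_seen"])
```

In this sketch a feature that has never been observed contributes a weight of zero, so an advertisement described only by unseen features receives the uninformed prediction of 0.5.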
Online advertising is a multi-billion dollar industry that has served as one of the great success stories for machine learning. Sponsored search advertising, contextual advertising, and real-time bidding auctions have all relied heavily on the ability of learned models to predict click-through rates of advertisements accurately, quickly, and reliably.
In practice, click-through rates may be relatively easy to predict for advertisements that have previously been displayed in online auctions, especially for advertisements that have been displayed many times and consequently have a substantial click history that can be collected. However, where an advertisement has minimal click history, its click-through rate may be difficult to estimate accurately. Moreover, for new advertisements, the click-through rate may be unknown to the online system. Accordingly, an online system must somehow predict click-through rates for advertisements with minimal or no click history. It is a challenge to predict click-through rates for such advertisements accurately enough to allow a search engine to display the most relevant advertisements and to price them correctly in an online auction. Given the large scale of search engine traffic, small errors in estimating this probability can result in substantial lost revenue and an adverse user experience.
In general, click-through rates are predicted by querying nodes of trie data structures. A trie data structure is a tree data structure for storing a set of strings; it turns the string set into a digital search tree. Several operations may be supported using the data structure, such as mapping the strings to integers, retrieving a string from the trie data structure, performing prefix searches, and many others. The trie data structure comprises one or more known value type nodes and one or more unknown value type nodes.
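By way of a minimal, non-limiting sketch, the trie operations mentioned above (mapping strings to integers, retrieving a stored string, and prefix search) might be expressed in Python as follows. The names `TrieNode`, `Trie`, `insert`, `get`, and `keys_with_prefix` and the example strings are assumptions made for this illustration only.

```python
class TrieNode:
    """One node of the digital search tree; children are keyed by character."""

    __slots__ = ("children", "value")

    def __init__(self):
        self.children = {}
        self.value = None  # integer mapped to the string ending here, if any


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        """Store key and map it to an integer value."""
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value

    def _find(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def get(self, key):
        """Return the integer mapped to key, or None if key is not stored."""
        node = self._find(key)
        return None if node is None else node.value

    def keys_with_prefix(self, prefix):
        """Return every stored string that starts with prefix, sorted."""
        start = self._find(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, path = stack.pop()
            if node.value is not None:
                found.append(path)
            for ch, child in node.children.items():
                stack.append((child, path + ch))
        return sorted(found)


# Usage with made-up strings.
trie = Trie()
trie.insert("car", 1)
trie.insert("cart", 2)
trie.insert("dog", 3)
```

Strings that share a prefix share the nodes along that prefix, which is why a prefix search only needs to walk the prefix once before collecting the subtree beneath it.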
Currently, combinations of values for the unknown value type nodes are considered. For each combination, the system queries the trie data structure with that combination together with the values of the known value type nodes, and selects the combination that yields the maximum value for its path in the trie data structure. This approach has several disadvantages. First, the system must query the trie data structure once for every combination of unknown value type nodes to obtain its associated value. Second, the best solution may be limited to the sampled combinations, because it may not be practical to traverse the trie data structure for all combinations. Third, the system may traverse the same path in the trie data structure multiple times for different sampled combinations and obtain the same value each time. Fourth, the system needs to maintain the combinations of unknown value type nodes.
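The combination-based approach described above can be sketched in Python as follows, purely as an illustration. The sketch assumes that the known values come first on each path, that a query returns the value of the deepest stored node along the queried path, and that the trie is held as nested dictionaries. The function names `build_trie`, `query`, and `best_combination` and all example values are hypothetical.

```python
from itertools import product


def build_trie(paths):
    """Build a nested-dictionary trie from {path_tuple: value}."""
    root = {"value": None, "children": {}}
    for path, value in paths.items():
        node = root
        for step in path:
            node = node["children"].setdefault(
                step, {"value": None, "children": {}}
            )
        node["value"] = value
    return root


def query(root, path):
    """Walk path; return (value, matched_path) of the deepest valued node."""
    node, best, matched = root, None, ()
    for depth, step in enumerate(path, start=1):
        node = node["children"].get(step)
        if node is None:
            break
        if node["value"] is not None:
            best, matched = node["value"], tuple(path[:depth])
    return best, matched


def best_combination(root, known, unknown_domains):
    """Query the trie once per combination of unknown values; keep the maximum."""
    queries, best_value, best_combo, matched_paths = 0, None, None, []
    for combo in product(*unknown_domains):  # every combination is materialized
        value, matched = query(root, tuple(known) + combo)
        queries += 1
        matched_paths.append(matched)
        if value is not None and (best_value is None or value > best_value):
            best_value, best_combo = value, combo
    return best_combo, best_value, queries, matched_paths


# Usage with made-up data: one known value and two unknown value types.
trie_root = build_trie({
    ("shoes", "top"): 0.05,
    ("shoes", "top", "image"): 0.08,
    ("shoes", "side"): 0.02,
})
combo, value, queries, matched_paths = best_combination(
    trie_root, known=("shoes",), unknown_domains=[("top", "side"), ("text", "image")]
)
```

With these example values the sketch issues one query per combination, four in total, and two of those queries resolve to the same stored path `("shoes", "side")`. The number of queries grows as the product of the sizes of the unknown value domains, which is why sampling becomes necessary and why the result may be limited to the sampled combinations.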
In light of the above discussion, there is a need for a method and system that overcome the above-stated problems.