1. Field of the Art
The present specification generally relates to the field of selecting promotions for display to customers via a computer network, such as the Internet. More specifically, in some cases the present specification relates to technology for selecting one or more promotions to be presented to online customers using Bayesian bandits.
2. Description of the Related Art
A developer of a website is often faced with the decision of which version of an advertisement (ad) to place on a webpage. Suppose there are two versions of the ad, a red version A and a blue version B. The developer wishes to place the version of the ad that will garner the most clicks, but does not know in advance which version that is. The traditional approach is to run an A/B test: each time the page is served, a random choice is made about whether to display the red version or the blue version of the ad, with a 50/50 chance of either version being displayed. The A/B test proceeds for a period of time, during which the click-through rate (CTR) of each version is measured. Once the A/B test is complete, the version of the ad with the higher measured CTR is displayed thereafter.
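By way of illustration only, the traditional A/B test described above may be sketched as the following simulation. The function name `run_ab_test`, the `true_ctr` mapping, and the fixed random seed are hypothetical and are introduced solely for this example; in practice the true click-through rates are unknown, and clicks are observed from real page views rather than simulated.

```python
import random

def run_ab_test(true_ctr, n_impressions, seed=0):
    """Simulate a uniform-random A/B test over the versions in true_ctr.

    true_ctr maps a version label to an assumed (hypothetical) click
    probability, used here only to simulate whether a click occurs.
    """
    rng = random.Random(seed)
    versions = sorted(true_ctr)
    shows = {v: 0 for v in versions}
    clicks = {v: 0 for v in versions}

    for _ in range(n_impressions):
        # Each page view displays one version chosen uniformly at random
        # (a 50/50 chance when there are two versions).
        v = rng.choice(versions)
        shows[v] += 1
        if rng.random() < true_ctr[v]:
            clicks[v] += 1

    # Measured CTR = clicks / impressions for each version.
    measured = {v: (clicks[v] / shows[v] if shows[v] else 0.0) for v in versions}
    # The version with the higher measured CTR is displayed thereafter.
    winner = max(versions, key=lambda v: measured[v])
    return winner, measured, shows

winner, measured, shows = run_ab_test({"A": 0.04, "B": 0.05}, n_impressions=2000)
```

When the assumed rates are close and the number of impressions is small, the measured winner can differ from the truly superior version, which is the difficulty discussed next.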
But questions may still remain: Was the A/B test run for a sufficient period of time to acquire enough data to make a confident measurement of the click-through rates of the ads? With insufficient data, random variation in the measured click-through rates can result in the inferior ad being chosen over the superior one, a decision that will negatively impact future performance. On the other hand, another question may arise: Was the A/B test run for too long a period? Because both the superior and the inferior ads are shown during the A/B test, each page served with the inferior ad can result in missed clicks. These missed clicks are what motivate the name “regret” that is given to performance measures of reinforcement learning algorithms. A further question may also be raised: What if the click-through rate varies over time? Suppose that the blue ad comes out on top in an A/B test that was run in June. If the developer runs the blue ad for the rest of the year, there may be opportunity costs associated with potentially higher click-through rates for the red ad in December. This leads to further questions: Should A/B tests be run periodically? If so, how often, and for how long? Existing advertisement selection procedures fail to provide optimal solutions to these questions.
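As a non-limiting illustration of the missed clicks referred to above, the expected regret of a 50/50 A/B test between two versions may be estimated as the number of impressions multiplied by the gap between the better click-through rate and the average click-through rate actually shown. The function name `expected_ab_regret` and the example rates are hypothetical, and the sketch assumes click-through rates that are known and constant over the test, which is not the case in practice.

```python
def expected_ab_regret(ctr_a, ctr_b, n_impressions):
    """Expected clicks missed during a 50/50 A/B test of two versions.

    Assumes (for illustration only) that the true click-through rates
    ctr_a and ctr_b are known and do not change during the test.
    """
    best = max(ctr_a, ctr_b)
    # Under a 50/50 split, the expected CTR per impression is the
    # simple average of the two rates.
    mean_shown = 0.5 * ctr_a + 0.5 * ctr_b
    # Regret is the shortfall relative to always showing the better ad.
    return n_impressions * (best - mean_shown)

missed = expected_ab_regret(0.04, 0.05, n_impressions=10_000)
```

With the assumed rates of 4% and 5%, a test of 10,000 impressions forgoes roughly 50 clicks in expectation, and the shortfall grows linearly with the length of the test.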