Often, it can be desirable to know how the results produced by a particular system would have changed if that system had been developed using a different set of parameters. For example, a developer of a computer application program that enables a computing device to play chess may wish to know whether a particular strategy would have resulted in the computing device playing a better game of chess than the strategy that was actually implemented. Questions directed to how a particular result would have been different had a different set of circumstances existed are known as “counterfactual” considerations.
Within the context of the development of computer-executable instructions directed to one or more tasks, counterfactual considerations are typically evaluated by modifying the computer-executable instructions to account for a different set of circumstances, or to be based on a different set of parameters, and then observing the result of the execution of such modified computer-executable instructions. When developing computer-executable instructions that are directed to the performance of tasks guided by a particular user, such a mechanism for evaluating counterfactual considerations can be practical, as many different variants of computer-executable instructions can be generated in parallel and tested in parallel.
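By way of a non-limiting illustration, the evaluation described above can be sketched as executing a baseline variant and several modified variants of the same instructions on identical inputs and comparing the results. The following is a minimal sketch only: the function names `run_task` and `evaluate_counterfactuals`, the single `threshold` parameter, and the toy counting task are hypothetical and are not drawn from any particular system.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(threshold, inputs):
    """Hypothetical task: count the inputs at or above the threshold."""
    return sum(1 for x in inputs if x >= threshold)


def evaluate_counterfactuals(baseline, alternatives, inputs):
    """Run the baseline and each alternative parameter on the same inputs.

    Returns, for each alternative, the change in the result relative to
    the baseline, i.e. how the result "would have been different".
    """
    params = [baseline, *alternatives]
    # The variants are independent of one another, so they can be
    # executed in parallel; pool.map preserves the order of params.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda p: run_task(p, inputs), params))
    baseline_result = results[0]
    return {p: r - baseline_result for p, r in zip(alternatives, results[1:])}
```

Such a sketch is practical only because the inputs can be replayed identically against every variant, which is the property that is lost when the results depend on the behavior of large groups of human users.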
However, for computer-executable instructions directed to the performance of tasks involving a large group of individuals, such as the computer-executable instructions that provide services to many thousands of users via networks of interconnected computing devices, counterfactual considerations can be difficult to evaluate. For example, it can be difficult to properly simulate the behavior of large groups of human users, such as would traditionally make use of such computer-executable instructions. Similarly, exposing large groups of human users to different alternatives of computer-executable instructions, designed to evaluate one or more counterfactual considerations, risks alienating such human users due to, for example, possible suboptimal performance by one or more of those alternative computer-executable instructions.
One context within which counterfactual considerations often arise is the selection, and display, of advertisements through networks of interconnected computing devices. Traditionally, the selection of advertisements that are to be displayed to users communicating with a service being provided via networks of interconnected computing devices is based on models that seek to predict how useful the displayed advertisements are going to be to the particular users to which they are being displayed and, as a result, how likely those users are to select such advertisements, thereby generating revenue for the entity displaying the advertisements. Since the behavior of such a large and diverse pool of users can be difficult to accurately model, traditional mechanisms for evaluating alternative approaches to the selection of advertisements to be displayed to particular users rely on testing such alternative approaches on a small sample of actual users. However, as will be recognized by those skilled in the art, such traditional mechanisms are inefficient, both in that they can require substantial setup effort, and in that they can require an extended duration to collect sufficient results, which can, themselves, be too varied to derive meaningful data from. Additionally, such traditional mechanisms can limit the number of counterfactual considerations that can be tested due simply to a limit on the number of users, and visits, via which to test such counterfactual considerations.