This disclosure relates generally to computer processes for selecting content items for users, and more specifically to using feedback control on parameters of a content selection process to meet latency and CPU utilization targets for that content selection process.
Presenting a user with content items that are relevant to the user increases both revenue for the online system and user enjoyment of and engagement with the online system. Conventionally, online systems use a content selection system that applies targeting or filtering rules to various content items for selecting content items to present to a user. For example, the content selection system identifies content items for which a user is in a defined target audience and then ranks these eligible content items in a content selection process, such as an online auction. The content selection system selects content items for presentation to the viewing user.
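As a minimal sketch of the conventional filter-then-rank flow described above, the following Python example filters content items by a targeting rule, ranks the eligible items by a score, and selects the top-ranked items. The names `ContentItem`, `target_countries`, `bid`, and `select_content`, as well as the use of a country as the targeting criterion and a bid as the ranking score, are hypothetical and chosen only for illustration; an actual content selection system may use different targeting rules and scoring models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    item_id: str
    target_countries: frozenset  # hypothetical targeting rule
    bid: float                   # hypothetical ranking input


def select_content(items, user_country, score, num_slots):
    """Filter items by targeting, rank by score, and return the top items."""
    # Keep only items whose target audience includes the user.
    eligible = [it for it in items if user_country in it.target_countries]
    # Rank eligible items from highest to lowest score.
    ranked = sorted(eligible, key=score, reverse=True)
    # Select as many items as there are presentation slots.
    return ranked[:num_slots]


items = [
    ContentItem("a", frozenset({"US", "CA"}), 1.0),
    ContentItem("b", frozenset({"US"}), 3.0),
    ContentItem("c", frozenset({"FR"}), 5.0),
]
selected = select_content(items, "US", score=lambda it: it.bid, num_slots=1)
```

In this sketch, item "c" is excluded by the targeting rule even though it has the highest bid, and item "b" is selected over item "a" because it ranks higher among the eligible items.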
This selection process defines a latency period, which begins when an opportunity to present content to a user is identified by the online system and ends when the selected content items are sent for presentation to the user. The duration of the latency period depends on one or more content selection parameters, such as the number of content items to be ranked and the complexity of the models used to score each of the content items. The latency period may also depend on changes in traffic on the online system, e.g., traffic caused by requests to present content to other users of the online system. Various factors may affect the traffic on the online system, such as the time of day, the occurrence of events (including unexpected events), or other factors causing fluctuations in the demand placed on computing resources of the online system.
While the quality of content selected for presentation generally increases as the number of content items evaluated by the online system's content selection process increases and as the complexity of the models used to score the content items increases, the latency period and/or CPU utilization required to complete the selection process also increase because the computing resources of the online system are limited. Because limited computing resources evaluate the content items, increasing the number of content items evaluated causes a decline in system performance, which may result in a longer latency period, system delays, and possible network time-outs. Conversely, if the online system evaluates a small number of content items, the latency period is shorter and system performance may improve, but the quality of the content items presented to users likely declines. Balancing latency against content quality is therefore important for optimizing user engagement and revenue for an online system.
One method of processing a large number of content items without a long latency period is to use parallel computing to evaluate multiple content items simultaneously. In parallel computing, the rate at which content items are evaluated is determined by the number of threads. To maintain a target latency period, the number of parallel threads can be adjusted based on variations in the number of content items that are evaluated. For example, if the number of content items being evaluated increases and the latency period approaches the network time-out, the number of threads may be increased so that the content items are evaluated within a threshold of the target latency period. However, when the number of threads increases, the CPU utilization increases because more computation is performed at the same time. In this method, the latency period and the CPU utilization are coupled together and are not controlled independently. For example, if the latency period needs to be increased but the CPU utilization needs to be decreased, there may be conflicting controls for the number of content items to be evaluated.
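The coupling between the latency period and CPU utilization in this thread-based method can be illustrated with a simplified model. The following Python sketch assumes an idealized system in which every content item takes the same time to evaluate, work divides evenly across threads, and each thread keeps one core busy; the functions `threads_for_target` and `estimate` and all numeric values are hypothetical and are not taken from any particular system.

```python
import math


def threads_for_target(num_items, per_item_ms, target_latency_ms, max_threads):
    """Smallest thread count whose idealized latency meets the target (capped)."""
    needed = math.ceil(num_items * per_item_ms / target_latency_ms)
    return max(1, min(needed, max_threads))


def estimate(num_items, per_item_ms, threads, num_cores):
    """Idealized latency (ms) and CPU utilization (fraction of cores busy)."""
    # Assumption: work splits evenly across threads with no overhead.
    latency_ms = num_items * per_item_ms / threads
    # Assumption: each thread fully occupies one core, up to the core count.
    cpu_utilization = min(threads, num_cores) / num_cores
    return latency_ms, cpu_utilization


# Doubling the number of content items while holding the target latency fixed.
for num_items in (1000, 2000):
    t = threads_for_target(num_items, per_item_ms=0.5,
                           target_latency_ms=50, max_threads=32)
    latency, cpu = estimate(num_items, per_item_ms=0.5, threads=t, num_cores=32)
    print(num_items, t, latency, cpu)
```

Under these assumptions, holding the latency period at 50 ms while the number of content items doubles requires doubling the number of threads, which doubles the CPU utilization. The single control (the number of threads) moves both quantities at once, so neither can be set independently of the other.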