Traditional machine learning techniques rely on human annotators to apply labels to training data. However, manual annotation can be labor-intensive and inefficient. To address this difficulty, some recent techniques have attempted to leverage query click log data to automatically generate the training data. Query click log data identifies queries submitted by users of a search system, together with the sites that the users clicked on or otherwise selected in response to those queries. There is nevertheless room for improvement with respect to the quality of the training data produced by these automated techniques.