Human editors are commonly used to judge various content, including, but not limited to, content responsive to search queries, advertisements responsive to search queries, answers responsive to questions, etc. For example, human editors may be used to identify how relevant a given web page is in response to a query, or how pertinent a given advertisement is to a given search request. Human editors, however, are commonly inconsistent in their judgment of content. Moreover, various human editors may perceive content differently, and accordingly judgments regarding such content may vary among human editors. For example, a first given human editor may rate the relevance of a given content item in response to a given search query as “excellent,” whereas a second given human editor may rate the relevance of the same content item in response to the same query as “fair.” Similarly, a single given human editor may rate a given content item as “highly relevant” in response to a given query on a first given date; however, the same editor may rate the same content item as “not relevant” in response to the same query on a second given date. Accordingly, human editors may not only differ with respect to other human editors regarding judgment of the same or similar content, but a single human editor may also differ with respect to his or her own prior judgments of a given item of content.
Current techniques for utilizing human editors to judge various content often compute agreement levels among human editors and thereafter discard or ignore human editor judgment data that is inconsistent with, or conflicts with, previous human editor judgment data. For example, current techniques may discard a human editor's judgment of a given content item if the human editor's judgment conflicts with a previous judgment made by the same human editor with respect to such content. Similarly, current techniques may discard or ignore a given human editor's judgment from among a pool of human editors' judgments if the given human editor's judgment differs from or otherwise contradicts the judgments of the pool of human editors.
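By way of illustration only, the discard-based approach described above may be sketched as follows. The sketch is not drawn from any particular system: the data layout, the use of a simple majority vote as the pool consensus, and the names `judgments`, `majority_labels`, `agreement_level`, and `discard_conflicting` are all assumptions introduced for this example.

```python
from collections import Counter

# Hypothetical judgment records: (editor, query, content_item, label).
judgments = [
    ("editor_a", "q1", "doc1", "excellent"),
    ("editor_b", "q1", "doc1", "excellent"),
    ("editor_c", "q1", "doc1", "fair"),
    ("editor_a", "q2", "doc7", "fair"),
    ("editor_b", "q2", "doc7", "fair"),
    ("editor_c", "q2", "doc7", "fair"),
]


def majority_labels(rows):
    """Map each (query, item) pair to its most common label.

    Assumption: the pool consensus is a simple majority vote; ties are
    resolved in favor of the label seen first.
    """
    counts = {}
    for _editor, query, item, label in rows:
        counts.setdefault((query, item), Counter())[label] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def agreement_level(rows, editor):
    """Fraction of an editor's judgments that match the pool majority."""
    majority = majority_labels(rows)
    own = [r for r in rows if r[0] == editor]
    if not own:
        return 0.0
    matches = sum(1 for r in own if r[3] == majority[(r[1], r[2])])
    return matches / len(own)


def discard_conflicting(rows):
    """Split rows into those kept and those discarded for conflicting
    with the pool majority."""
    majority = majority_labels(rows)
    kept = [r for r in rows if r[3] == majority[(r[1], r[2])]]
    discarded = [r for r in rows if r[3] != majority[(r[1], r[2])]]
    return kept, discarded


kept, discarded = discard_conflicting(judgments)
```

In this sketch, the dissenting judgment of `editor_c` for `q1` is dropped entirely rather than being used to characterize how that editor differs from the pool.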
Accordingly, while current techniques are capable of utilizing human editor judgments for various content, such techniques fail to consider the entirety of the judgment data generated by such human editors and instead discard or ignore data that may be inconsistent or that may vary. Thus, there exists a need in the art for techniques that identify drift data and variations among human editors and thereafter ascertain correction factors for such human editors, in order to utilize the judgment data generated by human editors with respect to a given set of content.