This disclosure relates generally to online systems, and more specifically to validating performance of a machine learning classifier by efficiently identifying members of a minority class among a population with a class imbalance.
Online systems, such as social networking systems, allow users to connect to and communicate with other users of the online system. Users may create profiles on an online system that are tied to their identities and include information about the users, such as interests and demographic information. The users may be individuals or entities such as corporations or charities. Online systems allow users to easily communicate and share content with other users by providing the content to an online system for presentation to those users. Content provided to an online system by a user (i.e., user-provided content) may be declarative information provided by a user, status updates, images, photographs, videos, text data, any other information a user wishes to share with other users of the online system, or a combination thereof. User-provided content may include sponsored content that a sponsoring user (e.g., an organization) requests to be presented to other users who are not necessarily connected with the sponsoring user.
To ensure a high-quality user experience, online systems may remove low-quality content having characteristics violating a content policy. Content may be deemed low-quality because it contains offensive, unintelligible, or malicious elements. Offensive elements include text, images, or videos that are suggestive, violent, sensational, or illegal. Unintelligible elements include poor grammar, illegible words, words in a language different from a user's language, or an image obscured by overlaid text. Malicious elements may collect private information, misrepresent a product or service, or deliver malware to a user's computer.
The online system maintains a review process to identify instances of low-quality content before the online system presents them to viewing users. To evaluate the review process, human reviewers may manually classify a subset of the user-provided content accepted by the review process. For the evaluation to produce meaningful results, the evaluated subset should include significant quantities of both acceptable-quality content and low-quality content. However, most user-provided content complies with content policies maintained by an online system, and the review process filters out most low-quality content and thus prevents it from being presented to other users. As a result, low-quality content that passes the review process is rare and difficult to identify among the content presented to users, and evaluations of the review process consume large quantities of human review time to find enough instances of it.
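As an illustrative, non-limiting example of the review burden described above, the following minimal sketch estimates how many uniformly sampled content items reviewers would need to classify, on average, to find a target number of low-quality items. The function name `expected_reviews`, the prevalence of 0.1%, and the target of 100 items are hypothetical values chosen for illustration and do not come from this disclosure.

```python
def expected_reviews(target_low_quality: int, prevalence: float) -> float:
    """Expected number of uniformly sampled items to review in order to
    find `target_low_quality` low-quality items.

    Assumes each sampled item is independently low-quality with
    probability `prevalence` (the mean of a negative binomial count).
    """
    if target_low_quality < 0:
        raise ValueError("target_low_quality must be non-negative")
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    return target_low_quality / prevalence


if __name__ == "__main__":
    # Hypothetical figures: 0.1% of accepted content is low-quality,
    # and the evaluation needs 100 low-quality examples.
    reviews = expected_reviews(target_low_quality=100, prevalence=0.001)
    print(f"Expected items to review: {reviews:,.0f}")
```

Under these assumed figures, the expected workload is roughly one thousand reviewed items for every low-quality item found, which is why uniform sampling makes evaluation of the review process costly when the classes are imbalanced.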