There are many applications such as relevance ranking, identification of intent, image classification and handwriting classification that employ machine learning techniques over manually labeled data. In such applications that use supervised learning techniques, a first step is to obtain manually labeled data. For this, human judges are provided with guidelines as to how to label a set of items (these items can be documents, images, queries and so forth, depending on the application).
These guidelines can be anywhere from a few sentences to tens of pages. While detailed guidelines serve to clarify the labeling criteria, in practice, it is often not possible for human judges to assimilate and apply all the guidelines consistently and correctly. The difficulty increases as the guidelines get longer and more complex. Further, most judges need to label a large number of items within a short span of time.
This results in noisy labels, which hinders the performance of the machine learning techniques and directly impacts the businesses that depend on these techniques. It also limits any ability to evaluate and compare against the competition, as these labels are also used during evaluation time.