The exemplary embodiment relates to incentive-compatible machine learning and, in particular, to the estimation of payments for reports. It finds application in connection with the assessment of the value of an existing report based on subsequent reports.
Online sites today sometimes offer mechanisms for reporting on workers, services, friends, products, or media along multiple dimensions, such as reliability, quality of service, and friendliness. However, several issues can undermine such reporting mechanisms, notably underprovisioning and dishonest reporting. Underprovisioning is the notion that it may not be worthwhile for a reviewer to spend much effort constructing a detailed and honest report unless the reviewer is rewarded for it. Dishonest reports can arise when payments are made for reports. Even if reports are filtered to remove low-effort, nonsense reports, it can be possible to obtain higher payments through carefully constructed dishonest reports.
Recently, peer prediction methods have been proposed to attempt to overcome these issues. These methods make payments for a report on the basis of how well the report predicts other reports, while maintaining a posterior belief about the distribution of reports. The payments need not be in terms of money, but can be other tokens such as “rating points” with perceived value or “lottery tickets” with ex-ante value. Payments in terms of money can be handled in many ways, such as transfers between raters or reductions in commission.
Miller, et al. identify some peer prediction methods that have useful properties such as individual rationality (payments can be made large enough to overcome underprovisioning) and incentive compatibility (honest reporting achieves maximal payments for a rater, assuming that other raters are reporting honestly). (See Nolan Miller, Paul Resnick, and Richard Zeckhauser, Eliciting informative feedback: The peer-prediction method, Management Science, 51, 2005.)
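By way of illustration only, the following minimal sketch shows one way a peer prediction payment could be computed for a binary report, using a logarithmic scoring rule. The two item types, the prior, the signal probabilities, and all function names are hypothetical values chosen for the example and are not taken from the cited reference. The sketch updates a belief about the item from the rater's report, forms a predictive probability for a peer's report, and pays the log of the probability assigned to what the peer actually reported.

```python
import math

# Hypothetical model: an item is "high" or "low" quality, and each rater
# observes a binary signal (1 = positive, 0 = negative).
PRIOR_HIGH = 0.5                       # assumed prior P(item is high quality)
P_POS = {"high": 0.8, "low": 0.3}      # assumed P(signal = 1 | item type)


def predictive_positive(report):
    """P(peer reports 1 | this rater's report), obtained with Bayes' rule."""
    like_high = P_POS["high"] if report == 1 else 1 - P_POS["high"]
    like_low = P_POS["low"] if report == 1 else 1 - P_POS["low"]
    evidence = PRIOR_HIGH * like_high + (1 - PRIOR_HIGH) * like_low
    post_high = PRIOR_HIGH * like_high / evidence
    return post_high * P_POS["high"] + (1 - post_high) * P_POS["low"]


def payment(report, peer_report):
    """Log score of the prediction implied by `report` against the peer's report."""
    p = predictive_positive(report)
    return math.log(p if peer_report == 1 else 1 - p)


def expected_payment(true_signal, report):
    """Expected payment when the rater saw `true_signal` but submits `report`,
    assuming the peer reports honestly."""
    q = predictive_positive(true_signal)  # rater's actual belief about the peer
    return q * payment(report, 1) + (1 - q) * payment(report, 0)


if __name__ == "__main__":
    for signal in (1, 0):
        honest = expected_payment(signal, signal)
        dishonest = expected_payment(signal, 1 - signal)
        print(f"signal={signal}: honest={honest:.4f}, dishonest={dishonest:.4f}")
```

Under these assumed numbers, the expected payment for an honest report exceeds that for a dishonest one for either observed signal, which is the incentive compatibility property described above. The raw log scores are negative, so a practical mechanism would typically shift and scale them before use.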
Private values for dishonest reports can still bias such mechanisms. Private values arise for numerous reasons. Sometimes, a rater prefers to be nice, to avoid confrontation, or to ensure that others will have a positive impression of them as a rater. In some cases, a rater is a friend of a worker being rated and wishes them to find jobs more easily in the future. As another example, a buyer on an online auction site may give a bad rating to a seller that they are happy with, in order to reduce bids for that seller's future offerings and thereby obtain those offerings more cheaply for themselves.
Thus, the actual extent to which a peer prediction method results in honest reporting depends on whether the payments are scaled sufficiently to outweigh these private values.
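As a rough sketch of this scaling argument, suppose payments are multiplied by a positive factor, which leaves the ordering of honest and dishonest expected payments unchanged. Honest reporting then remains preferable only if the scaled gap between the two expected payments exceeds the rater's private value for the dishonest report. The function name and the numbers below are hypothetical and serve only to illustrate the arithmetic.

```python
def min_scale(honest_expected, dishonest_expected, private_value):
    """Smallest scale factor at which the scaled payment gap equals the
    private value; any strictly larger factor favors honest reporting."""
    gap = honest_expected - dishonest_expected
    if gap <= 0:
        raise ValueError("honest reporting must have the higher expected payment")
    return private_value / gap


if __name__ == "__main__":
    # Assumed expected payments (unscaled) and an assumed private value.
    threshold = min_scale(honest_expected=-0.64,
                          dishonest_expected=-0.77,
                          private_value=0.5)
    print(f"payments must be scaled by more than {threshold:.2f}")
```

Because the private value is generally unknown to the mechanism, the factor would in practice have to be chosen against an estimate or an upper bound on it.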
Other problems exist with rating mechanisms, as follows.
Whitewashing: a rated entity who has received a bad rating might recover a default rating by reincarnating their identity. This can be limited by charging fees for joining the rating mechanism.
Collusion: several raters may agree with each other to give dishonest reports in order to maximize their payments. It is generally the case that when a large fraction of raters colludes, there are no mechanisms that can induce truthfulness. However, auditing and threats of legal action can be effective.
Unravelling: if an item is rated by a string of people whose incentive payments depend on future ratings, the final raters will have no one to incentivize them. Therefore, they may be dishonest. The same argument then applies recursively to the previous raters. This can be addressed by scrambling the order in which reports are displayed and used.
Variable point of view: different raters may have different rating abilities or perspectives on rated entities. Thus, payment rules which depend on the assumption that true ratings come from one probability density will tend to treat such raters as dishonest, and therefore penalize them.
Observation quality: human judgements of means degrade when data is highly skewed, and judgements of variances and of properties of the tails of distributions suffer from serious misconceptions. (See P. Garthwaite, J. Kadane, and A. O'Hagan, Statistical methods for eliciting probability distributions, J. Am. Statistical Assoc., 100(470):680-700, 2005.) Thus, the meaningfulness of reports is limited and care must be taken in deciding what to request from a report.
Risk aversion: some scoring rules may result in arbitrarily large payments.
The problems of peer prediction are increased where reports relate to combinations of persons (or more generally, entities) and forecasts of future reports are sought for previously unseen combinations of entities. For example, a report may be sought for a job executed by a team of individuals who have not previously worked together, or for a contractor who is using a new set of subcontractors. One field where this arises is MicroWork, which involves the division of work into relatively fine-grained tasks (“microtasks”) and the distribution of the microtasks to MicroWork providers. MicroWork customers may specify one or more microtasks for an overall processing task, such as the creation of electronic documents, and may register or publish the microtasks at a computerized MicroWork broker, e.g., one maintained by a MicroWork service provider. MicroWork providers review the microtasks published at the MicroWork broker, and may bid for and complete microtasks in exchange for compensation, for example, as specified by the microtask. The MicroWork customer may solicit and receive an initial report on the team of MicroWork providers that the customer has selected for the processing task. In order to determine how much to reward the reporter for the initial report, the customer may want to review one or more subsequent reports.
The exemplary embodiment provides a method and system for determining a reward to optimize the value of a report on a team to the customer.