The present invention relates to automated data analysis with the help of potentially untrained humans. In one aspect, it relates to leveraging structured feedback from untrained humans to enhance the analysis of data to find actionable insights and patterns.
Traditional data analysis suffers from certain key limitations. Such analysis is used in a wide variety of domains including Six Sigma quality improvement, fraud analytics, supply chain analytics, customer behavior analytics, social media analytics, web interaction analytics, and many others. The objective of such analytics is to find actionable underlying patterns in a set of data.
Many types of analytics involve “hypothesis testing” to confirm whether a given hypothesis such as “people buy more pizza when it is raining” is true or not. The problem with such analytics is that human experts may simply be unaware of a key hypothesis and thus would not know to test for it. Analysts thus primarily find what they know to look for. In our quality improvement work with Fortune 100 firms and leading outsourcing providers, we have often found cases where clear opportunities to improve a process were missed because the analysts simply did not arrive at the correct hypothesis.
For example, in a medical insurance policy data-entry process, there were several cases of operators marking applicants as the wrong gender. These errors would often go undetected and only be discovered during claims processing, when the system would reject cases such as pregnancy-related treatment for a policy that was supposed to be for a man. The underlying pattern turned out to be that when the policy application was in Spanish, certain operators selected “Male” when they saw the word “Mujer,” which actually means “woman.” In three years of trying to improve this process, the analysts had not thought to test for this hypothesis and had thus not found this improvement opportunity. Sometimes analysts simply do not have the time or resources to test for all possible hypotheses and thus they select a small subset of the potential hypotheses to test. Sometimes they may manually review a small subset of data to guess which hypotheses might be the best ones to test. Sometimes they interview process owners to try to select the best hypotheses to test. Because each of these approaches is subject to human error and bias, an analyst may reject key hypotheses even before testing them on the overall data. Thus, failure to detect or test for the right hypotheses is a key limitation of traditional analytics, and analysts, who are not necessarily domain experts, are often poor at detecting such hypotheses.
Another limitation of traditional data analysis is the accuracy of the analysis models. Because the analysis attempts to correlate the data with one of the proposed models, it is critically important that the models accurately describe the data being analyzed. For example, one prospective model for sales of pizza might be as follows: Pizza sales are often correlated with the weather, with sporting events, or with pizza prices. However, consider a town in which the residents only buy pizza when it is both raining and there is a football game. In this situation, because the model considers each factor on its own rather than the combination of rain and a football game, it is unable to fit the data and the valuable pattern is not discovered. In one aspect of our invention, humans could recognize this pattern and provide the insight to the computer system.
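The pizza example can be made concrete with a minimal sketch. The observations below are made up purely for illustration, and the rule names are hypothetical; the sketch only shows that a rule based on a single factor cannot match data driven by a combination of two factors, while a rule based on the combination can.

```python
from itertools import product

# Made-up observations (illustrative only): 25 days for each combination of
# (raining, football_game). Pizza is sold only when both conditions hold.
days = [(rain, game, 100 if (rain and game) else 0)
        for rain, game in product([False, True], repeat=2)
        for _ in range(25)]

def accuracy(predict):
    """Fraction of days on which a yes/no rule matches whether pizza sold."""
    hits = sum(predict(rain, game) == (sales > 0) for rain, game, sales in days)
    return hits / len(days)

# Single-factor rules, one per proposed correlate.
rain_only = accuracy(lambda rain, game: rain)
game_only = accuracy(lambda rain, game: game)
# Combined rule, of the kind a human observer might suggest.
rain_and_game = accuracy(lambda rain, game: rain and game)

print(rain_only, game_only, rain_and_game)
```

On this synthetic data each single-factor rule is wrong on every day where only one of the two conditions holds, whereas the combined rule matches every day.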
A third limitation of traditional analysis is that the analysis is subject to human error. For example, many analysts conduct statistical trials using software such as SAS, STATA, or Minitab. If an analyst accidentally mistypes a number in a formula, the analysis could be completely incorrect and offer misleading conclusions. This problem is so prevalent that one leading analysis firm requires all statistical analyses to be performed by two independent analysts and the conclusions compared to detect errors. Of course, this is just one way in which humans can introduce error into the broad process of bringing data from collection to conclusion.
Finally, because humans cannot easily deal with large volumes of data or complex data, analysts often ignore variables they deem less important. As a result, analysts may inadvertently ignore a variable that turns out to be key. During an analysis of a credit card application process, it was found that the auditors had ignored the “Time at current address” field in their analysis as it was thought to be a relatively unimportant field. However, it turned out that this field had an exceptionally high error rate (perhaps precisely because operators also figured that the field was unimportant and thus did not pay attention to processing it correctly). Once the high error rate was factored in, this initially ignored field turned out to be a key factor in the overall analysis. Analysts also sometimes initially explore data to get a “sense of it” to help them form their hypotheses. Typically, for large datasets, analysts can only explore subsets of the overall data to detect patterns that would lead them to the right hypotheses or models. If they happen to look at the wrong subset or fail to review a subset with the clearest patterns, they may easily miss key factors that would affect the accuracy of their analysis.
On the other hand, an emerging best practice in the world of business analytics is “crowdsourcing.” This refers to tapping a large set of people (the “crowd”) to provide insight to help solve business issues. For example, a customer might fill out a comment card indicating that a certain dress was not purchased because the customer could not find matching shoes. This can be a very valuable insight, but the traditional collection procedure suffers from several problems.
The first step in crowdsourcing is undirected social idea generation. Employees, customers, and others submit ideas and patterns that they have identified. Of course, any pattern that is not noticed by a human is not submitted and is therefore not considered in the analysis.
The next step is for someone to sort and filter all the submitted ideas. Because there is a large volume of suggestions, and it is impossible to know if the suggestions are valuable without further research, someone must decide which ideas to follow up on. This decision can be based on how many times an idea is submitted, how much it appeals to the people sorting the suggestions, or any number of other methods. The issue is that good ideas may be rejected and never investigated.
Once the selected ideas are passed to an analyst, he or she must decide how to evaluate the ideas. Research must be conducted and data collected. Sometimes the data is easily available; for example, if a customer suggests that iced tea sells better on hot days, the sales records can be correlated with weather reports. Sometimes the data must be gathered; for example, if a salesman thinks that a dress is not selling well due to a lack of matching shoes, a study can be performed where the dress is displayed with and without clearly matching shoes and the sales volumes compared. However, sometimes it is impossible to validate a theory because the corresponding data is not available.
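As a minimal sketch of the iced tea case, sales records can be correlated with weather reports by computing a Pearson correlation coefficient. The temperatures and sales figures below are made up for illustration only, and the function name `pearson` is a hypothetical helper, not part of any described system.

```python
from math import sqrt

# Made-up daily records (illustrative only): high temperature in degrees
# Celsius and the number of iced teas sold on the same day.
temperatures = [18, 22, 25, 29, 33, 35]
iced_tea_sales = [40, 52, 61, 75, 90, 96]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(temperatures, iced_tea_sales)
print(f"correlation between temperature and sales: {r:.3f}")
```

A coefficient close to 1 on real data would support the customer's suggestion; a real evaluation would also need enough days of data to rule out chance.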
Finally, the analysis is only as good as the analyst who performs it in the first place. An inexperienced analyst often produces much less useful results than an experienced analyst even when both work on the same data.
Thus there is a need for a solution that combines the strengths of the computer with the strengths of humans and leverages both in a scalable manner. Such a solution could increase the effectiveness of analytics by decreasing the impact of human errors and of the human inability to select the correct hypotheses and models.
Further, there is a need for a scalable approach to crowdsourcing which does not suffer from the limitations of traditional crowdsourcing described above.
Automated analysis, for its part, also suffers from certain limitations. The software may not see that two different patterns detected by it are actually associated, or be able to detect the underlying reason for a pattern. For example, in the policy data entry example described above, an automated analysis could detect that Spanish forms had higher error rates in the gender field, but it may not be able to spot the true underlying reason. A human being, however, may suggest checking the errors against whether or not the corresponding operator knew Spanish. This would allow the analysis to statistically confirm that operators who do not know Spanish exhibit a disproportionately high error rate while selecting the gender for female customers (due to the Mujer=male confusion).
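The human-suggested check can be sketched as a simple comparison of error rates between two operator groups. The records below are made up for illustration, and the field names (`operator_knows_spanish`, `gender_error`) are assumptions rather than fields of any actual system; a real analysis would follow this comparison with a statistical significance test.

```python
# Made-up audit records for Spanish-language forms (illustrative only).
# Each record notes whether the operator knows Spanish and whether the
# gender field was entered incorrectly.
def make_records(knows_spanish, total, errors):
    return [{"operator_knows_spanish": knows_spanish,
             "gender_error": i < errors}
            for i in range(total)]

records = make_records(True, 40, 1) + make_records(False, 40, 12)

def error_rate(rows, knows_spanish):
    """Share of forms with a gender error for one operator group."""
    group = [r for r in rows if r["operator_knows_spanish"] == knows_spanish]
    return sum(r["gender_error"] for r in group) / len(group)

rate_spanish_speakers = error_rate(records, True)
rate_non_speakers = error_rate(records, False)

print(f"operators who know Spanish:     {rate_spanish_speakers:.3f}")
print(f"operators who do not know it:   {rate_non_speakers:.3f}")
```

The split by operator language skill is the human contribution here; the software only needs to compute and compare the two rates once it is told which attribute to group by.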