1. Field of the Invention
The present invention relates to the field of data collection and, more particularly, to a system and method for data collection interface creation and the administration of a data collection effort.
2. Description of the Related Art
A variety of circumstances exist where it is desirable to obtain data from a population sample. For example, targeted data collection from a multitude of respondents is necessary for research surveys, market forecasts, political opinion polls, and the like. Collecting a data sample is also necessary when building a statistical model, such as a statistical model used to construct a software application program. Many software applications are designed to receive user input, convert this input into a computer-understandable form, and then execute programmatic functions based upon the converted input. Software applications having this functionality include, but are not limited to, natural language understanding (NLU) applications, automated speech recognition (ASR) applications, and Interactive Voice Response (IVR) applications. These applications require relatively large amounts of sample or test data to build a realistic user response profile.
Even though data collection efforts are common, existing methods to develop and administer a data collection effort are technologically primitive and conducted in an ad-hoc manner. One existing method generates and conveys paper data response instruments to participants using postal mail. Responses are received via postal mail and converted into a digital format, which is analyzed and otherwise processed. This method is costly, is difficult to perform rapidly, and is subject to response and transcription errors. Further, this manual data collection process fails to utilize a set of standardized data collection instruments. Consequently, each data collection effort is treated in a unique fashion, each varying in format, form, and quality.
A modification of the pure paper data collection method is to e-mail electronic versions of a data collection instrument to a group of potential respondents. Respondents can e-mail back responses, often by textually adding their answers to prompts. This process suffers from the same shortcomings as purely paper collection efforts, except that the cost and time of mailing instruments through postal channels are eliminated. E-mail based data collection efforts suffer from additional security shortcomings, as it is relatively easy to purposefully bias a data collection instance by generating multiple responses that appear to originate from multiple e-mail recipients but are actually generated by the same biasing source.
A derivative of the e-mail collection approach is to include special markers within the sent e-mail to automate the data extraction process. Unfortunately, different e-mail systems can handle these markers in a non-uniform manner, causing some e-mail systems to incorrectly display the data collection instrument and causing other e-mail systems to modify the markers in a way that foils the automatic extraction process.
Other existing approaches use customized data collection applications, which can be word processing documents with data gathering macros, database applications, and/or completely customized software applications. In all of these cases, developers must create data collection scenarios from scratch. Because of the time and design effort involved, most of the developed data collection scenarios tend to be overly simplistic. Making more sophisticated scenarios to procure more realistic data requires significant development effort, skill, and time on the part of the developer. Additionally, respondents are forced to utilize a non-standard interface that is different from other data procurement interfaces that respondents have used in the past, which can increase the non-response ratio for the data collection instance.
Regardless of the exact method used, existing data collection methodologies fail to integrate the interface creation process with data collection tools, logistics tools, and/or analysis tools. There is a need for a configurable framework for performing data collection that can be readily adapted across domains. This framework would ideally require minimal manual processing and would streamline the data collection process from interface creation to data analysis.