1. Technical Field
This disclosure relates to interactive data analysis. More particularly, this disclosure relates to systems and methods of providing interactive visual representations of data based on variances in subjective parameters.
2. Description of the Related Technology
The scientific method of investigation relies on testing hypotheses through experimental or observational study. Hypotheses are typically formulated as mathematical models, and a systematic study is conducted in order to validate or disprove each model.
Experimental and observational studies typically include multiple phases. One phase may be a data collection phase, during which quantifiable measures of a predefined set of parameters are obtained by the party conducting the study. In many cases, hypotheses are predefined, that is, they are defined prior to the data collection phase so that the collected data cannot consciously or unconsciously bias the formulation of the hypothesis. Certain protocols may be employed in order to prevent bias from creeping into the data collection. These protocols typically include utilizing a blind setup, in which subjects are not aware of the hypothesis or the conditions being tested, or possibly a double-blind setup, in which neither the subjects nor the experimenters/data collectors are aware of the hypothesis or the conditions being tested.
After data collection, a data analysis phase utilizes the collected data in order to prove or disprove the proposed hypothesis. The data analysis phase typically relies on accepted mathematical and statistical calculations which compare the collected data to models representing the hypotheses being evaluated or tested. The data analysis process typically relies on algorithmic computation over the collected data and on visualization of dependencies between the experimental and/or observed variables.
During a typical data analysis phase, a number of objective criteria or parameters can influence the results of the calculations. An objective parameter is typically a fundamental or methodological attribute of the data collection or analysis, whose value would not normally be called into question by other researchers. Examples of objective parameters that may be considered include assumption of normality and the principle of least effort. In addition to the objective parameters, a number of subjective criteria or parameters are also chosen for evaluation. A subjective parameter is typically an attribute of the data analysis which is chosen ad hoc by the researcher, where another researcher may choose a different value. One example of a subjective parameter that is commonly used to test hypotheses is the P-value threshold (significance level) against which a computed P-value is compared, the P-value being the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. Another example of a subjective parameter is the choice of algorithm applied to data (e.g., choosing to use maximum parsimony or maximum likelihood in deriving a phylogeny). Other examples of subjective parameters may include a choice of age group boundaries in an age study (e.g., ages 18-25, 26-40 versus ages 18-30, 31-40), or the size of an interval (e.g., one year interval, five year interval, etc.).
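By way of illustration only, the following minimal Python sketch shows how two of the subjective parameters mentioned above, the age group boundaries and the significance threshold, can change the outcome of an otherwise identical analysis. The function names (group_means, is_significant), the upper age limit argument, and the sample records are hypothetical and were invented for this example; they do not describe any particular system or data set.

```python
from bisect import bisect_right
from statistics import mean


def group_means(records, lower_bounds, upper=40):
    """Average score per age group.

    records      -- iterable of (age, score) pairs (illustrative data only)
    lower_bounds -- sorted lower edge of each age group, e.g. [18, 26]
    upper        -- assumed inclusive upper age limit of the last group
    """
    groups = {}
    for age, score in records:
        idx = bisect_right(lower_bounds, age) - 1
        if idx < 0 or age > upper:
            continue  # age falls outside every group
        groups.setdefault(lower_bounds[idx], []).append(score)
    return {lower: mean(scores) for lower, scores in sorted(groups.items())}


def is_significant(p_value, threshold):
    """Decision rule: reject the null hypothesis when p_value < threshold."""
    return p_value < threshold


# Made-up (age, score) records, used only to show the effect of the choice.
records = [(19, 10.0), (24, 12.0), (28, 20.0), (35, 22.0), (39, 24.0)]

means_a = group_means(records, [18, 26])  # groups 18-25 and 26-40
means_b = group_means(records, [18, 31])  # groups 18-30 and 31-40

# The same computed P-value leads to different conclusions under
# different, equally defensible, thresholds.
decision_loose = is_significant(0.03, 0.05)
decision_strict = is_significant(0.03, 0.01)
```

With these illustrative records, moving the boundary from 26 to 31 shifts the 28-year-old subject into the younger group, so both group averages change even though the underlying data is identical. Likewise, a P-value of 0.03 is treated as significant under a 0.05 threshold but not under a 0.01 threshold.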
Ideally, subjective parameters are formulated as part of the hypothesis (or as part of the mathematical model associated with the hypothesis) prior to commencement of the data collection phase. For example, a P-value threshold (significance level) should be chosen before data collection and analysis so that the collection and analysis lead to a clear answer as to whether the hypothesis holds.
In reality, subjective parameters are commonly identified during the data collection and analysis processes. Moreover, subjective parameter values are often modified and manipulated several times during the data analysis process, leading to an iterative data analysis process through which conclusions are drawn. Existing data analysis systems and methods do not adequately account for the iterative nature of testing hypotheses in data analysis.