In a number of business and scientific environments, statistical information is gathered and analyzed to identify or otherwise extract useful information from often voluminous collections of data. One common application of statistical analysis is to study processes or systems to identify the impact of different categorical (non-continuous) factors on a continuous response outcome.
For example, it may be useful to study the impact different variables have on a typical commercial measure, such as average time in line for customers of a fast food restaurant. Some of the categorical factors generally expected to affect this exemplary outcome (time in line) may include: menu type, number of checkout registers, number of order takers, number of cooks, number of on-duty managers, presence of a drive-through feature in the restaurant, presence of a playground for entertaining children, location of the restaurant, etc. The results of such statistical analysis help show an analyst which elements have a statistically significant effect on the response outcome (i.e., average time in line for customers). Conversely, the analysis helps identify which elements have no statistical effect on the observed outcome. Given such analysis, further computations may develop a mathematical model to predict future outcomes based on measures of statistically significant factors.
Current techniques and systems used for such statistical analysis are time-consuming and cumbersome to use. Though a number of automated tools can assist a user in such analysis, present techniques and systems remain heavily reliant on manual aspects of the process. Such manual processes introduce numerous errors because the data is frequently manipulated by hand. Further, users tend to take shortcuts on complex tasks to reduce the time required. Often, therefore, the analysis is incomplete because user shortcuts may eliminate relevant data from the statistical analysis process.
Previous methods and systems utilized for such statistical analysis have included graphical analysis tools where selected outcome response information is plotted or graphed for each element or for interaction between various elements. Such a graphical presentation helps a user identify significant elements through visual inspection. As noted above, these graphical analysis techniques often require significant human interaction to manipulate the data into an appropriate format for the desired graph or plot. In addition, the visual inspection of data becomes cumbersome where a significant number of elements or factors may be involved. Viewing tens or hundreds of independent elements to determine relative significance of the various elements can be overwhelming for an average person.
Another common type of tool used for such statistical analysis includes single factor interaction hypothesis test analysis tools. In such tools, the outcome response for a single factor interaction is manually calculated and a hypothesis test is performed. Such a method is cumbersome where large numbers of elements or factors are involved. Each hypothesis test requires manual interaction to initiate the process and to view the resultant test output.
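As a rough illustration of what one such single factor test computes, the sketch below calculates a one-way analysis of variance (ANOVA) F statistic for one categorical factor. The choice of ANOVA, the function name one_way_f_statistic, and the sample wait times are assumptions made for illustration only; the text above does not specify which hypothesis test these tools apply.

```python
from statistics import mean


def one_way_f_statistic(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps each level of a single categorical factor to the list of
    continuous response observations recorded at that level.
    """
    samples = [list(values) for values in groups.values()]
    all_values = [x for sample in samples for x in sample]
    grand_mean = mean(all_values)

    # Variation of the level means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of the observations around their own level mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical observations: minutes in line, grouped by number of registers.
wait_minutes_by_registers = {
    "2 registers": [6.0, 7.0, 8.0],
    "3 registers": [4.0, 5.0, 6.0],
    "4 registers": [2.0, 3.0, 4.0],
}

f_stat, df_between, df_within = one_way_f_statistic(wait_minutes_by_registers)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

The F statistic still has to be compared against a critical value or converted to a p-value, and the whole calculation has to be repeated for every factor of interest, which is the repetitive manual effort described above.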
Still another prior technique often utilized for such statistical analysis is a so-called general linear model in which a dummy variable is established for each element or interaction among elements. The dummy variable indicates a simple, binary presence (value 1) or absence (value 0) of the specified element. A regression analysis may then be performed to determine a coefficient and to determine significance of each element or each interaction of elements. As above, such a technique is cumbersome at best where there are large numbers of factors or elements. It becomes difficult to discern useful information regarding the elements for each response and to determine whether the data is normalized or not. The method is further deficient where statistical measurements other than the mean of the outcome response are to be tested. Numerous other useful statistical measures are not feasible in such a general linear model of statistical analysis.
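The dummy-variable coding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular tool: the factor names, the sample observations, and the helper functions dummy_code, solve, and fit_ols are all hypothetical, and the sketch fits coefficients by ordinary least squares without computing the significance of each one.

```python
def dummy_code(rows, factors):
    """Build design-matrix rows: an intercept plus one 0/1 column per factor."""
    return [[1.0] + [1.0 if row[f] else 0.0 for f in factors] for row in rows]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                ratio = m[r][col] / m[col][col]
                m[r] = [x - ratio * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(x, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical, noise-free observations: presence (True) or absence (False)
# of two restaurant features, with average minutes in line as the response.
factors = ["drive_through", "playground"]
observations = [
    {"drive_through": False, "playground": False, "minutes": 5.0},
    {"drive_through": True, "playground": False, "minutes": 7.0},
    {"drive_through": False, "playground": True, "minutes": 4.0},
    {"drive_through": True, "playground": True, "minutes": 6.0},
]

design = dummy_code(observations, factors)
response = [row["minutes"] for row in observations]
coefficients = fit_ols(design, response)

for name, value in zip(["intercept"] + factors, coefficients):
    print(f"{name}: {value:+.2f}")
```

Each additional factor or interaction adds another column to the design matrix and another coefficient to interpret, which illustrates why the approach becomes unwieldy as the number of elements grows.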
These various tools known to provide assisted statistical analysis often utilize commercially available statistical engines such as MINITAB. Information regarding MINITAB is readily available at, for example, MINITAB's website.
It is evident from the above discussion that a need exists for an improved statistical analysis tool that provides, at once, flexibility in a variety of statistical analyses to be performed, ease-of-use to encourage users to perform thorough analysis, and reduced human interaction in manipulation of data to provide desired statistical analysis.