Statistical tests provide a mechanism for drawing quantitative conclusions about the characteristics or behavior of a process as represented by a sample of data drawn from the process. Statistical tests are also used to compare characteristics or behaviors of two or more processes based on respective data sets or samples drawn from the processes.
The term “hypothesis testing” describes a broad topic within the field of statistical analysis. Hypothesis testing entails particular methodologies that use statistical tests to assess, based on observed data, whether a claim made about a population under study is consistent with those data. The claim, or theory, to which the statistical testing is applied is called a “hypothesis” or “hypothesis statement”, and the data set or sample under study usually represents a sampling of data reflecting an input to, or output of, a process. A well-constructed hypothesis statement specifies a certain characteristic or parameter of the process. Typical process characteristics used in hypothesis testing include statistically meaningful parameters such as the average or mean output of a process (sometimes also referred to as the “location” of the process) and the dispersion (spread or variance) of the process.
When constructing a hypothesis test, a hypothesis statement is defined to describe a process condition of interest that, for the purpose of the test, is presumed to be true. This initial statement is referred to as the “null hypothesis” and is often denoted algebraically by the symbol H0. Typically the null hypothesis is a logical statement describing the putative condition of a process in terms of a statistically meaningful parameter. For example, consider hypothesis testing as applied to the discharge/output of a wastewater treatment process. Assume there are concerns that the process has recently changed such that the output is averaging a higher level of contaminants than the historical (and acceptable) output of 5 parts of contaminant per million (ppm). A null hypothesis for this scenario could be stated as follows: the level of contaminants in the output of the process has a mean value equal to or greater than 5 ppm. The null hypothesis is stated in terms of a meaningful statistical parameter, i.e., process mean, and in terms of the process of interest, i.e., the level of contaminants in the process output.
Hypothesis testing also entails constructing an alternative hypothesis statement regarding the process behavior or condition. For the purpose of the test, the status of the alternative hypothesis statement is presumed to be uncertain, and the statement is denoted by the symbol H1. An alternative hypothesis statement defines an uncertain condition or result in terms of the same statistical parameter as the null hypothesis, e.g., process mean in the case of the wastewater treatment example. In that example, an alternative hypothesis statement would be defined along the following lines: the level of contaminants in the output of the process has a mean value of less than 5 ppm. In constructing null and alternative hypotheses, it is imperative that the statements be framed in terms that are mutually exclusive and exhaustive, i.e., such that there is neither overlap in possible results nor an unaccounted-for or “lurking” hypothesis.
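As a compact restatement of the wastewater example, the two hypotheses can be written symbolically. The symbol μ for the true mean contaminant level of the process output is an assumption introduced here for illustration; it does not appear in the text above.

```latex
H_0:\ \mu \ge 5\ \text{ppm}
\qquad \text{versus} \qquad
H_1:\ \mu < 5\ \text{ppm}
```

Every possible value of μ satisfies exactly one of the two statements, which is what "mutually exclusive and exhaustive" requires.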
One object in applying hypothesis testing is to see if there is sufficient statistical evidence (data) to reject a presumed null hypothesis H0 in favor of an alternative hypothesis H1. Such a rejection is appropriate when the null hypothesis statement is inconsistent with the characteristics of the sampled data. If, instead, the data are not inconsistent with the statement made by the null hypothesis, then the test result is a failure to reject the null hypothesis, meaning the data sampling and testing do not provide a reason to believe any statement other than the null hypothesis. In short, application of a hypothesis test results in a statistical decision based on sampled data: either a rejection of the null hypothesis H0, which leaves a conclusion in favor of the alternative H1, or a failure to reject the null hypothesis H0, in which case the null hypothesis cannot be found false based on the sampled data. A failure to reject does not prove that the null hypothesis is true.
Any hypothesis test can be conducted by following the four steps outlined below:
Step 1—State the null and alternative hypotheses. This step entails generating a hypothesis of interest that can be tested against an alternative hypothesis. The competing statements must be mutually exclusive and exhaustive.
Step 2—State the decision criteria. This step entails articulating the factors upon which a decision to reject or fail to reject the null hypothesis will be based. Establishing appropriate decision criteria depends on the nature of the null and alternative hypotheses and the underlying data. Typical decision criteria include a choice of a test statistic and a significance level (denoted algebraically as “alpha”, α) to be applied to the analysis. Many different test statistics can be used in hypothesis testing, including a standard or test value associated with the process data, e.g., the process mean or variance, and test values associated with the differences between two processes, e.g., differences between proportions, means or medians, ratios of variances and the like. The significance level is the largest probability the analyst is willing to accept of rejecting the null hypothesis when it is in fact true; a smaller α demands stronger evidence before the null hypothesis is rejected.
Step 3—Collect data relating to the null hypothesis and calculate the test statistic. At this step, data is collected through sampling and the relevant test statistic is calculated using the sampled data.
Step 4—State a conclusion. At this step, the test statistic is compared to its corresponding reference statistic (based on the null distribution), which shows how the test statistic would be distributed if the null hypothesis were true. Generally speaking, a conclusion can be properly drawn from the value of the test statistic in one of several ways: by comparing the test statistic to the predetermined cut-off values established in Step 2; by calculating the so-called “p-value” and comparing it to the predetermined significance level α; or by computing confidence intervals. The p-value is a quantitative assessment of the probability of observing, purely by random chance, a value of the test statistic at least as extreme as the calculated value, under the assumption that the null hypothesis is true.
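The four steps above can be sketched in code for the wastewater example. This is a minimal illustration under stated assumptions, not a prescribed method: it uses a lower-tailed z-test, which assumes the process standard deviation is known (0.4 ppm here, a made-up value), and the readings, the function name lower_tailed_z_test, and the choice α = 0.05 are all hypothetical. With an unknown standard deviation, a t-test would normally be used instead.

```python
from math import sqrt
from statistics import NormalDist, mean


def lower_tailed_z_test(sample, mu0, sigma, alpha=0.05):
    """Test H0: mean >= mu0 against H1: mean < mu0 (Step 1).

    Assumes the process standard deviation `sigma` is known.
    """
    n = len(sample)

    # Step 3: test statistic, the standardized distance of the
    # sample mean from the hypothesized value mu0.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))

    # Null distribution of z at the boundary case mean == mu0.
    std_normal = NormalDist()

    # Step 2: cut-off value implied by the significance level alpha.
    z_crit = std_normal.inv_cdf(alpha)

    # p-value: probability of a z at least this small under H0.
    p_value = std_normal.cdf(z)

    # Step 4: the p-value rule and the cut-off rule (z <= z_crit) agree.
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return {"z": z, "z_crit": z_crit, "p_value": p_value, "decision": decision}


# Hypothetical contaminant readings in ppm (illustrative only).
readings = [4.6, 4.9, 4.4, 5.1, 4.7, 4.5, 4.8, 4.3, 4.9, 4.6]
result = lower_tailed_z_test(readings, mu0=5.0, sigma=0.4, alpha=0.05)
print(result)
```

For these made-up readings the sample mean falls far enough below 5 ppm that the null hypothesis is rejected at α = 0.05. A sample whose mean sits at or above 5 ppm would instead yield "fail to reject H0".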