1. Field of the Invention
The present invention generally relates to methods of statistical analysis, and more particularly to a method for improving the accuracy of weighted population parameter estimates computed from the responses of a sample population to a survey.
2. Background Description
Surveys are conducted to gather information which will allow an individual or corporation to make an informed decision. Many times, the information is used to gain an understanding of the beliefs and behaviors of a target population under a given set of circumstances. Responses to the survey questions, thus, provide a "snapshot in time" which reflects these current beliefs and behaviors.
The analysis of survey response data is particularly important in providing business services. Typically, businesses conduct surveys to determine the needs of their customers, and the underlying conditions which make their services desirable. This information is then used as a guide for improving the products or services or for offering new products or services. Surveys have also been used to capture public response to promotional messages from businesses, agencies, governments, and institutions.
Generally, it is difficult and costly to survey every member of a target population, i.e., to conduct a census. Therefore, polling organizations usually survey a subset (i.e., a representative sampling) of the population. Inferences about the beliefs or behaviors of the population are then drawn based on responses from the subset. To draw inferences about the population based on survey responses, a two-step approach is usually taken. First, a selection process or sampling methodology is used which dictates the rules by which members of the population are included in the sample. Second, an estimation process is performed to compute sample statistics which include sample estimates of population parameters.
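The two-step approach described above can be sketched as follows. This is only a minimal illustration, assuming simple random sampling without replacement as the selection process and the sample mean as the estimated parameter; the function names `draw_sample` and `estimate_mean` and the synthetic population are hypothetical and do not come from the source.

```python
import random
from statistics import mean


def draw_sample(population, n, seed=0):
    """Step 1 (selection): simple random sampling without replacement.

    The seed is fixed only so that the illustration is reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


def estimate_mean(sample):
    """Step 2 (estimation): use the sample mean to estimate the population mean."""
    return mean(sample)


# Hypothetical population of 1000 numeric responses (illustrative data only).
population = [float(x) for x in range(1, 1001)]

sample = draw_sample(population, 50)
estimate = estimate_mean(sample)
print(f"sample size = {len(sample)}, estimated mean = {estimate:.2f}")
```

Other selection rules (stratified, cluster, quota) and other estimators would slot into the same two steps.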
In many cases, it is difficult to obtain a large sample of a target population. The inability to obtain a sufficiently large sample can translate into inaccuracies in the parameter estimates. The challenge faced by survey statisticians, therefore, is to determine how to use the limited sample data gathered to make accurate statements about the population.
Known methods for improving precision of survey estimates are limited in scope. One such method includes detecting and eliminating statistical outliers in the sample data. This serves to correct for possible errors in data collection and recording. Eliminating statistical outliers will also reduce the impact of any "abnormal" observations, i.e., members of the sample whose actions or beliefs do not accurately reflect overall population actions or beliefs.
Another potential source of inaccuracies in parameter estimates lies in the way in which observations are often weighted by the survey statistician. Often, a statistician will place more emphasis on the responses of some of the sample elements than others. This emphasis is realized by assigning weights of different values to the different sample responses. The weights, however, are generally estimated values, not computed according to some known and fixed rules. Consequently, inaccuracies can arise from poor estimates.
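To make the effect of estimated weights concrete, the sketch below computes a weighted mean twice: once with equal weights and once with a single observation given a much larger weight. The `weighted_mean` helper and the response and weight values are hypothetical and chosen only for illustration.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical responses on a 1-5 scale (illustrative data only).
responses = [3.0, 4.0, 5.0, 4.0, 1.0]

equal_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
skewed_weights = [1.0, 1.0, 1.0, 1.0, 6.0]  # last respondent heavily emphasized

print(weighted_mean(responses, equal_weights))   # 3.4
print(weighted_mean(responses, skewed_weights))  # 2.2
```

A single poorly estimated weight moves the estimate from 3.4 to 2.2 here, which is the kind of inaccuracy the passage describes.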
From the foregoing discussion, it is clear that there is a need for a method of performing a statistical survey analysis which generates more robust estimates of population parameters than conventional methods, and more particularly one in which the more robust estimates are used to draw comparatively more accurate and reliable inferences about the beliefs, behaviors, and/or trends of the entire population.
It is an object of the present invention to provide a method for performing a statistical survey analysis which is more accurate and reliable than conventional methods.
It is another object of the present invention to achieve the aforementioned object by generating robust estimates of population parameters, which estimates are then used as a basis for drawing inferences relating to beliefs, behaviors, and/or trends of the population as a whole.
These and other objectives of the present invention are achieved by providing a method for analyzing statistical data which generates robust population parameters by reducing or eliminating inaccuracies caused by at least one of two factors. The first factor is statistical outliers, defined as observations that fall statistically outside of other observations in the sample. The second factor focuses on the impact of estimated weights assigned to observations relating to population parameter estimates. Both factors are undesirable because they tend to skew the accuracy of the population parameter estimates and thus the results of the survey analysis as a whole. The method of the present invention advantageously addresses both factors by (i) identifying and then eliminating statistical outliers, (ii) dampening the impact of the assigned weights so that any single weighted observed value does not unduly influence the value of the overall population parameter estimates, or (iii) both.
In accordance with a first embodiment, the method of the present invention includes identifying a survey question, obtaining a sample from a target population, collecting responses to the survey question from respondents in the sample, assigning weight values to observations corresponding to the responses, and developing a heuristic which, by adjusting the values of the weights, reduces skew of an estimate of a population parameter caused by the assigned weight values. An estimate of the population parameter is then computed using the adjusted weights. This estimate is more accurate and robust because the impact of inaccuracies in the assigned weight values has been reduced or eliminated.
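The text does not specify the heuristic at this point, so the following is only one possible sketch of the first embodiment. It assumes the adjustment is a simple clamp of each weight to a caller-supplied range; the names `dampen_weights`, `low`, and `high`, and the sample values, are hypothetical.

```python
def dampen_weights(weights, low, high):
    """Clamp each weight into [low, high] (an assumed heuristic, not the source's)."""
    if low > high:
        raise ValueError("low must not exceed high")
    return [min(max(w, low), high) for w in weights]


def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical responses and assigned weights (illustrative data only).
responses = [3.0, 4.0, 5.0, 4.0, 1.0]
assigned = [1.0, 1.0, 1.0, 1.0, 6.0]

adjusted = dampen_weights(assigned, low=0.5, high=2.0)

print(adjusted)                            # [1.0, 1.0, 1.0, 1.0, 2.0]
print(weighted_mean(responses, assigned))  # 2.2
print(weighted_mean(responses, adjusted))  # 3.0
```

With the largest weight capped at 2.0, the heavily weighted response still counts for more than the others but no longer dominates the estimate.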
In accordance with a second embodiment, the method of the present invention includes identifying a survey question, obtaining a sample from a target population, collecting responses to the survey question from respondents in the sample, identifying at least one statistical outlier which may skew an estimate of a population parameter of the survey, eliminating the statistical outlier, and computing the population parameter with the statistical outlier eliminated.
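As a rough illustration of the second embodiment, the sketch below identifies outliers with the common interquartile-range rule (values more than 1.5 IQR beyond the quartiles) and then recomputes the mean. The choice of this rule, the function name `remove_outliers`, the multiplier `k`, and the data are all assumptions; the text does not name a specific outlier test.

```python
from statistics import mean, quantiles


def remove_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed IQR rule)."""
    if len(values) < 4:
        return list(values)  # too few points to estimate quartiles meaningfully
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [v for v in values if lower <= v <= upper]


# Hypothetical numeric responses with one extreme value (illustrative data only).
responses = [10, 12, 11, 13, 12, 11, 95]

cleaned = remove_outliers(responses)

print(mean(responses))  # pulled upward by the extreme response
print(cleaned)          # [10, 12, 11, 13, 12, 11]
print(mean(cleaned))    # 11.5
```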
In accordance with a third embodiment, the method of the present invention includes identifying a survey question, obtaining a sample from a target population, collecting sample responses to the survey question from respondents in the sample, assigning weights to observations corresponding to the responses, and computing at least one population parameter estimate in accordance with steps that include a) identifying and then eliminating at least one statistical outlier which may skew the at least one population parameter estimate, and b) dampening an impact of at least one of the assigned weights on a value of the at least one population parameter estimate.
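The two steps of the third embodiment might be combined as in the sketch below. It is illustrative only and rests on two assumptions not stated in the text: outliers are identified with the interquartile-range rule, and weights are dampened by clamping them to a caller-supplied range. The function name `robust_weighted_mean` and its parameters are hypothetical.

```python
from statistics import quantiles


def robust_weighted_mean(values, weights, low, high, k=1.5):
    """Weighted mean after (a) dropping outliers and (b) clamping weights.

    (a) uses an assumed IQR rule on the observed values;
    (b) clamps each remaining weight into [low, high].
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    pairs = list(zip(values, weights))

    # Step a: eliminate outlying observations together with their weights.
    if len(values) >= 4:
        q1, _, q3 = quantiles(values, n=4)
        spread = q3 - q1
        lower, upper = q1 - k * spread, q3 + k * spread
        pairs = [(v, w) for v, w in pairs if lower <= v <= upper]

    # Step b: dampen the remaining weights so none dominates the estimate.
    pairs = [(v, min(max(w, low), high)) for v, w in pairs]

    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("no positive weight remains after adjustment")
    return sum(v * w for v, w in pairs) / total


# Hypothetical responses and assigned weights (illustrative data only).
responses = [10, 12, 11, 13, 12, 11, 95]
assigned = [1.0, 1.0, 1.0, 1.0, 1.0, 5.0, 1.0]

estimate = robust_weighted_mean(responses, assigned, low=0.5, high=2.0)
print(round(estimate, 4))  # 11.4286
```

Here the response of 95 is removed in step a, and the weight of 5.0 is reduced to 2.0 in step b before the estimate is computed.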
Through the embodiments of the present invention, accurate, robust population parameter estimates are obtained by eliminating statistical outliers and/or by using a heuristic to adjust the weights assigned to each response within a user-specified range, such that no observation will have an unduly large impact on the overall population parameter estimate. By taking one or both of these approaches, a statistical basis is prepared from which a more accurate determination or forecast of the target population's trends and behavior is obtained.