1. Field of the Invention
This invention relates to a method of obtaining a representative online polling sample.
2. Description of the Prior Art
In order to identify a representative random sample of the larger population in any public opinion poll, whether conducted online, by print survey or by telephone, it is necessary to eliminate as much as possible the “coverage bias” of those targeted for polling. Coverage bias is eliminated when every potential respondent in the entire population has an equal probability of being surveyed. Unless the entire population (e.g. an entire national population) is approached to complete the survey or poll, it is generally considered impossible to target a group of respondents all of whom have an equal probability of being presented with the option of completing the survey. In the context of telephone surveying, for example, substantial coverage bias creeps into any such survey because people with cellular phones are less accessible to the surveyor than are other potential respondents; people who work outside the home are less accessible than are potential respondents who stay at home during the work day; and the rising number of individuals who block telemarketing companies from reaching them by telephone are also excluded as potential respondents.
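The effect of unequal probabilities of being surveyed can be illustrated with a short calculation. The following is a minimal sketch only: the two groups, their population shares, their reach probabilities and their answer rates are hypothetical figures chosen for illustration and do not come from any actual survey.

```python
def true_share(groups):
    """Share of 'yes' answers in the whole population."""
    return sum(share * yes for share, _reach, yes in groups)


def expected_estimate(groups):
    """Expected share of 'yes' answers among the people actually reached.

    Each group is (population_share, reach_probability, yes_rate).
    """
    reached_yes = sum(share * reach * yes for share, reach, yes in groups)
    reached_all = sum(share * reach for share, reach, _yes in groups)
    return reached_yes / reached_all


# Hypothetical figures: two equally sized groups with different opinions.
unequal_reach = [
    (0.5, 0.2, 0.6),  # e.g. cellular-phone users, harder to reach
    (0.5, 0.8, 0.4),  # e.g. landline households, easier to reach
]
equal_reach = [
    (0.5, 0.5, 0.6),
    (0.5, 0.5, 0.4),
]

print(round(true_share(unequal_reach), 3))        # population value
print(round(expected_estimate(unequal_reach), 3))  # skewed toward the easier-to-reach group
print(round(expected_estimate(equal_reach), 3))    # matches the population value
```

When both groups are reached with the same probability, the expected estimate equals the population value; when they are not, the estimate drifts toward the opinions of the group that is easier to reach.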
The end goal for any surveyor is to obtain a representative random sample of the population of interest (e.g. Canadians, Britons, Australians) in the final group of respondents. Coverage bias is not the only bias that creeps into a survey and makes the non-respondent pool statistically different from the respondent pool; notably, individuals sharing certain characteristics (gender, age, income, or psychological profile) may be more willing to answer a certain type of survey than will others. A survey that seeks to be scientific can therefore engage in a number of approaches to reduce, but never eliminate altogether, such biases. The first and most critical method is to increase the sample size of those polled; this in turn reduces the “margin of error” of the final result, that is, the range around the observed result within which the true population value is expected to lie at a stated level of confidence. Another approach is a type of multi-stage sampling or cluster sampling, where the surveyor assumes a priori a number of variables that can potentially affect the outcome, such as geographic area; the surveyor then proceeds to survey a representative number of people from one geographic area, or cluster, before moving to the next cluster or block (e.g. area code, in the case of telephone surveys). A third method is stratification: after the data have been collected, the surveyor corrects for a number of variables that could potentially skew the final results. In the stratification approach or in the cluster sampling approach, the criteria for which the surveyor will correct are inherently subjective; the most commonly used criteria in political polling are income, age, and gender. It is impossible for the surveyor to know all the possible variables that are exogenous to the question posed and that therefore require adjustment of the survey results.
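Two of the approaches described above lend themselves to a short worked sketch: the dependence of the margin of error on sample size, and the correction of collected data by stratum. The sketch assumes a simple random sample, the usual normal approximation for a proportion at roughly 95% confidence (z = 1.96), and known population shares for each stratum; the function names and all figures are hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate half-width of the confidence interval for a proportion,
    assuming a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def poststratified_estimate(sample_counts, sample_yes, population_shares):
    """Re-weight each stratum's observed 'yes' rate by its population share."""
    estimate = 0.0
    for stratum, share in population_shares.items():
        yes_rate = sample_yes[stratum] / sample_counts[stratum]
        estimate += share * yes_rate
    return estimate


# Quadrupling the sample size halves the margin of error.
print(round(margin_of_error(1000), 4))  # about 0.031
print(round(margin_of_error(4000), 4))  # about 0.0155

# Hypothetical sample in which younger respondents are under-represented.
sample_counts = {"under_35": 100, "35_plus": 400}
sample_yes = {"under_35": 60, "35_plus": 160}
population_shares = {"under_35": 0.4, "35_plus": 0.6}

raw = sum(sample_yes.values()) / sum(sample_counts.values())
adjusted = poststratified_estimate(sample_counts, sample_yes, population_shares)
print(round(raw, 3), round(adjusted, 3))
```

The adjustment corrects only for the variables the surveyor chose to stratify on, here a single age split; any variable that was not anticipated remains uncorrected.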
The difference between cluster sampling and stratified sampling is that in cluster sampling the cluster is treated as the sampling unit so analysis is done on a population of clusters. In stratified sampling, the analysis is done on elements within strata. In stratified sampling, a random sample is drawn from each of the strata, whereas in cluster sampling only the selected clusters are studied. The main objective of cluster sampling is to reduce costs by increasing sampling efficiency; with stratified sampling, the main objective is to increase precision.
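The distinction can be sketched in a few lines of code. This is an illustrative sketch only, assuming one-stage cluster sampling in which every member of a selected cluster is studied; the area-code groups and respondent labels are hypothetical.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample of elements from every stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population grouped by telephone area code.
groups = {
    "area_416": [f"416-{i}" for i in range(10)],
    "area_514": [f"514-{i}" for i in range(10)],
    "area_604": [f"604-{i}" for i in range(10)],
    "area_902": [f"902-{i}" for i in range(10)],
}

rng = random.Random(42)  # fixed seed so the sketch is repeatable
by_stratum = stratified_sample(groups, per_stratum=3, rng=rng)
by_cluster = cluster_sample(groups, n_clusters=2, rng=rng)

print({name: len(members) for name, members in by_stratum.items()})
print({name: len(members) for name, members in by_cluster.items()})
```

In the stratified draw every group contributes a few elements, whereas in the cluster draw only the selected groups appear and each contributes all of its members.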
Given the many substantial challenges of obtaining a representative random sample, all forms of polling, and online polling in particular, have been criticized by methodologists. Online polling is especially prone to bias since there is very little randomization, if any, in the process of identifying potential survey respondents. Online respondents who, for example, take a political survey on a media website are, by definition, disproportionately drawn from people interested in that particular news media site. Accordingly, a very large group of potential respondents would have to be recruited online before one could assume that the respondent pool was sufficiently representative. The goal of obtaining a representative sample is therefore exceptionally challenging online. The challenge would be overcome if one could create a system in which each member of the universe of potential respondents has an equal probability of taking the survey. Inevitably, there would still be some bias in those who actually respond to the survey even if these probabilities were equalized, since some individuals have more time to answer a survey or may be more inclined to respond to the particular survey for whatever reason. However, if one could equalize the probability of every Internet user taking a particular survey, one would substantially increase the likelihood of obtaining a representative global sample prior to adjusting, ex post, for any additional biases or non-random effects. The number of people needing to be surveyed in order to achieve a representative random sample would drop dramatically, as would the number and complexity of the possible stratifications, or risk adjustments (e.g. for age, gender, psychological profile, etc.), to be done after the survey data have been collected.
Such an invention would dramatically reduce the time and labor that companies, governments, nonprofit corporations, researchers or others would need to invest in order to conduct a scientifically valid survey online and to thereby obtain a representative random sample.