An “opinion” can represent a user preference, view, assessment, intention or attitude with respect to a particular subject matter. Generating opinion data is useful in a wide range of industries. Often, the opinions of a relatively small group of individuals can be used to make judgments about the opinions of a wider population. For example, a survey may ask a relatively small group of individuals for their opinions on a range of products, and the responses may be used to judge how such products might be developed or discontinued over time for a wider population. Tracking public opinion about matters of general interest is of paramount importance in several disciplines, including marketing, social action and politics.
Surveys are conducted periodically in order to track public attitudes, which may modify policy decisions in the case of surveys regarding public matters, or market strategies in the case of brands and products. For example, surveys are conducted daily in the United States by companies such as Gallup, Rasmussen and others to track the opinion of the public regarding well-known figures such as politicians, TV stars, sports champions, etc.
Surveys are conventionally conducted by putting questions to a randomly selected set of respondents belonging to a population. The questions are called stimuli, and the answers to each respective stimulus are counted and weighted to produce meaningful statistical figures regarding the preferences, views, choices, desires, etc. of the population as a whole. The term “user selections” is used throughout this document to refer to this type of user data input in the context of statistical opinion surveys.
Such surveys produce very useful information that is then used for various purposes, such as modelling a political campaign or defining the contents of a particular broadcast. However, the accuracy of the data produced by any survey is limited by the number of respondents participating in the survey, which is also the primary variable in determining the cost of such surveys.
For example, Gallup in the US produces a daily survey tracking the public's approval rating for the president of the United States. The survey involves approximately 1,500 respondents, and its results are averaged over three days in order to smooth out the inevitable statistical noise produced by the sampling process. While the data produced by that type of survey is critical to understanding the long-term trends of the measured variable, the amount of information that can be extracted from such a survey regarding short-term variations in that same variable is severely impaired by statistical noise. According to conventional approaches, this noise could be overcome only by multiplying the costs associated with the survey by a large factor. This restriction of conventional survey approaches to long-term trends makes it impossible to correlate any drifts in the ratings with events in the news, given that no meaningful short-term data can be extracted from the surveys. The availability of very-short-term variation data would make it possible to record, measure and assess fluctuations in user selection data and their relationship with short-term changes in a particular situation or set of circumstances. For example, it would make it possible to correlate swings in public opinion with events in the news and therefore to extract insights into how the public is assessing specific decisions made by policy makers and other individuals holding positions of high responsibility or otherwise having high profiles. The generation of this type of data is infeasible with conventional surveying systems because it would require operating several successive surveys within a single day, each involving a significant number of individuals, in order to reduce sampling errors to a level compatible with the need. This could not reasonably happen in practice.
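To give a sense of scale for the statistical noise discussed above, the following sketch computes the textbook 95% margin of error for a proportion estimated from a simple random sample. The function name, the worst-case assumption p = 0.5 and the assumption that the three daily samples are independent are illustrative only and are not taken from any survey operator's published methodology.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion p estimated from a simple random sample of size n.

    z = 1.96 corresponds to a 95% confidence level; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

single_day = margin_of_error(1500)       # one day's sample
three_day = margin_of_error(3 * 1500)    # three independent daily samples pooled

# The error scales as 1/sqrt(n), so halving it needs four times the respondents.
n_for_half = 4 * 1500
half_error = margin_of_error(n_for_half)
```

Under these assumptions a single day's sample is accurate to roughly plus or minus 2.5 percentage points and a three-day pool to roughly plus or minus 1.5, so a day-to-day movement of a point or two cannot be told apart from sampling noise, and halving the error requires about four times as many respondents.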
Therefore, there is a need for a system capable of detecting short-term variations in selection data provided by users, in particular relating to public attitudes and opinion, in quasi-real time, to enable a more accurate interpretation of such data.
An invention is set out in the claims.
According to an aspect there is provided a method of generating statistical data representing opinion of a population, the method comprising: providing an interface for provision by a user of a data input representing said user's selection; storing data inputs provided by a plurality of users representing their respective user selections; assigning a validity time period to each data input; providing an interface for provision by a user of an update to a previous data input representing said user's selection; updating the stored data to include updates provided by one or more of the plurality of users; filtering out any such input data whose validity period has expired; aggregating the remaining stored input data for generating a value of one or more opinion indicators; and repeating said filtering and aggregating steps to update the value of the one or more opinion indicators over time.
Each data input provided by a user bears an input time and has a validity period associated therewith. The input time and the validity period of a data input determine whether that data input is included in the aggregation process for generating a respective instance of the opinion indicators. The aggregation process is repeated over time, either on a periodic basis or on a user-controllable basis, and the succession of values of the opinion indicators is used as a quasi-continuous indication of opinion variations related to the population or group to which the respondents belong. If universe information is available, individual data inputs may be weighted before they are included in the aggregation to provide a better projection of the statistical data generated therefrom. The population represented may be determined based on a political or geographic border or any other suitable boundary, including interest groups or sets of individuals defined by any clustering factor.
The user selection may comprise any of: an opinion, a user preference, a view, an assessment, an intention or an attitude with respect to a particular subject matter.
The opinion indicator may be an index.
The method may include generating a substantially continuous series of values of said index. Said generation of successive values of said opinion index may happen in quasi-real time.
The step of filtering out any input data whose validity period has expired enables determination of a set of valid data inputs according to the respective associated validity time periods of the inputs. It may include determining, at a point in time, whether the validity period associated with each of a plurality of data inputs has expired and excluding any data inputs for which the associated validity period has expired.
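The store, filter, aggregate and repeat steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names DataInput, valid_inputs, opinion_index and submit are hypothetical, times are plain numbers in seconds, the index is taken to be a weighted mean, and an update simply overwrites the user's earlier input.

```python
from dataclasses import dataclass

@dataclass
class DataInput:
    user_id: str
    value: float        # the user's selection, e.g. 1.0 = approve, 0.0 = disapprove
    input_time: float   # seconds since an arbitrary epoch
    validity: float     # validity period in seconds
    weight: float = 1.0 # optional weighting factor (1.0 = unweighted)

def valid_inputs(store, now):
    """Keep only inputs already made and whose validity period has not expired at `now`."""
    return [d for d in store.values()
            if d.input_time <= now < d.input_time + d.validity]

def opinion_index(store, now):
    """Weighted mean of the still-valid selections, or None if none remain."""
    live = valid_inputs(store, now)
    total = sum(d.weight for d in live)
    if total == 0:
        return None
    return sum(d.weight * d.value for d in live) / total

# Assumption: one stored entry per user, so an update replaces the earlier input.
store = {}

def submit(d):
    store[d.user_id] = d

submit(DataInput("u1", 1.0, input_time=0, validity=3600))
submit(DataInput("u2", 0.0, input_time=600, validity=3600))
submit(DataInput("u3", 0.0, input_time=1200, validity=600))
submit(DataInput("u2", 1.0, input_time=1500, validity=3600))  # u2 updates

# Repeating the filter-and-aggregate step over time yields a series of index values.
series = [(t, opinion_index(store, t)) for t in (1600, 2000, 4000, 6000)]
```

In this toy run the index is 2/3 at t = 1600, rises to 1.0 at t = 2000 once the input from u3 has expired, stays at 1.0 at t = 4000 when only the update from u2 is still valid, and is undefined (None) at t = 6000 when every input has expired.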
The aforementioned index may represent opinion data for a population, wherein said population comprises a relatively large number of individuals as compared to the number of users from whom data inputs have been used for the aggregation. The index may represent any of: agreement or disagreement with a statement; a selection of an option from a plurality of options; a prediction; an approval or disapproval of an individual, body, statement or policy; a request; an expectation; or a requirement. It can take the form of any of: a numerical value; a percentage value; a Boolean choice; an alphabetical indicator; or a scaled grading.
The method may include the step, before the aggregation step, of applying a weighting factor to the value of at least one of said data inputs. Furthermore, the length of the validity time period for a given data input may be determined by any of: the type of related subject matter; the input time of said data input; the identity of the user providing said data input; whether the data input comprises initial data representing a user selection or an update to previously-input data representing a user selection; a predetermined time limit for provision of data inputs; an arbitrary value entered by the user; or the nature or magnitude of the selection represented by the data input.
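As one way of picturing how a validity period might be derived from some of the factors listed above, consider the sketch below. Everything specific in it is an assumption made for illustration: the subject-type names, the durations, the doubling rule for updates and the function name validity_period do not come from the text.

```python
from datetime import timedelta

# Hypothetical defaults per subject type; the text does not give actual durations.
BASE_VALIDITY = {
    "breaking_news": timedelta(hours=6),
    "approval_rating": timedelta(days=3),
    "product_preference": timedelta(days=30),
}

def validity_period(subject_type, is_update=False, user_choice=None, deadline_in=None):
    """Choose a validity period from a few of the factors named in the text.

    subject_type: key into BASE_VALIDITY (type of related subject matter)
    is_update:    True if the input updates a previous input
    user_choice:  optional period entered by the user, overriding the default
    deadline_in:  optional time remaining until a predetermined input deadline
    """
    period = user_choice if user_choice is not None else BASE_VALIDITY[subject_type]
    if is_update:
        # Assumption for illustration: an update is treated as valid for twice as long.
        period = period * 2
    if deadline_in is not None:
        # A predetermined time limit caps the period.
        period = min(period, deadline_in)
    return period
```

For instance, under these made-up rules an initial input on an approval rating would be valid for three days, while an update on a breaking-news subject would be valid for twelve hours.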
The method may include the step of determining a relationship between a value of an index and at least one event that has occurred within a predetermined time period with respect to the time at which a variation in said value is observed. Said relationship may be determined based on a change in the value of said at least one index over time. The method may include estimating at least one future value of said index. The step of estimating a future value of said index may comprise predicting any of: a value of the index at a future point in time; a time at which a value of the index will fall below a predetermined threshold; or whether a value of the index will be less than or greater than a reference value at a future point in time.
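The three kinds of prediction listed above can be illustrated with a simple straight-line extrapolation of recent index values. The text does not specify an estimation technique, so ordinary least squares is used here purely as a stand-in, and the function names fit_line, predict and time_to_fall_below are hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares line through (time, value) points; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t

def predict(times, values, t_future):
    """Extrapolated index value at time t_future."""
    slope, intercept = fit_line(times, values)
    return slope * t_future + intercept

def time_to_fall_below(times, values, threshold):
    """Time at which the fitted line reaches `threshold`.

    Returns None when the fitted trend is flat or rising. If the line is already
    below the threshold, the returned time lies in the past.
    """
    slope, intercept = fit_line(times, values)
    if slope >= 0:
        return None
    return (threshold - intercept) / slope

times = [0, 1, 2, 3]                  # e.g. hours
values = [52.0, 51.0, 50.0, 49.0]     # index values observed at those times

value_at_5 = predict(times, values, 5)
crossing = time_to_fall_below(times, values, 45.0)
above_reference = predict(times, values, 5) > 48.0
```

With this sample data the fitted line falls by one point per time unit, so the extrapolated value at t = 5 is 47, the index is expected to reach 45 at t = 7, and it is predicted to be below a reference value of 48 at t = 5.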
A user may be prompted to update a data input representing said user's selection.
Data inputs provided by the users representing their respective selections may be provided using a substantially continuous scale between upper and lower thresholds. That substantially continuous scale may be represented graphically to the user, and the user may move a pointer or other actuator on the scale to indicate their user selection.
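A pointer position on such a scale can be turned into a data input by a linear mapping between the two thresholds. The sketch below assumes a horizontal track measured in pixels and default thresholds of 0 and 100; the function name pointer_to_value and its parameters are hypothetical.

```python
def pointer_to_value(pointer_px, track_px, lower=0.0, upper=100.0):
    """Map a pointer position along a track of `track_px` pixels to a value in [lower, upper]."""
    # Clamp so that positions dragged past either end stay on the scale.
    fraction = min(max(pointer_px / track_px, 0.0), 1.0)
    return lower + fraction * (upper - lower)

midpoint = pointer_to_value(150, 300)            # pointer halfway along a 300 px track
clamped = pointer_to_value(-10, 300)             # dragged past the left end
signed = pointer_to_value(300, 300, -1.0, 1.0)   # a scale from -1 to +1
```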
According to an aspect a method of generating statistical data representing opinion is provided wherein that method further includes a step of analyzing variations in the succession of values of opinion indicators and determining a relationship between occurrence of an event and any such variation observed in such values.
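One simple way to relate an event to a variation in the succession of indicator values is to compare the average value shortly before the event with the average shortly after it. The sketch below assumes the series is a list of (time, value) pairs and uses a symmetric window; the function name event_shift and the sample figures are hypothetical, and a shift found this way indicates coincidence in time rather than proof of cause.

```python
def event_shift(series, event_time, window):
    """Mean index value in the window after the event minus the mean in the window before it.

    series: list of (time, value) pairs
    Returns None if either window contains no values.
    """
    before = [v for t, v in series if event_time - window <= t < event_time]
    after = [v for t, v in series if event_time <= t < event_time + window]
    if not before or not after:
        return None
    return sum(after) / len(after) - sum(before) / len(before)

# Made-up index values sampled once per time unit; the event occurs at t = 3.
series = [(0, 50.0), (1, 50.5), (2, 49.5), (3, 46.0), (4, 45.5), (5, 46.5)]
shift = event_shift(series, event_time=3, window=3)
```

Here the index averages 50.0 in the three time units before the event and 46.0 in the three after it, giving a shift of -4.0 points.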
According to another aspect a method is provided for creating a system for generating statistical data representing user selection and/or opinion, the method comprising providing an interface for provision by a user of a data input representing said user's selections, providing an interface for provision by a user of an update to a previous data input representing said user's selections, providing a memory for storing data inputs and/or data updates provided by one or more of the plurality of users and providing a processor for aggregating the stored data and generating at least one indication of opinion using said aggregated data. The aggregation may be made according to a validity period assigned to a plurality of data inputs and a data update. The aggregation step may be repeated.
According to an aspect a system is provided, said system comprising a memory and a processor and being arranged to perform a method as described herein. The system may also comprise one or more user interfaces.
According to an aspect a computer readable medium is provided having computer executable instructions adapted to cause a system to perform a method substantially as described herein.
According to an aspect there is provided a system for detecting statistical variations in public opinion comprising: a Respondent Interface Subsystem for capturing respective opinion values or positions of a set of respondents in relation to a subject matter, said positions having respective validity periods associated therewith; a Database Subsystem for storing said respective positions corresponding to each respondent and to said subject matter; a Statistical Processing Subsystem for calculating successive values of a Collective Opinion Value by selecting respective subsets of said respective positions according to said respective validity periods and calculating said successive values of a Collective Opinion Value over said corresponding subsets.
Said Respondent Interface Subsystem may be accessed at any given time by said respondents for updating their respective positions. Said Statistical Processing Subsystem may filter out any valid respective position from a respective subset if a newer respective position from the same respondent exists in said respective subset.
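The selection performed by the Statistical Processing Subsystem can be sketched as follows, assuming the Database Subsystem keeps every position ever entered as a (respondent, timestamp, validity, value) record and that the Collective Opinion Value is a plain mean. The function names and the record layout are hypothetical.

```python
def latest_valid_positions(positions, now):
    """Return {respondent_id: value} using each respondent's newest still-valid position.

    positions: iterable of (respondent_id, timestamp, validity, value) tuples
    """
    newest = {}
    for rid, ts, validity, value in positions:
        if not (ts <= now < ts + validity):
            continue  # not yet entered, or validity period expired at `now`
        if rid not in newest or ts > newest[rid][0]:
            newest[rid] = (ts, value)  # a newer valid position supersedes an older one
    return {rid: value for rid, (ts, value) in newest.items()}

def collective_opinion_value(positions, now):
    """Mean of the selected positions at `now`, or None if none are valid."""
    current = latest_valid_positions(positions, now)
    return sum(current.values()) / len(current) if current else None

history = [
    ("r1", 0, 100, 0.2),
    ("r1", 50, 100, 0.8),   # newer position from r1
    ("r2", 10, 30, 0.4),
]

successive = [collective_opinion_value(history, t) for t in (20, 60, 200)]
```

Because the full history is kept, the value can be recomputed for any point in time: at t = 20 both respondents count with their first positions (mean 0.3), at t = 60 only the newer position from r1 is used (0.8), and at t = 200 nothing remains valid.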
Said Respondent Interface Subsystem may comprise an electronic graphic representation including a description of said subject matter and an input area through which said respondents may input their respective positions to said Database Subsystem.
The respective validity periods of data inputs may be predefined according to observations made on opinion persistence times. The respective validity periods may be set by the respective respondent.
Said Statistical Processing Subsystem may include a weighting engine for correcting any imbalances in the set of respondents according to universe data.
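A weighting engine of this kind could, for example, apply post-stratification: each respondent is weighted by the ratio of their group's share of the universe to that group's share of the respondent set. The text does not name a weighting technique, so this is offered only as one plausible sketch; the group labels, universe shares and function names are hypothetical.

```python
from collections import Counter

def poststratification_weights(respondent_groups, universe_shares):
    """Weight per respondent = universe share of their group / sample share of their group.

    respondent_groups: {respondent_id: group}
    universe_shares:   {group: share of the universe, summing to 1}
    """
    counts = Counter(respondent_groups.values())
    n = len(respondent_groups)
    return {rid: universe_shares[g] / (counts[g] / n)
            for rid, g in respondent_groups.items()}

def weighted_mean(values, weights):
    """Weighted mean of {respondent_id: value} using {respondent_id: weight}."""
    total = sum(weights[r] for r in values)
    return sum(weights[r] * values[r] for r in values) / total

groups = {"a": "under_40", "b": "under_40", "c": "under_40", "d": "40_plus"}
universe = {"under_40": 0.5, "40_plus": 0.5}   # hypothetical universe data
values = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 0.0}

weights = poststratification_weights(groups, universe)
corrected = weighted_mean(values, weights)
unweighted = sum(values.values()) / len(values)
```

In this example the under-40 group makes up three quarters of the respondents but only half of the universe, so the unweighted mean of 0.75 is corrected to 0.5 once the weights are applied.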