Successful planning, development, deployment, and marketing of products and services depend heavily on access to relevant, high-quality market research data. Companies have long recognized that improving the manner in which marketing data is collected, processed, and analyzed often results in more effective delivery of the right products and services to consumers and in increased revenues. Recently, companies have sought to target marketing efforts more effectively toward specific groups or individuals having certain combinations of demographic characteristics and psychographic profiles. Such highly targeted marketing efforts may provide a company with a significant competitive advantage, particularly in highly competitive markets in which increased revenues are obtained primarily as a result of increased market share.
Market researchers have long dealt with the practical tradeoff between the desire to develop database information that enables companies to develop and deploy highly targeted marketing plans and the desire to develop database information that is more versatile in its application or utility. For example, a database developed from a respondent panel or survey that has been narrowly tailored to provide information related to the television viewing behaviors of a particular regional population having a particular demographic profile may be of little, if any, use when attempting to determine the fast food consumption habits of another population having that same demographic profile.
In response to the practical difficulties (e.g., the cost) associated with assembling market research panels or surveys covering multiple types of consumption activities, behaviors, preferences, etc., market researchers have employed database fusion techniques to efficiently merge or fuse database information from multiple research panels or surveys (typically two at a time) into a single database representing a single virtual population group or respondent-level panel. It is well known that the fusion of two datasets or databases into one dataset or database may enable the development of a database that reveals correlations between the consumption activities, preferences, etc. associated with two datasets or databases in a manner that the individual datasets could not. In other words, existing market research databases can be combined or fused in different ways to generate new datasets or databases that reveal respondent behaviors and/or relationships not previously revealed by the independent databases, without having to physically develop and pay for an expensive multi-purpose respondent panel or survey.
Typically, the fusion of databases or datasets involves a statistical analysis to identify a mathematical function that can be used to predict respondent usage patterns. In general, the mathematical function produced as a result of the statistical analysis is used to guide or facilitate the process of matching observations or records in the datasets or databases to be fused. In some cases, known distance function techniques are used to measure the similarities between observations or records. In other cases, the statistical analysis may process usage data using regression modeling techniques to identify those variables that are common to the databases or datasets to be fused and best suited to match observations or records.
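The distance-function matching described above can be illustrated with a minimal sketch. It assumes two small hypothetical datasets (a television panel and a fast food survey) that share three common variables, and it uses plain Euclidean distance as the similarity measure. The variable names, field names, and data values are assumptions made for illustration only, and the common variables are left unscaled for brevity.

```python
import math

# Hypothetical variables common to both datasets (assumed for illustration).
COMMON_VARS = ("age", "household_size", "income_k")

def distance(a, b, variables=COMMON_VARS):
    """Euclidean distance between two records over the common variables."""
    return math.sqrt(sum((a[v] - b[v]) ** 2 for v in variables))

def fuse(recipients, donors, donor_fields):
    """For each recipient record, copy donor_fields from the closest donor."""
    fused = []
    for recipient in recipients:
        nearest = min(donors, key=lambda d: distance(recipient, d))
        merged = dict(recipient)  # keep the recipient's own fields
        for field in donor_fields:
            merged[field] = nearest[field]
        fused.append(merged)
    return fused

# Hypothetical records; none of these values come from the text.
tv_panel = [
    {"id": "t1", "age": 25, "household_size": 1, "income_k": 40, "tv_hours": 10},
    {"id": "t2", "age": 52, "household_size": 4, "income_k": 85, "tv_hours": 22},
]
food_survey = [
    {"id": "f1", "age": 27, "household_size": 2, "income_k": 42, "fast_food_visits": 6},
    {"id": "f2", "age": 50, "household_size": 4, "income_k": 90, "fast_food_visits": 2},
]

fused_panel = fuse(tv_panel, food_survey, ["fast_food_visits"])
for row in fused_panel:
    print(row["id"], row["tv_hours"], row["fast_food_visits"])
```

Each fused record carries both the viewing data from the panel and the fast food data borrowed from its nearest survey respondent, which is the kind of virtual respondent-level record that fusion is intended to produce.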
To simplify and/or enhance a data fusion process, it is often desirable to group or segment database observations or records, each of which typically corresponds to a particular person, respondent, or household, according to a plurality of classes or groups representing different types or levels of consumption behavior (e.g., non-consumers, low consumers, medium consumers, high consumers, etc.). By classifying, grouping, or segmenting the data to be fused, a simplified or separate fusion process can be carried out for each segment. Because each segment is smaller than the dataset(s) from which it is drawn, the fusion process can be performed more quickly and efficiently. In addition, the data classification, grouping, or segmentation can produce better results that, for example, enable more accurate prediction of consumption behaviors.
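As a rough sketch of the segmentation described above, the following example groups records into consumption classes so that a separate fusion pass could then be run on each segment. It assumes that each record already carries an observed consumption amount. The cut points, the `units` field, and the class labels are hypothetical and are not taken from the text.

```python
from collections import defaultdict

def consumption_class(units):
    """Map a consumption amount to a class using hypothetical cut points."""
    if units == 0:
        return "non"
    if units < 5:
        return "low"
    if units < 15:
        return "medium"
    return "high"

def segment(records, key="units"):
    """Group records by the consumption class of their `key` field."""
    segments = defaultdict(list)
    for record in records:
        segments[consumption_class(record[key])].append(record)
    return dict(segments)

# Hypothetical records (units consumed per month).
records = [
    {"id": "a", "units": 0},
    {"id": "b", "units": 3},
    {"id": "c", "units": 9},
    {"id": "d", "units": 20},
    {"id": "e", "units": 0},
]

segments = segment(records)
sizes = {name: len(members) for name, members in segments.items()}
print(sizes)
```

Each entry in `segments` is smaller than the full list of records, so any matching step applied within a segment compares fewer candidate pairs than it would across the whole dataset.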
While known fusion techniques typically rely on distance functions or regression models to predict consumption behavior, the resulting predictions are not well suited to classifying or grouping the records or observations within the datasets to be fused into discrete classes or groups. For instance, as noted above, it may be desirable to segment, classify, or group the observations or records within the datasets into classes or groups such as non-consumers, low consumers, medium consumers, and high consumers. However, regression models and distance functions are specifically adapted to predict information (e.g., usage or consumption information) that is inherently continuous in nature (e.g., dollars spent) rather than information that is discrete (e.g., usage classifications or groups).
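The continuous nature of regression output noted above can be seen in a small illustration. The sketch below fits an ordinary least squares line with a single predictor to hypothetical data (household income in thousands against dollars spent per month). The data and the choice of predictor are assumptions made for illustration only, and the example is not intended to represent any particular known fusion technique.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: the first household is a non-consumer (spends nothing).
income = [20, 40, 60, 80]
dollars = [0, 10, 30, 40]

intercept, slope = fit_line(income, dollars)
predicted = [intercept + slope * x for x in income]

for actual, estimate in zip(dollars, predicted):
    print(f"actual={actual:>3}  predicted={estimate:6.2f}")
```

The model returns a dollar amount for every household, including a slightly negative amount for the household that spent nothing. Nothing in the output identifies that household as a non-consumer, so assigning records to discrete classes would require a separate, externally chosen set of cut points.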