The present invention relates to computer software, and more particularly, but not exclusively, relates to systems and methods for analyzing and correcting retail data.
The measurement of sales in retail channels can be done via a variety of methods. Initially, sample-based audits of consumer purchases at check-out were extensively utilized, but they were costly and subject to significant potential inaccuracies. With the advent of scanner-based point of sale (POS) data and improvements in its accuracy, tracking services such as those offered by Information Resources, Inc. (IRI), and A.C. Nielsen (ACN) are able to provide highly granular (in terms of item, venue, and time), highly accurate measurement of sales in several retail channels, including food/grocery, drug, mass merchandise, convenience, and military commissary. These POS-based offerings can be sample-based (i.e., rely on a statistically determined subset of the target population) or census-based (i.e., use all available data from all available venues).
While POS-based measurement offerings do an excellent job of reporting “what” sold, they provide little insight into “why” something sold—since they provide no consumer-level data. To fill this need, market research companies such as IRI and ACN have recruited national consumer panels—in which panelists report their households' purchases on a regular basis. This longitudinal sample allows the development of much deeper consumer insights (e.g., brand switching, trial and repeat, etc.).
However, consumer panels are not without their problems. As with any sample-based survey, consumer panels are subject to two types of errors (i.e., sampling errors and biases), where the total error is given by the sum: $(\text{Total Error})^2 = (\text{Sampling Error})^2 + (\text{Bias})^2$.
Sampling errors are those errors attributable to the normal (random) variation that would be expected because, by the very act of sampling, measurements are not being taken from the entire population. Sampling errors can be reduced by increasing the sample size, since the standard deviation of the sampling distribution (often referred to as the “standard error”) decreases in inverse proportion to the square root of the sample size.
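As a rough illustration of the square-root relationship described above, the following sketch computes the standard error of a sample mean for several panel sizes. The function name, the population standard deviation of 40.0, and the panel sizes are hypothetical values chosen only for illustration; they are not taken from any actual panel.

```python
import math


def standard_error(population_sd: float, sample_size: int) -> float:
    """Standard error of a sample mean: population_sd / sqrt(sample_size)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_sd / math.sqrt(sample_size)


# Hypothetical inputs, for illustration only.
POPULATION_SD = 40.0
PANEL_SIZES = (10_000, 40_000, 160_000)

for n in PANEL_SIZES:
    # Each fourfold increase in panel size halves the standard error.
    print(f"panel size {n:>7}: standard error {standard_error(POPULATION_SD, n):.3f}")
```

Under these assumed inputs, quadrupling the panel size only halves the sampling error, which is why reducing sampling error through larger panels becomes progressively more expensive.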
Biases are systematic errors that affect any sample taken by a particular sampling method. Because these errors are systematic, they are not affected by the size of the sample. Examples of panel biases include, but are not limited to:
    Recruitment bias: households recruited to participate in the panel are not representative of the target population (e.g., the overall population of the United States);
    Self-selection bias: households who choose to participate in the panel have slightly different buying habits than the average household (e.g., an orientation toward using promotions or adopting new products);
    Panelist turnover bias: the reporting effectiveness (accuracy and consistency) of panelists may vary over the time period in which they participate in the panel;
    Hereditary bias: individuals within a household share a tendency toward certain behaviors or medical conditions;
    Compliance bias: certain purchases or purchase occasions are consistently underreported by panelists;
    Item placement bias: panelists report products purchased that have not been accurately captured and/or classified in the hierarchy maintained by the data collector; and
    Projection bias: the weighting or projection system cannot fully adjust all geo-demographics or is stressed by over- or under-sampled segments of the target population.
While both bias and sampling error are present in consumer panel data, for panels of a size significant enough to be of use in tracking consumer purchases (e.g., the IRI and ACN panels), the vast majority of the error that is present is due to bias. Further, since bias is unaffected by sample size, the negative impact of bias relative to the negative impact of sampling error worsens as the panel size increases.
The negative impact of bias is substantially larger than that of sampling error for most products. Increasing the size of the sample (i.e., the size of the panel) will reduce only the sampling error and may, in fact, worsen any bias that may be present. Given the sizes of today's consumer panels, there is limited advantage to be gained by increasing the size of the panel—since over 90% of the total error is often due to non-sampling errors (i.e., bias).
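The relationship between panel size and the share of error due to bias can be sketched numerically. The example below is a minimal illustration under stated assumptions: the bias is held fixed at a hypothetical 3.0 units, the sampling error is a hypothetical 2.0 units at a baseline panel of 10,000 households and shrinks with the inverse square root of panel size, and the "share" of error is measured on squared errors. None of these figures or function names come from actual panel data.

```python
import math


def total_error(sampling_error: float, bias: float) -> float:
    """Total error from (Total Error)^2 = (Sampling Error)^2 + (Bias)^2."""
    return math.hypot(sampling_error, bias)


def bias_share(sampling_error: float, bias: float) -> float:
    """Fraction of the squared total error that is attributable to bias."""
    total_sq = sampling_error ** 2 + bias ** 2
    if total_sq == 0:
        raise ValueError("total error is zero; share is undefined")
    return bias ** 2 / total_sq


# Hypothetical inputs, for illustration only.
BASE_PANEL_SIZE = 10_000
BASE_SAMPLING_ERROR = 2.0  # sampling error at the baseline panel size
FIXED_BIAS = 3.0           # assumed not to change with panel size

for n in (10_000, 40_000, 160_000):
    # Sampling error scales with 1 / sqrt(n) relative to the baseline.
    sampling_error = BASE_SAMPLING_ERROR * math.sqrt(BASE_PANEL_SIZE / n)
    print(
        f"panel size {n:>7}: total error {total_error(sampling_error, FIXED_BIAS):.3f}, "
        f"bias share {bias_share(sampling_error, FIXED_BIAS):.1%}"
    )
```

In this sketch the total error approaches the bias as the panel grows, so the bias share rises toward 100% even though the bias itself is held constant. If the bias were to worsen with panel size, the effect would be more pronounced.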
There has been little progress in the area of developing a systematic method of identifying and quantifying these biases. Further advancements are needed in this area.
Another area of concern in retail sales measurement is “coverage”. Coverage includes both the number of channels in which measurements are reported and the business usefulness of those measurements. While IRI's POS-based services provide excellent coverage of the Food/Grocery, Drug, Mass (excluding WALMART®), Convenience, and Military channels, these channels may account for only 50% of a manufacturer's sales, and as little as 20% of its sales growth. Non-tracked growth channels (e.g., Club, Dollar, WALMART®) are thus becoming an increasingly important part of manufacturers' businesses, yet little actionable sales measurement information is available for them. Further advancements are also needed in this area.