The art includes a number of computerized systems and methods for calculating values that represent the magnitude of a change (or the difference between values) and for ranking those values by their relative strength or significance in various applications. Three of the most common methods are absolute value comparison (the simple difference between two numbers), index ranking, and Z-scores.
In an absolute value comparison, the change in value is used to rank values without any modifications to the absolute value of the change. For example, suppose that it is desirable for marketing purposes to track changes in the income of a consumer over time. If a consumer's income increases from $25,000 per year to $50,000 per year, this has an absolute value change of $25,000 during that timeframe. The use of absolute values is of limited utility in many applications, including marketing, because the absolute value of the change may or may not correlate to an event of marketing significance. In the case of the consumer whose income changed from $25,000 to $50,000, a significant change has indeed occurred, and marketing efforts should be redirected accordingly. Consider, however, the case of a consumer whose income changes from $1,025,000 to $1,050,000 over the same time period. While the absolute value of income change is the same as in the previous example, this change is likely insignificant for marketing purposes. Thus measuring the absolute value of a change, such as change in income, is not a useful measure of significance for marketing and many other purposes.
An index ranking measures the relative strength of a relationship as a percentage. Using the example above, the consumer whose income changed from $25,000 to $50,000 experienced a 100% increase in income over the time period. The consumer who experienced a change from $1,025,000 to $1,050,000 experienced only about a 2.4% change in income. The use of an index ranking thus better captures the significance of the change in this case. The use of index ranking, however, also presents a number of drawbacks. A doubling of income always represents the same percentage change, but may have a different significance for persons earning $5,000, $50,000, or $5,000,000. Indexes contain no information about sample size, or about whether the index is statistically significant. Thus insignificant changes may be ranked very highly if index ranking is used. In addition, index measures are not necessarily symmetric, that is, they do not scale equally in both directions; a percentage index can increase by any amount, for example by 300%, but the most that a percentage index can decrease by is 100%.
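By way of illustration only, the contrast between an absolute value comparison and an index ranking for the two consumers discussed above may be sketched as follows. The function names `absolute_change` and `percent_change` and the labels "A" and "B" are hypothetical and are introduced solely for this example; the sketch assumes a nonzero starting value.

```python
def absolute_change(old: float, new: float) -> float:
    """Raw difference between the new value and the old value."""
    return new - old


def percent_change(old: float, new: float) -> float:
    """Change expressed as a percentage of the old value (an index-style measure)."""
    if old == 0:
        raise ValueError("percentage change is undefined for a zero starting value")
    return (new - old) / old * 100.0


# Hypothetical labels for the two consumers in the example above:
# (income at start of period, income at end of period).
consumers = {
    "A": (25_000, 50_000),
    "B": (1_025_000, 1_050_000),
}

results = {
    name: (absolute_change(old, new), percent_change(old, new))
    for name, (old, new) in consumers.items()
}

for name, (abs_chg, pct_chg) in results.items():
    print(f"Consumer {name}: absolute change ${abs_chg:,.0f}, index change {pct_chg:.1f}%")

# The asymmetry of a percentage index: an increase is unbounded,
# while a decrease cannot exceed 100% for non-negative values.
largest_possible_drop = percent_change(100, 0)
example_increase = percent_change(100, 400)
```

Both consumers show the same absolute change, while the percentage measure separates them; the last two lines show a decrease bottoming out at -100% while an increase can be arbitrarily large.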
A Z-score (also referred to as a “standard” score) measures the number of standard deviations an observed data point varies from a mean data value. The Z-score is calculated by subtracting the mean from an individual raw (absolute) score, and then dividing the difference by the standard deviation for the overall data set. Z-scores are useful for showing statistical significance, but the simple fact that a particular value has statistical significance does not necessarily mean it is predictive of behavior, which is desirable for many applications, including marketing. Also, Z-scores can only be used on sets of different samples; they cannot be used longitudinally on the same sample. They are thus of limited utility for many such applications.
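The Z-score calculation described above may be sketched as follows, again purely as an illustration. The function name `z_score` and the sample incomes are hypothetical. The sketch assumes the population standard deviation (`statistics.pstdev`), on the reading that "the overall data set" is the full population; a sample standard deviation would give slightly different values.

```python
from statistics import mean, pstdev
from typing import Sequence


def z_score(value: float, data: Sequence[float]) -> float:
    """Number of standard deviations by which `value` differs from the mean of `data`."""
    sigma = pstdev(data)  # population standard deviation (an assumption, see lead-in)
    if sigma == 0:
        raise ValueError("Z-score is undefined when all values are identical")
    return (value - mean(data)) / sigma


# Hypothetical incomes for five different consumers (one cross-sectional sample).
incomes = [20_000, 30_000, 40_000, 50_000, 60_000]

for income in incomes:
    print(f"${income:,}: z = {z_score(income, incomes):+.3f}")
```

A value equal to the mean yields a Z-score of zero, and values above or below the mean yield positive or negative scores respectively.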
Given the limitations of the various ranking systems described above for finding significance and for ranking, an improved ranking method that better identifies significance, scales equally in both the high and low directions, and has positive values for increases and negative values for decreases, is desirable. In particular, it would be desirable to develop a ranking method that combines the predictive qualities of index ranking with the quality of Z-scores of only showing large values when coverage is significant.
The inventors have recognized that databases that include a measure of change may be useful in the improvement of systems that use historical data. Historical data is difficult to compile into an easy-to-use and “lightweight” data structure. This is especially true when tracking many data elements at the same time. For example, in a database containing information about a large number of households or consumers in a particular geographic area, a move, marriage, new birth, automobile purchase, and other such occurrences each create a new record with multiple details regarding each event. The result is a common problem in “big data” where there are very many records across time, to the extent that at some point the data becomes unmanageable even for more advanced and powerful computing systems. For some parties that maintain such databases, the sheer size of historical or “longitudinal” data becomes so large that it must be archived, and is thus no longer effectively used in marketing, business analytics, or other desirable applications. It would be desirable to transform this data from a set of previous state values and new state values into change values, since the result would be a much smaller footprint that would be more manageable.
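The reduction in footprint obtained by replacing previous-state and new-state values with change values may be illustrated by the following minimal sketch. The record layout, the household identifiers, and the function name `to_change_values` are all hypothetical, and the simple difference used here is only a placeholder for whatever measure of change is ultimately chosen.

```python
from typing import Dict, List, Tuple

# One historical snapshot: (year, value of the tracked data element).
Snapshot = Tuple[int, float]


def to_change_values(history: Dict[str, List[Snapshot]]) -> Dict[str, float]:
    """Collapse each record's list of snapshots into a single change value.

    The change here is the latest value minus the earliest value; this is a
    placeholder measure chosen for illustration only.
    """
    changes: Dict[str, float] = {}
    for key, snapshots in history.items():
        if len(snapshots) < 2:
            continue  # no change can be computed from a single state
        ordered = sorted(snapshots)  # order by year
        changes[key] = ordered[-1][1] - ordered[0][1]
    return changes


# Hypothetical longitudinal data: several stored states per household.
history = {
    "household-1": [(2019, 25_000), (2020, 30_000), (2021, 50_000)],
    "household-2": [(2021, 1_050_000), (2019, 1_025_000)],
    "household-3": [(2021, 40_000)],
}

changes = to_change_values(history)
stored_before = sum(len(s) for s in history.values())
stored_after = len(changes)
print(changes)
print(f"values stored: {stored_before} before, {stored_after} after")
```

In this sketch, multiple stored states per household are reduced to at most one change value per household, which is the sense in which the transformed data occupies a smaller footprint.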
References mentioned in this background section are not admitted to be prior art with respect to the present invention.