Web analytics is frequently performed to discover or predict usable information and to support decision making. Many businesses rely on web analytics to improve the performance and/or quality of a website. For example, modern web analytics services can measure and report data associated with hundreds of variables for one or more online services. The captured data can be used to forecast or predict one or more metrics of interest, such as revenue, conversions, or another measurement of website or business performance.
To improve website or business performance in relation to a particular metric, a marketer may be interested in variable predictiveness, which indicates the extent to which a variable can predict the metric. Variable predictiveness can be used to customize a website to improve or optimize one or more metric outcomes. For example, when a particular variable is indicated as accurately predicting a designated metric, the marketer can emphasize that variable when customizing a website for a particular user segment (e.g., males) or for a particular visitor.
In computing variable predictiveness, conventional systems generally utilize mutual information as an indicator of variable predictiveness with respect to a metric. However, calculating mutual information for an extensive number of variables may be computationally intensive and time consuming, making it difficult for a processor to generate such data in real time. For instance, to dynamically compute mutual information for thousands of variables in real time using prior approaches, a significant number of data access attempts and logarithm computations would need to be performed. Such data access attempts and computations are difficult to scale and to perform in real time, particularly when multiple queries are received concurrently. Because of the intensive computations required to calculate mutual information for each variable and metric combination in real time, prior approaches to generating variable predictiveness are generally limited to small amounts of data, fixed data ranges, and/or offline computations of predetermined queries (e.g., date ranges).
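To make the cost described above concrete, the following is a minimal sketch of a direct mutual information computation for a single discrete variable and a single discrete metric. It is an illustration only and is not drawn from any particular conventional system; the function name `mutual_information` and the representation of the input as (variable value, metric value) pairs are assumptions made for this example. The computation requires one pass over the records to build counts and one logarithm per distinct value pair, and that work would be repeated for every variable and metric combination in a query.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information, in nats, between a discrete variable and a discrete metric.

    `pairs` is an iterable of (variable_value, metric_value) observations.
    Hypothetical helper for illustration; not an API of any analytics product.
    """
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        return 0.0

    # One pass over the data per count table (the "data access" cost).
    joint_counts = Counter(pairs)
    variable_counts = Counter(x for x, _ in pairs)
    metric_counts = Counter(y for _, y in pairs)

    mi = 0.0
    for (x, y), c_xy in joint_counts.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), expressed with raw counts.
        mi += (c_xy / n) * math.log(c_xy * n / (variable_counts[x] * metric_counts[y]))
    return mi


# A variable that fully determines the metric versus one that is independent of it.
predictive = [("a", 1), ("a", 1), ("b", 0), ("b", 0)]
independent = [("a", 1), ("a", 0), ("b", 1), ("b", 0)]
mi_predictive = mutual_information(predictive)
mi_independent = mutual_information(independent)
```

In this sketch a higher value indicates a more predictive variable: the fully predictive variable attains the entropy of the metric (log 2 nats here), while the independent variable scores zero.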