The use of data analysis tools has increased dramatically as society has become more dependent on digital information storage. In e-commerce and other Internet and non-Internet applications, databases are generated and maintained that have astronomically large amounts of information. Such information is typically analyzed, or “mined,” to learn additional information regarding customers, users, products, etc. This information allows businesses and other users to better implement their products and/or ideas.
Data mining is typically the extraction of information from data to gain a new, insightful perspective. Data mining can employ machine learning, statistical and/or visualization techniques to discover and present knowledge in a form that is easily comprehensible to humans. Generally speaking, humans recognize or translate graphical items more easily than textual ones. Thus, larger amounts of information can be relayed utilizing this means than by other methods. As such, graphical statistical models have proven invaluable in data mining.
A Bayesian network is one type of a graphical statistical model that encodes probabilistic relationships among variables of interest. Over the last decade, the Bayesian network has become a popular representation for encoding uncertain expert knowledge in expert systems. When used in conjunction with statistical techniques, the graphical model has several advantages for data analysis. Because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. A graphical model, such as a Bayesian network, can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Because the model has both a causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Additionally, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the over fitting of data.
Graphical statistical models facilitate probability theory through the utilization of graph theory. This allows for a method of dealing with uncertainty while reducing complexity. The modularity of a graphical model permits representation of complex systems by utilizing less complex elements. The connections and relationships of individual elements are identified by the probability theory, while the elements themselves are constructed by the graph theory. Utilizing graphics also provides a much more intuitive human interface to difficult problems.
Nodes of a probabilistic graphical model represent random variables. Their connectivity can indicate associative qualities such as dependence and independence and the like. If no connectivity (i.e., “arcs”) is present, this represents conditional independence assumptions, providing a representation of joint probability distributions. Graphical models can be “directed” or “undirected” depending on how they are constructed. Undirected graphical models have a more simplistic definition of independence, while directed graphical models are more complex by nature. Bayesian or “Belief” networks (BN) are included in the directed category and are utilized extensively in statistics and artificial intelligence to show causality between elements or “nodes.” They are also highly beneficial in supplying “inferences.” That is, they are able to infer information based on a posterior probability (i.e., “likelihood”) utilizing Bayes' rule. Thus, for a given outcome, its cause can be probabilistically deduced utilizing a directed graphical model.
A graphical statistical model represented by a directed acyclic graph (DAG), such as a Bayesian network, can also be applied to represent and provide predictions relating to a time series. The stochastic ARMAxp time series models are based on the well-known autoregressive, moving average (ARMA) time-series models, as represented as Bayesian networks. By understanding and modeling the persistence of time series, predictions can be made regarding future values of those time series. This proves invaluable in economics, business, and industrial arenas. Predicting behavior allows one to adjust parameters if the desired outcome is not the predicted outcome. Thus, for example, a company can predict its stock value based on current financial states and determine if they need to improve on cash reserves, sales, and/or capital investments in order to achieve a desired stock price. This also permits a study of the “influence” of various parameters on future values.
Although the usefulness of this type of modeling is substantial, sometimes determining the parameters of ARMA based time series models can prove difficult when the time series has some missing observations. Being able to efficiently compute the parameters is extremely beneficial for employment of these types of models. Without ease-of-use, the degree of difficulty may preclude their use and diminish the amount of valuable information that might otherwise be obtained from data.