People often build models to help them plan for an uncertain future. In business, computer spreadsheets are a common modeling tool, but with spreadsheets people struggle to represent multiple scenarios. They also struggle to represent how scenarios for any one variable, such as a company's future profit, may be related to scenarios for multiple other variables, such as future sales of multiple products (for example, desktop PCs and notebook PCs) in multiple regions (for example, North America and Europe), and how scenarios for those variables may in turn be related to scenarios for still other variables, such as the introduction of new CPUs used in those PCs.
People struggle with multiple scenarios and with relationships in spreadsheets because adding scenarios effectively adds a dimension to the analysis, and adding relationships adds more dimensions. Spreadsheets are designed for two dimensions because their formulas refer to rows and columns, so analyzing more dimensions in a spreadsheet requires adding pages as separate tabs or separate files. That produces redundant formulas and inputs, which make a model difficult to refine and update over time.
Furthermore, even if one adds a dimension for scenarios, it will be of limited use unless it includes a scenario for each of the combinations of the scenarios for all the variables in the model, with an appropriate probability for each of those scenarios.
Probabilistic graphical models are a tool designed to support creation of such a dimension. We will review their function, as well as issues that contribute to their being used much less commonly than spreadsheets.
Probabilistic Graphical Models
A probabilistic graphical model (“PGM”) is a probabilistic model for which a graph denotes the conditional dependence structure between random variables (Koller, D. & Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.) Each dependence relationship between random variables may be defined as causation (“asymmetric” or “directional”), such as in Bayesian networks and influence diagrams, or as correlation (“symmetric” or “nondirectional”), such as in Markov networks, also known as Markov random fields. PGMs are often depicted as a visualization of variables as “nodes” and relationships as “edges”, which edges appear as arrows for causation relationships between nodes (as in FIG. 1) and lines for correlation relationships between nodes (as in FIG. 2).
The purpose of a PGM is to infer a joint probability distribution of scenarios for multiple variables using assertions for each variable either about scenarios for that variable or about the relationships between scenarios for that variable and scenarios for one or more other variables. This can not only help one produce a probability distribution of scenarios for any variable for which it is difficult to assert a distribution directly, but it can also help one explore the sensitivity of that distribution to other distributions that one might assert more accurately with an investment of more time and other resources. This can be useful for many applications, including prioritizing research to support decisions under uncertainty. Probabilistic inference can be very challenging computationally, and in practice graph authors must consider tradeoffs in graph design and in choice of inference methods.
A PGM may be defined by asserting for each variable either (i) scenarios for this variable or (ii) a “factor” describing a deterministic or probabilistic relationship between scenarios for this variable and scenarios for one or more other variables.
(i) The assertion of scenarios can be either discrete or continuous. One example of a discrete assertion is a “Probability Table” like the one in FIG. 3, within which is asserted a probability for each discrete scenario for one variable. One example of a continuous assertion is a probability function such as “P(X) = normal(mean=a, standard deviation=r)”.
(ii) The assertion of factors can also be either discrete or continuous. One example of a discrete assertion is a “Conditional Probability Table” like the one in FIG. 4, within which is asserted a probability for each discrete scenario of one variable given discrete scenarios for one other variable. A Conditional Probability Table can also assert a probability for each discrete scenario of one variable given a combination of a discrete scenario for each of more than one other variable. One example of a continuous assertion is a conditional probability function such as “P(X|Y, Z) = normal(mean=a+b*Y+c*Z, standard deviation=r)”.
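As a minimal sketch of how a Probability Table and a Conditional Probability Table combine, the following Python example computes a joint distribution and then a marginal distribution for a variable whose scenarios were not asserted directly. The variable names, scenario names, and probabilities are hypothetical and are not taken from FIG. 3 or FIG. 4; exhaustive enumeration is used only because the example is tiny, and practical inference methods differ.

```python
# Probability Table for variable Y (hypothetical scenarios and probabilities).
p_y = {"new CPU launched": 0.3, "no new CPU": 0.7}

# Conditional Probability Table for X given Y (hypothetical values).
# Each row is a scenario of Y; each entry is P(X = x | Y = y).
p_x_given_y = {
    "new CPU launched": {"high sales": 0.6, "low sales": 0.4},
    "no new CPU": {"high sales": 0.2, "low sales": 0.8},
}


def joint_distribution(p_y, p_x_given_y):
    """Return P(X=x, Y=y) = P(Y=y) * P(X=x | Y=y) for every combination."""
    return {
        (x, y): p_y[y] * p_x
        for y, row in p_x_given_y.items()
        for x, p_x in row.items()
    }


def marginal_x(joint):
    """Sum the joint distribution over Y to obtain P(X)."""
    marginal = {}
    for (x, _y), p in joint.items():
        marginal[x] = marginal.get(x, 0.0) + p
    return marginal


joint = joint_distribution(p_y, p_x_given_y)
p_x = marginal_x(joint)
for scenario, probability in p_x.items():
    print(f"P(X = {scenario}) = {probability:.2f}")
```

Here the distribution for X is inferred from two assertions (scenarios for Y and a factor relating X to Y) rather than asserted directly, and changing the asserted probabilities for Y shows how sensitive the inferred distribution for X is to them.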
Often, a PGM will represent multiple variables that each have different scenarios but have the same factor to other variables. To reduce the need to assert redundantly the same factor to these other variables, the PGM may combine multiple variables into one “template variable” by “indexing” the template variable along one or more “indexes” or “dimensions.” For example, instead of representing the 12 variables and 4 factors in FIG. 5, the PGM may introduce “product” and “time” indexes in order to reduce the number of assertions from 12 to 9, the number of variables from 12 to 3, and the number of factors from 4 to 1, as illustrated in FIG. 6.
A variable that has a factor with template variables can “inherit” the indexes of those template variables when its scenarios are inferred from the asserted factor and scenarios in the PGM. For example, “Company B Revenue” in FIG. 6 can inherit the product and time indexes when its factor is combined with the asserted scenarios for the template variables “Company B unit sales” and “Company B ASP”, so after probabilistic inference it may get 4 values even though it has only 1 factor. Such “index inheritance” can cascade transitively through multiple variables, and a variable can get different indexes from different variables in its factor.
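Index inheritance can be illustrated with a short Python sketch. It assumes two products and two time periods, and it uses a single point value per index combination instead of a probability distribution to keep the example small. All numbers, the helper name revenue_factor, and the choice of a multiplication factor are illustrative assumptions and are not taken from FIG. 6.

```python
from itertools import product

# Indexes (dimensions) shared by the template variables (hypothetical members).
products = ["Desktop PC", "Notebook PC"]
times = [2015, 2016]

# Template variables: one asserted value per (product, time) combination.
# All numbers are hypothetical.
unit_sales = {
    ("Desktop PC", 2015): 1000,
    ("Desktop PC", 2016): 900,
    ("Notebook PC", 2015): 1500,
    ("Notebook PC", 2016): 1800,
}
asp = {  # average selling price
    ("Desktop PC", 2015): 500,
    ("Desktop PC", 2016): 480,
    ("Notebook PC", 2015): 800,
    ("Notebook PC", 2016): 750,
}


def revenue_factor(units, price):
    """The single factor asserted once for the revenue variable."""
    return units * price


# Revenue is defined by one factor but inherits both indexes from its inputs,
# so it ends up with one value per (product, time) combination.
revenue = {
    index: revenue_factor(unit_sales[index], asp[index])
    for index in product(products, times)
}

for (prod, year), value in revenue.items():
    print(f"Revenue[{prod}, {year}] = {value}")
```

The factor is written once, yet the inferred variable has four values, one for each combination of the two inherited indexes.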
To define a PGM, one can use a “PGM authoring tool”, 70 of which were listed by Professor Kevin Murphy at the University of British Columbia (Murphy, Kevin. (2014). Software Packages for Graphical Models. Available at http://www.cs.ubc.ca/~murphyk/Software/bnsoft.html.) These tools typically provide a graphical representation of nodes and edges, like in FIGS. 2 and 3, as well as interfaces to (i) define for each variable either scenarios or a factor to other variables, as described above, and (ii) explore scenarios that are inferred for variables whose scenarios were not asserted explicitly. Some of these tools provide interfaces to define template variables as described above, and to utilize those template variables to perform probabilistic inference over multiple dimensions.
However, PGM authoring tools are not used nearly as commonly as spreadsheets, and that may be because all of the PGM authoring tools pose the following issues that create challenges in authoring a PGM.
Issue 1:
The tools don't provide a way to search and browse variables by their similar attributes, so when a model gets big and the nodes and arrows in FIG. 1 and FIG. 2 start looking like spaghetti, one can miss the existence of a variable and end up creating a redundant variable, not only losing the work one did on the original variable but also missing inference opportunities and creating inconsistencies.
Issue 2:
Since PGMs are garbage-in, garbage-out like any model, it can be very useful to document evidence that supports each of the assertions that define a PGM and to distinguish between assertions made with more confidence and with less confidence. But the PGM authoring tools provide little support for either. Some provide the ability to attach a “note” to each variable, but it is difficult to (i) organize detailed supporting evidence accumulated over time for each variable, and (ii) compare supporting evidence across multiple variables.
Issue 3:
One way to reduce the spaghetti in a large PGM is to enable the author to hide some variables and relationships within others and to expand those others only when ready to dig into more detail.
Some tools provide means to group variables in “modules” that hide some detail. But this approach has at least two limitations: (i) Modules are a static part of a PGM design, without a means to collapse and expand trees of variables and relationships dynamically to compare assertions across multiple variables, and (ii) each variable can only belong to one module even if it has some attributes in common with variables in multiple modules.
Issue 4:
PGM inference algorithms typically require variable identifiers that are short and that do not contain spaces, slashes, colons, apostrophes, quotation marks, or other special characters. PGM authoring tools sometimes provide users the ability to add to each variable a label that does not have these restrictions and is therefore more human-readable. However, when the tools show factors involving multiple variables, they use the identifiers instead of the labels so that it is clear where each variable starts and ends, and those identifiers can make it difficult to read the factors.
Issue 5:
Template variables can reduce the spaghetti by showing only nodes and arrows for the template variables and not for all of their variations across all of their dimensions. But sometimes one would like to browse scenarios involving relationships between only some variations of various template variables, such as only those with product Desktop PC and time 2015 in FIG. 6, and these tools do not provide facile ways to do that.
Logical Graphical Models
These issues are largely related to organizing the qualitative assumptions required for people to author PGMs. Databases can help organize assumptions, so we will review opportunities and challenges in addressing these issues with the state of the art in database architectures.
Relational databases are the most commonly-used database architecture. They work well for processing transactions, using the query language SQL to organize data from columns in multiple separate tables. But organizing the assumptions in PGMs requires organizing nodes from a network of edges, which is more like Facebook's Social Graph than like a list of transactions. In a relational database architecture, this task requires “many-to-many joins”, which require creating “junction tables”, and it requires writing SQL queries that are recursive across these junction tables, making these queries complex to write and slow to execute.
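To make the relational approach concrete, the following Python sketch uses the standard sqlite3 module to store a small network in a node table and a junction table, and then uses a recursive SQL query to collect every node reachable from a starting node. The table names, column names, and the particular chain of links are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- Junction table: each row links one node to another (many-to-many).
    CREATE TABLE edge (
        subject_id INTEGER REFERENCES node(id),
        object_id  INTEGER REFERENCES node(id)
    );
    """
)

# Hypothetical chain of links between nodes.
names = ["Notebook PC", "PC", "Electronic device", "Asset"]
links = [
    ("Notebook PC", "PC"),
    ("PC", "Electronic device"),
    ("Electronic device", "Asset"),
]
conn.executemany("INSERT INTO node (name) VALUES (?)", [(n,) for n in names])
conn.executemany(
    """
    INSERT INTO edge (subject_id, object_id)
    SELECT s.id, o.id FROM node s, node o WHERE s.name = ? AND o.name = ?
    """,
    links,
)

# Recursive query: start from the direct links of one node, then keep
# joining the junction table against the rows found so far.
REACHABLE_SQL = """
WITH RECURSIVE reachable(id) AS (
    SELECT e.object_id
    FROM edge e JOIN node n ON n.id = e.subject_id
    WHERE n.name = ?
    UNION
    SELECT e.object_id
    FROM edge e JOIN reachable r ON r.id = e.subject_id
)
SELECT name FROM node WHERE id IN (SELECT id FROM reachable) ORDER BY name
"""

reachable = [row[0] for row in conn.execute(REACHABLE_SQL, ("Notebook PC",))]
print(reachable)
```

Even for a three-link chain, the query needs two joins and a recursive common table expression, which hints at how such queries grow in complexity as the network and the questions asked of it grow.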
A “graph database” is a less commonly-used but increasingly popular kind of “NoSQL” database architecture that uses a graph query language, like the World Wide Web Consortium (“W3C”) standard SPARQL (Prud'hommeaux, E. & Seaborne, A. (2007). SPARQL Query Language for RDF: W3C Candidate Recommendation 14 Jun. 2007. Available at http://www.w3.org/TR/rdf-sparql-query/) or the proprietary Cypher (De Marzi, M. (2012). Cypher Query Language. Chicago Graph Database Meet-Up) or GraphQL (He, H. & Singh, A. (2008). Graphs-at-a-time: query language and access methods for graph databases. Proceedings of the 2008 ACM SIGMOD international conference on management of data), to traverse edges without junction tables or recursion, enabling queries that are simple and fast for data structured as a “graph,” such as the one depicted in FIG. 7 as a visualization of nodes in black and edges in color.
The graph structure shown in FIG. 7 enables queries to traverse multiple edges in a graph to compile groups of related nodes, such as the Transactions whose product is an Electronic device. If a graph is structured more formally as a logical graphical model (“LGM”), also known as an “ontology,” then it can also enable “logical inference,” wherein relationship assertions, such as “Notebook PC is a PC” and “PC made with CPU,” enable the system to logically infer additional relationships, such as “Notebook PC made with CPU.” Then if a user changes the assertion that a PC is made with a CPU, for example, the system can automatically change the inference that a notebook PC is made with a CPU. This reduces redundant effort, which can be useful for maintaining a graph over time as the relationships between its nodes change.
Graphs may be described using the W3C standard Resource Description Framework (“RDF”) terminology (Klyne, G. & Carroll, J. (2004). Resource Description Framework (RDF): Concepts and Abstract Syntax: W3C Candidate Recommendation 10 Feb. 2004. Available at http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-property), referring to each link in a graph as a “triple” with three parts: a “subject” node, an “object” node, and a “predicate” edge linking the subject and object. For example, in the link “PC made with CPU” in the graph in FIG. 7, “PC” is the subject, “made with” is the predicate, and “CPU” is the object. Alternative terms for predicate include “arc”, “edge”, “line”, “link”, and others. Alternative terms for subject and object include “node”, “point”, “vertex”, and others.
For each node in FIG. 7, an “attribute” is the predicate and object of each triple whose subject is that node, so for a node like “Company B buys $1B of CPUs from Company A in 2010” in the graph in FIG. 7, we describe its attributes as “is a Transaction”, “time 2010”, “seller Company A”, “product CPU”, “buyer Company B”, and “revenue $1B”. In some contexts, others describe an attribute by using the term “property”.
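The triple and attribute terminology can be sketched in Python as follows. The Triple type and the attributes helper are hypothetical names introduced for illustration. The triples are those named in the text above, not a complete transcription of FIG. 7.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One link in the graph: a subject node, a predicate edge, an object node."""
    subject: str
    predicate: str
    obj: str


TX = "Company B buys $1B of CPUs from Company A in 2010"

triples = [
    Triple(TX, "is a", "Transaction"),
    Triple(TX, "time", "2010"),
    Triple(TX, "seller", "Company A"),
    Triple(TX, "product", "CPU"),
    Triple(TX, "buyer", "Company B"),
    Triple(TX, "revenue", "$1B"),
    Triple("PC", "made with", "CPU"),
]


def attributes(node: str, graph: List[Triple]) -> List[Tuple[str, str]]:
    """Return the (predicate, object) pair of every triple whose subject is node."""
    return [(t.predicate, t.obj) for t in graph if t.subject == node]


for predicate, obj in attributes(TX, triples):
    print(f"{predicate} {obj}")
```

Each printed line corresponds to one attribute of the transaction node, such as “seller Company A”.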
But consistent with RDF, the term “property” is used to refer to the kind of relationship represented by each predicate. For example, the subsumption relation in set theory may be represented by a property called “is a”, and the parthood relation in mereology may be represented by a property called “made with”, so in the graph in FIG. 7, the two predicates in the two triples “PC made with CPU” and “Semiconductor made with Chip tester” are both instances of one property called “made with”.
If one supplements a graph with a formal vocabulary, such as supplementing RDF with the W3C standard OWL 2 Web Ontology Language (“OWL 2”) (Motik, B. et al. (2009). OWL 2 Web Ontology Language Profiles: W3C Proposed Recommendation 27 Oct. 2009. Available at http://www.w3.org/TR/2009/REC-owl2-profiles-20091027/), then the graph becomes a logical graphical model (“LGM”), or an “ontology”, and certain properties like the subsumption relation enable one to “infer” additional attributes without stating them in the graph. Inferred attributes are denoted by dotted arrows in the graph in FIG. 7. For example, in that graph, one can infer that “Notebook PC” has the attribute “made with CPU” because it has the attribute “is a PC” and “PC” has the attribute “made with CPU”. If one uses the formal vocabulary to describe the “made with” property as “transitive”, then one can traverse the graph, combining successive predicates in that property to infer that “Notebook PC” also has the attribute “made with Chip tester”. And if one describes the “made with” property as “reflexive”, then it will relate every node to itself, and one can infer that “Notebook PC” also has the attribute “made with Notebook PC”. Like probabilistic inference, logical inference can be very challenging computationally, and in practice graph authors must consider tradeoffs in graph design and in choice of inference methods.
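The following Python sketch illustrates this kind of logical inference with three simplified forward-chaining rules: “is a” is transitive, “made with” is transitive, and a node inherits the “made with” attributes of anything it “is a”. It is a toy illustration rather than an OWL 2 reasoner, it omits the reflexive case, and the triple “CPU is a Semiconductor” is an assumption added so that the chain to “Chip tester” can be completed.

```python
def infer(asserted):
    """Apply three simplified rules until no new triple appears.

    Returns only the inferred triples (those not in the asserted set).
    """
    asserted = set(asserted)
    triples = set(asserted)
    while True:
        new = set()
        for (a, p1, b) in triples:
            for (b2, p2, c) in triples:
                if b != b2:
                    continue
                if p1 == "is a" and p2 == "is a":
                    new.add((a, "is a", c))        # "is a" is transitive
                elif p1 == "is a" and p2 == "made with":
                    new.add((a, "made with", c))   # inherit via "is a"
                elif p1 == "made with" and p2 == "made with":
                    new.add((a, "made with", c))   # "made with" is transitive
        if new <= triples:
            return triples - asserted
        triples |= new


asserted = {
    ("Notebook PC", "is a", "PC"),
    ("PC", "made with", "CPU"),
    ("CPU", "is a", "Semiconductor"),  # assumption, not stated in the text
    ("Semiconductor", "made with", "Chip tester"),
}

inferred = infer(asserted)
for triple in sorted(inferred):
    print(" ".join(triple))

# Withdrawing one assertion automatically withdraws what depended on it.
without_cpu = infer(asserted - {("PC", "made with", "CPU")})
```

Because the inferred triples are recomputed from the assertions, removing “PC made with CPU” also removes “Notebook PC made with CPU” without any separate edit.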
A “property path” may specify a combination of different properties that may connect nodes in a graph transitively across more than one triple, and in this document, we use a colon to separate properties in a property path. For example, in the graph in FIG. 7, if the property “is a” is transitive, then the property path “product:is a” connects all three Transactions as subjects to “Asset” as object. The SPARQL 1.1 graph query language supports queries across not only properties but also property paths, using a forward slash where this document uses a colon, and the meaning is the same (Harris, S. & Seaborne, A. (2013). SPARQL 1.1 Query Language: W3C Recommendation 21 Mar. 2013. Available at http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#propertypaths.)
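A property path written with colons can be evaluated with a short traversal, sketched below in Python. The transaction names T1 to T3, the product hierarchy, and the helper names step and follow_path are hypothetical stand-ins for the content of FIG. 7. A property listed as transitive is followed for one or more hops, and any other property is followed for exactly one hop.

```python
def step(nodes, prop, triples, transitive):
    """Follow one property from a set of nodes; repeat hops if it is transitive."""
    reached = set()
    frontier = set(nodes)
    while frontier:
        found = {o for (s, p, o) in triples if p == prop and s in frontier}
        found -= reached            # avoid revisiting nodes (handles cycles)
        reached |= found
        frontier = found if prop in transitive else set()
    return reached


def follow_path(start, path, triples, transitive=frozenset()):
    """Return every node reachable from start along a colon-separated path."""
    nodes = {start}
    for prop in path.split(":"):
        nodes = step(nodes, prop, triples, transitive)
    return nodes


# Hypothetical graph: three transactions and a small product hierarchy.
triples = {
    ("T1", "product", "CPU"),
    ("T2", "product", "PC"),
    ("T3", "product", "Chip tester"),
    ("CPU", "is a", "Electronic device"),
    ("PC", "is a", "Electronic device"),
    ("Electronic device", "is a", "Asset"),
    ("Chip tester", "is a", "Asset"),
}

subjects = [
    t for t in ("T1", "T2", "T3")
    if "Asset" in follow_path(t, "product:is a", triples, transitive={"is a"})
]
print(subjects)
```

With “is a” treated as transitive, all three hypothetical transactions connect to “Asset” through the path “product:is a”; without transitivity, only the transaction whose product is directly an Asset would.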
A “cardinality” of a property for a given node may describe the number of attributes that node has with that same property. In the graph in FIG. 7, the node “PC” has cardinality 2 in the “made with” property. We refer to cardinality above 1 as “higher cardinality”.
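Cardinality as defined here is a simple count, as the following Python sketch shows. The second “made with” attribute of “PC” is a hypothetical placeholder, because the text above does not name it.

```python
def cardinality(node, prop, triples):
    """Count the attributes of node whose predicate is an instance of prop."""
    return sum(1 for (s, p, _o) in triples if s == node and p == prop)


triples = [
    ("PC", "made with", "CPU"),
    ("PC", "made with", "Hard drive"),  # hypothetical second attribute
    ("Notebook PC", "is a", "PC"),
]

print(cardinality("PC", "made with", triples))
print(cardinality("Notebook PC", "is a", triples))
```

In this sketch “PC” has higher cardinality in the “made with” property, while “Notebook PC” has cardinality 1 in the “is a” property.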
The “arity” may describe the number of different nodes in a relationship. A triple describes a relationship between 2 nodes, so it has arity of 2 and can be described as a “binary relationship.” But one may wish to examine a relationship between more than 2 nodes, such as between the 5 nodes “2010”, “Company A”, “Company B”, “CPU”, and “$1B” in the graph in FIG. 7, because these nodes are the objects of the attributes of “Company B buys $1B of CPUs from Company A in 2010”. The relationships between more than 2 nodes may be known as “higher-arity relationships”. These higher-arity relationships can be useful for making comparisons, but they are difficult to assert concisely in a visualization of nodes and edges.
To assert logical relationships in computer-readable formats like RDF or OWL, a person can use an “authoring tool”, such as Protégé (available at http://protege.stanford.edu/) or the “tabular graph editor” described in U.S. patent application Ser. No. 14/203,472, among others. Such a tool can help one to assert these logical relationships and to explore both these asserted relationships and additional relationships inferred from them. But because LGMs support only logical inference and not probabilistic inference, these tools do not help one make assertions that any relationships are probabilistic—that they have some probability of each of numerous scenarios—nor to infer other probabilistic relationships from such assertions. So at present, they are not used for the purposes for which PGMs and PGM authoring tools are used.
Probabilistic Logic Networks
Some of these issues with PGM authoring tools could be addressed by some fascinating work on adding probabilistic assertions to logical graphical models, creating what some have termed “probabilistic logic networks”. Some examples include Probabilistic Logic (described, for example, at http://en.wikipedia.org/wiki/Probabilistic_logic), Probabilistic Logic Networks (described at http://en.wikipedia.org/wiki/Probabilistic_Logic_Network), Tractable Markov Logic (described in Domingos, P. & Webb, W. (2012). A Tractable First-Order Probabilistic Logic. University of Washington), and PR-OWL (PR-OWL: A Bayesian extension to the OWL Ontology Language. Available at http://www.pr-owl.org/).
This approach sounds promising because it can “combine the capacity of probability theory to handle uncertainty with the capacity of deductive logic to exploit structure.” However, this approach requires inference that considers logical and probabilistic relationships together, creating performance challenges that are considerably greater than the already considerable performance challenges of logical inference and probabilistic inference on their own. As a result, these approaches remain confined to research labs, where several of them have been dormant for several years.