In some data models, such as the Resource Description Framework (RDF), data is represented as triples, each having a subject, a predicate, and an object. In this model, data is a set of logical facts in which the subject and object are each some entity, and the predicate is a relation that holds between those entities. For example, given a list of people (the entities), predicates could be used to define family relations among the people, e.g., “Alan is a brother of Bob,” “Bob is a father of Charlie,” etc. These relations could be written in a predicate/argument format, such as that used in Prolog or Datalog, e.g., “Brother(Alan,Bob)”, “Father(Bob,Charlie)”, etc. In fact, many systems that store data in this manner are coupled to, or incorporate, logic programming languages, and can perform sophisticated logical reasoning on the facts represented by the triples.
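As a minimal sketch of the triple model described above, the two family facts can be held as subject/predicate/object records and looked up by pattern. The names `Triple`, `FACTS`, and `match` are hypothetical and chosen only for illustration; they do not come from RDF or from any particular library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One logical fact: `predicate` relates `subject` to `obj`."""
    subject: str
    predicate: str
    obj: str


# The two example facts from the text, stored as a set of triples.
FACTS = {
    Triple("Alan", "brother", "Bob"),     # Brother(Alan,Bob)
    Triple("Bob", "father", "Charlie"),   # Father(Bob,Charlie)
}


def match(facts, subject=None, predicate=None, obj=None):
    """Return the facts that agree with every field that is not None.

    A field left as None acts as a wildcard. Results are sorted so the
    output order is deterministic.
    """
    return sorted(
        t for t in facts
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


# All "brother" facts, regardless of who is involved.
brothers = match(FACTS, predicate="brother")
```

Leaving a field unspecified plays the role of a variable in a predicate/argument query, so `match(FACTS, predicate="brother")` corresponds roughly to asking for every pair that satisfies the brother relation.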
In one example, a system implements a semantic query language such as SPARQL (which stands, recursively, for “SPARQL Protocol and RDF Query Language”), which allows the client to specify declarative queries in terms of logical reasoning to be performed on the triples. A logic engine, such as a Datalog engine, may be used to carry out the reasoning. However, the actual information on which the reasoning is to be performed (e.g., the subject/predicate/object triples) may be stored in a relational database. When the client processes a semantic query such as a SPARQL query, it may issue relational queries (e.g., Structured Query Language, or “SQL,” queries) to the relational database to retrieve the appropriate triples, and then may perform logical reasoning on the retrieved triples.
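The retrieval step described above might look like the following sketch, which uses Python's standard `sqlite3` module with an in-memory database. The single-table layout `triples(subject, predicate, object)` and the helpers `make_store` and `fetch_pairs` are assumptions made for illustration; the text does not specify a schema, and no SPARQL parsing is attempted here.

```python
import sqlite3


def make_store(triples):
    """Create an in-memory relational store holding the given triples.

    The one-table schema is a hypothetical layout, not one taken from
    the text.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def fetch_pairs(conn, predicate):
    """Issue one relational query and return (subject, object) pairs."""
    cursor = conn.execute(
        "SELECT subject, object FROM triples "
        "WHERE predicate = ? ORDER BY subject, object",
        (predicate,),
    )
    return cursor.fetchall()


conn = make_store([
    ("Alan", "brother", "Bob"),
    ("Bob", "father", "Charlie"),
])

# The client retrieves each relation it needs with a separate SQL query;
# any logical reasoning would then run on these retrieved rows.
brother_pairs = fetch_pairs(conn, "brother")
father_pairs = fetch_pairs(conn, "father")
```

In this arrangement the database only filters rows by predicate, and everything beyond that is left to the client.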
Since the client includes a logic engine, such as a Datalog engine, logical rules may be defined, and base facts (e.g., triples that are stored in a database) may be used as part of the rule definition. For example, even if there are no underlying “uncle” triples in the database, one might define a rule such as “Uncle(A,B):-Brother(A,C), Father(C,B)”. (In other words, “A is the uncle of B, if A is the brother of C and C is the father of B.”) A client that responds to semantic queries (e.g., a SPARQL engine) could issue relational queries to obtain the underlying Brother and Father relations from the database, and could then perform the appropriate reasoning on the obtained information. However, some naïve implementations of the semantic query engine neglect certain opportunities to optimize use of the underlying relational database and its relational query processor.
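To make the uncle rule concrete, the sketch below evaluates it in the client over two relations that have already been retrieved. The function name `derive_uncles` and the pair-list representation are hypothetical; a real Datalog engine would evaluate rules in a more general way.

```python
def derive_uncles(brother_pairs, father_pairs):
    """Apply the rule: A is the uncle of B if A is the brother of C
    and C is the father of B.

    brother_pairs: iterable of (A, C) pairs, meaning A is a brother of C.
    father_pairs:  iterable of (C, B) pairs, meaning C is a father of B.
    Returns a sorted list of derived (A, B) uncle pairs.
    """
    # Index the father relation by the father, so that each brother
    # fact can be joined on the shared variable C.
    children_of = {}
    for father, child in father_pairs:
        children_of.setdefault(father, []).append(child)

    uncles = set()
    for a, c in brother_pairs:
        for b in children_of.get(c, []):
            uncles.add((a, b))
    return sorted(uncles)


# Base facts as they might arrive from the database.
uncles = derive_uncles(
    brother_pairs=[("Alan", "Bob")],
    father_pairs=[("Bob", "Charlie")],
)
```

Both relations are passed in whole, as they would be if each were fetched by its own relational query and the join on C were then carried out in the client.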