Semantic data models allow relationships between resources to be modeled as facts. The facts are often represented as triples that have a subject, a predicate, and an object. For example, one triple may have the subject of “John Smith,” the predicate of “ISA,” and the object of “physician,” which may be represented as
<John Smith, ISA, physician>.
This triple represents the fact that John Smith is a physician. Other triples may be
<John Smith, graduate of, University of Washington>
representing the fact that John Smith graduated from the University of Washington and
<John Smith, degree, MD>
representing the fact that John Smith has an MD degree. Semantic data models can be used to model the relationships between any type of resource, such as web pages, people, companies, products, meetings, and so on. One semantic data model, referred to as the Resource Description Framework (“RDF”), has been developed by the World Wide Web Consortium (“W3C”) to model web resources, but can be used to model any type of resource. The triples of a semantic data model may be stored in a semantic database.
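As a minimal sketch of how such triples might be held in memory, the example facts above can be stored as a set of three-field records. The names `Triple`, `facts`, and `objects_of` are hypothetical and chosen only for illustration; an actual semantic database would use its own storage format and query interface.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: a subject, a predicate, and an object (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


# The three example facts about John Smith from the text.
facts = {
    Triple("John Smith", "ISA", "physician"),
    Triple("John Smith", "graduate of", "University of Washington"),
    Triple("John Smith", "degree", "MD"),
}


def objects_of(facts, subject, predicate):
    """Return every object stored for the given subject and predicate."""
    return {t.obj for t in facts
            if t.subject == subject and t.predicate == predicate}


print(objects_of(facts, "John Smith", "degree"))
```

A set is used here so that adding the same fact twice has no effect, which is convenient once inferred facts are added to the collection.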
Semantic data models may allow additional facts to be inferred from the existing facts based on rules that define the inferences that may be made. For example, a rule may be that if a subject has an MD degree, then an inference can be made that the subject is a physician. This rule may be represented by an if-then statement as follows:
if (<?subject, degree, MD>) then <?subject, ISA, physician>.
The <?subject, degree, MD> is a condition that specifies the existing triples with a predicate of degree and an object of MD. The <?subject, ISA, physician> is the inference that can be made when an existing triple matches the condition of the rule. The “?” in “?subject” indicates that “?subject” is a variable to be given the value from the matching triple. If this rule is applied to the example triples described above, then because the fact <John Smith, degree, MD> matches the condition of the rule, the fact <John Smith, ISA, physician> can be inferred.
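The matching step described above can be sketched in a few lines of Python. This is an illustrative sketch only, assuming triples are plain tuples of strings and that any term beginning with “?” is a variable; the helper names `match`, `substitute`, and `apply_rule` are hypothetical.

```python
def match(pattern, triple):
    """Return the variable bindings if triple matches pattern, else None."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must take the same value everywhere it appears.
            if bindings.get(term, value) != value:
                return None
            bindings[term] = value
        elif term != value:
            return None
    return bindings


def substitute(pattern, bindings):
    """Replace each variable in pattern with its bound value."""
    return tuple(bindings.get(term, term) for term in pattern)


def apply_rule(condition, inference, facts):
    """Infer one new triple for every existing triple matching condition."""
    inferred = set()
    for triple in facts:
        bindings = match(condition, triple)
        if bindings is not None:
            inferred.add(substitute(inference, bindings))
    return inferred


facts = {
    ("John Smith", "graduate of", "University of Washington"),
    ("John Smith", "degree", "MD"),
}
new_facts = apply_rule(("?subject", "degree", "MD"),
                       ("?subject", "ISA", "physician"),
                       facts)
print(new_facts)  # {('John Smith', 'ISA', 'physician')}
```

Only the triple with the predicate degree and the object MD matches the condition, so “?subject” is bound to John Smith and the single fact <John Smith, ISA, physician> is inferred.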
The rules for inferring facts need not be limited to a single condition or a single inference as in this example rule, but can have multiple conditions and multiple inferences. The following is an example of a rule with multiple conditions and multiple inferences:
if (<?subject, degree, MD> <?subject, licensed in, ?object> <?object, state of, USA>) then <?subject, ISA, physician> <?subject, member of, AMA> <?object, licenses, physicians>.
This multiple condition rule is satisfied when an existing fact matches each condition. In this example, the conditions are satisfied when a first triple has a predicate of degree and object of MD, when the subject of that triple is also in a second triple as a subject with a predicate of licensed in, and the object of the second triple is in a third triple as a subject with a predicate of state of and an object of USA. If the existing facts include:
<John Smith, degree, MD>
<John Smith, licensed in, Washington>
<Washington, state of, USA>
<John Smith, licensed in, Oregon>
<Oregon, state of, USA>
then the following facts can be inferred from this rule:
<John Smith, ISA, physician>
<John Smith, member of, AMA>
<Washington, licenses, physicians>
<Oregon, licenses, physicians>.
Since John Smith is licensed in two different states, two different sets of three triples match the conditions of the rule. The process of applying rules to existing triples is a transitive process because when an inferred fact is added to the collection additional facts may be inferred. The W3C has defined an RDF schema (“RDFS”) that can be used to define the rules for inferring facts. Examples of rules defined using RDFS are described in a paper by Goodman and Mizell (Goodman, E. and Mizell, D., “Scalable In-memory RDFS Closure on Billions of Triples,” The 6th International Workshop on Scalable Semantic Web Knowledge Base Systems, November 2010, p. 17-31), which is hereby incorporated by reference.
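The multiple-condition rule and the repeated application of rules can be illustrated with a naive Python sketch. It assumes triples are tuples of strings and that terms beginning with “?” are variables; the names `unify`, `solutions`, and `closure` are hypothetical. This is a simple nested-loop join run to a fixed point, not the approach of the Goodman and Mizell paper, and it would not scale to large collections.

```python
def unify(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # setdefault binds an unbound variable, else returns its value.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def solutions(conditions, facts, bindings=None):
    """Yield every set of bindings that satisfies all of the conditions."""
    bindings = {} if bindings is None else bindings
    if not conditions:
        yield bindings
        return
    first, rest = conditions[0], conditions[1:]
    for triple in facts:
        extended = unify(first, triple, bindings)
        if extended is not None:
            yield from solutions(rest, facts, extended)


def closure(rules, facts):
    """Apply the rules repeatedly until no new fact can be inferred."""
    known = set(facts)
    while True:
        new = set()
        for conditions, inferences in rules:
            for b in solutions(conditions, known):
                for inference in inferences:
                    new.add(tuple(b.get(t, t) for t in inference))
        new -= known
        if not new:
            return known
        known |= new  # inferred facts may enable further inferences


rule = (
    [("?subject", "degree", "MD"),
     ("?subject", "licensed in", "?object"),
     ("?object", "state of", "USA")],
    [("?subject", "ISA", "physician"),
     ("?subject", "member of", "AMA"),
     ("?object", "licenses", "physicians")],
)
facts = {
    ("John Smith", "degree", "MD"),
    ("John Smith", "licensed in", "Washington"),
    ("Washington", "state of", "USA"),
    ("John Smith", "licensed in", "Oregon"),
    ("Oregon", "state of", "USA"),
}
inferred = closure([rule], facts) - facts
for triple in sorted(inferred):
    print(triple)
```

The two license triples produce two sets of bindings, one with “?object” bound to Washington and one with it bound to Oregon. Both yield the same two facts about John Smith, so the set of inferred facts contains four triples rather than six.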
Current collections of triples can contain billions of triples. Because of the large size of the collections, the inferring of facts by applying rules to the triples can be computationally expensive and very time-consuming. Some attempts have been made to infer facts with a multiprocessor computer system such as the Cray XMT. The Cray XMT has a memory system that can be shared by hundreds and even thousands of multi-threaded processors. Each multi-threaded processor provides hardware support for 128 threads of execution. Aspects of the Cray XMT are described in the Goodman and Mizell paper and in U.S. Pat. No. 6,353,829, entitled “Method and System for Memory Allocation in a Multiprocessing Environment,” which is hereby incorporated by reference.