The present invention relates to data processing by digital computer, and more particularly to scalable ontology reasoning.
In recent years the development of ontologies—explicit formal specifications of the terms in the domain and relations among them—has been moving from the realm of Artificial-Intelligence laboratories to the desktops of domain experts. Ontologies have become common on the World-Wide Web. The ontologies on the Web range from large taxonomies categorizing Web sites (such as on Yahoo!) to categorizations of products for sale and their features (such as on Amazon).
An ontology defines a common vocabulary for researchers who need to share information in a domain. It includes machine-interpretable definitions of basic concepts in the domain and relations among them. Description Logic (DL) provides the theoretical foundation for the Web Ontology Language (OWL) used for semantic web ontologies. A DL ontology can be divided conceptually into three components: the Tbox, the Rbox and the Abox. The Tbox contains assertions about concepts, such as subsumption (Man ⊑ Person) and equivalence (Man ≡ MaleHuman). The Rbox contains assertions about roles and role hierarchies (hasSon ⊑ hasChild). The Abox contains role assertions between individuals (hasChild(John, Mary)) and membership assertions (John: Man). All standard reasoning tasks in expressive DL ontologies, such as query answering, reduce to consistency detection. As an example, a standard approach to testing whether John is a member of the concept Man is to add the negated assertion (John: ¬Man) and use the tableau algorithm to test whether the Abox becomes inconsistent; if it does, John is a member of Man. A challenge is that consistency detection in expressive DL is well known to be intractable in the worst case. Given that the size of an Abox may be on the order of millions of assertions, this complexity poses a serious problem for the practical use of DL ontologies, which often reside in frequently updated transactional databases. Although highly optimized DL tableau algorithms exist, they cannot be easily adapted to Aboxes in secondary storage, especially frequently changing Aboxes. One approach that has been applied to reasoning on Aboxes in secondary storage is to convert DL to disjunctive datalog and use deductive databases to reason over the Abox. It is desirable to provide a method and apparatus to simplify an ontology and to provide reasoning and query processing on the simplified ontology.
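The reduction of membership testing to consistency detection described above can be illustrated with a minimal sketch. It is not the tableau algorithm and is not part of the described method: it assumes a toy fragment limited to atomic concepts, atomic subsumptions, and negated atomic membership assertions, and omits the Rbox and role assertions. The function names `superconcepts`, `is_consistent`, and `is_member`, and the tuple encoding of assertions, are hypothetical choices made for illustration only.

```python
def superconcepts(name, tbox):
    """Return `name` plus every concept reachable through Tbox inclusions."""
    seen, stack = {name}, [name]
    while stack:
        current = stack.pop()
        for sub, sup in tbox:
            if sub == current and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def is_consistent(tbox, abox):
    """Toy consistency check (assumption: atomic concepts only).

    `abox` is a set of (individual, concept, is_positive) triples.
    The Abox is inconsistent if some individual is asserted or inferred
    to belong to a concept and is also asserted not to belong to it.
    """
    positives = set()
    for individual, concept, is_positive in abox:
        if is_positive:
            for sup in superconcepts(concept, tbox):
                positives.add((individual, sup))
    return not any(
        (individual, concept) in positives
        for individual, concept, is_positive in abox
        if not is_positive
    )


def is_member(tbox, abox, individual, concept):
    """Membership by refutation: add (individual: not concept) and test."""
    negated = (individual, concept, False)
    return not is_consistent(tbox, abox | {negated})


# Tbox: Man subsumed by Person; Man equivalent to MaleHuman,
# with the equivalence encoded as two inclusions.
tbox = {("Man", "Person"), ("Man", "MaleHuman"), ("MaleHuman", "Man")}
# Abox: the membership assertion John: Man.
abox = {("John", "Man", True)}

john_is_person = is_member(tbox, abox, "John", "Person")
mary_is_person = is_member(tbox, abox, "Mary", "Person")
```

In this sketch `john_is_person` is true because adding (John: ¬Person) clashes with the inferred membership of John in Person, while `mary_is_person` is false because nothing in the Abox forces it. Each query rescans the Abox in full, which hints at why the approach becomes costly for Aboxes holding millions of assertions.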