Ontology is the philosophical study of what exists. In computer science, an ontology is used to model entities of the real world and the relations between them, so as to create common dictionaries for their discussion. Basic concepts of ontology include (i) classes of instances/things, and (ii) relations between the classes, as described hereinbelow. Ontology provides a vocabulary for talking about things that exist.
Instances/Things
There are many kinds of “things” in the world. There are physical things like a car, person, boat, screw and transistor. There are other kinds of things which are not physically connected items or not even physical at all, but may nevertheless be defined. A company, for example, is a largely notional thing, the only physical manifestation of which is its appearance in a list at a registrar of companies. A company may own property and employ people. It has a defined beginning and end to its life.
Other things can be more abstract such as the Homo Sapiens species, which is a concept that does not have a beginning and end as such even if its members do.
Ontological models are used to talk about “things.” An important vocabulary tool is “relations” between things. An ontology model itself does not include the “things,” but introduces class and property symbols which can then be used as a vocabulary for talking about and classifying things.
Properties
Properties are specific associations of things with other things. Properties include: (i) relations between things that are part of each other, for example, between a PC and its flat panel screen; (ii) relations between things that are related through a process such as the process of creating the things, for example, a book and its author; and (iii) relations between things and their measures, for example, a thing and its weight.
Some properties also relate things to fundamental concepts such as natural numbers or strings of characters—for example, the value of a weight in kilograms, or the name of a person.
Properties play a dual role in ontology. On the one hand, individual things are referenced by way of properties, for example, a person by his name, or a book by its title and author. On the other hand, knowledge being shared is often a property of things, too. A thing can be specified by way of some of its properties, in order to query for the values of other of its properties.
Classes
Not all properties are relevant to all things. It is convenient to discuss the source of a property as a “class” of things, also referred to as a frame or, for end-user purposes, as a category. Often sources of several properties coincide, for example, the class Book is the source for both Author and ISBN Number properties.
There is flexibility in the granularity to which classes are defined. Cars is a class. Fiat Cars can also be a class, with a restricted value of a manufacturer property. It may be unnecessary to address this class, however, since Fiat cars may not have special properties of interest that are not common to other cars. In principle, one can define classes as granular as an individual car unit, although an objective of ontology is to define classes that have important properties.
Abstract concepts such as measures, as well as media such as a body of water which cannot maintain its identity after coming into contact with other bodies of water, may be modeled as classes with a quantity property mapping them to real numbers.
In a typical mathematical model, a basic ontology comprises: (i) a set C, the elements of which are called “class symbols;” (ii) for each C ∈ C, a plain language definition of the class C; (iii) a set P, the elements of which are called “property symbols;” (iv) for each P ∈ P: (a) a plain language definition of P; (b) a class symbol called the source of P; and (c) a class symbol called the target of P; and (v) a binary relation, I, on C×C, called the inheritance relation, which is reflexive, transitive and anti-symmetric.
In the ensuing discussion, the terms “class” and “class symbol” are used interchangeably, for purposes of convenience and clarity. Similarly, the terms “property” and “property symbol” are also used interchangeably.
It is apparent to those skilled in the art that if an ontology model is extended to include sets in a class, then a classical mathematical relation on C×D can be considered as a property from C to sets in D.
If I(C1, C2) then C1 is referred to as a subclass of C2, and C2 is referred to as a superclass of C1. Also, C1 is said to inherit from C2.
A distinguished universal class “Being” is typically postulated to be a superclass of all classes in C.
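The model described above can be sketched as a small data structure. The following is a minimal illustration only, under stated assumptions: the names `Ontology`, `PropertySymbol`, `add_class`, `add_property` and `inherits` are hypothetical and do not come from the source, the inheritance relation is stored as direct superclass links and is assumed to be acyclic, and the sample classes are chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PropertySymbol:
    name: str
    definition: str  # plain language definition
    source: str      # class symbol the property is defined on
    target: str      # class symbol of the property's values


@dataclass
class Ontology:
    classes: dict = field(default_factory=dict)     # class symbol -> definition
    properties: dict = field(default_factory=dict)  # property symbol -> PropertySymbol
    parents: dict = field(default_factory=dict)     # class symbol -> direct superclasses

    def add_class(self, name, definition, superclasses=("Being",)):
        self.classes[name] = definition
        # The universal class "Being" has no superclass of its own.
        self.parents[name] = set() if name == "Being" else set(superclasses)

    def add_property(self, name, definition, source, target):
        self.properties[name] = PropertySymbol(name, definition, source, target)

    def inherits(self, c1, c2):
        """I(c1, c2): reflexive, transitive closure of the direct links.

        Assumes the links contain no cycles, which also keeps the
        relation anti-symmetric.
        """
        if c1 == c2:
            return True
        return any(self.inherits(p, c2) for p in self.parents.get(c1, ()))


# Hypothetical sample ontology.
onto = Ontology()
onto.add_class("Being", "The universal class")
onto.add_class("Person", "A human being")
onto.add_class("PublishedWork", "Any work made available to the public")
onto.add_class("Book", "A bound published work", superclasses=("PublishedWork",))
onto.add_property("author", "The person who wrote the work", "Book", "Person")

print(onto.inherits("Book", "Being"))          # Book is a subclass of Being
print(onto.inherits("PublishedWork", "Book"))  # the converse does not hold
```

Storing only the direct superclass links and computing the closure on demand keeps the stored relation small; an implementation that queries inheritance often might instead precompute the closure.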
Variations on an ontology model may include: (i) restrictions of properties to unary properties, these being the most commonly used properties; and (ii) the ability to specify more about properties, such as multiplicity and invertibility.
The notion of a class symbol is conceptual, in that it describes a generic genus for an entire species such as Books, Cars, Companies and People. Specific instances of the species within the genus are referred to as “instances” of the class. Thus “Gone with the Wind” is an instance of a class for books, and “IBM” is an instance of a class for companies. Similarly, the notion of a property symbol is conceptual, in that it serves as a template for actual properties that operate on instances of classes.
Class symbols and property symbols are similar to object-oriented classes in computer programming, such as C++ classes. Classes, along with their members and field variables, defined within a header file, serve as templates for specific class instances used by a programmer. A compiler uses header files to allocate memory for instances of classes, and to enable a programmer to use them. Thus a header file can declare a rectangle class with members left, right, top and bottom. The declarations in the header file do not instantiate actual “rectangle objects,” but serve as templates for rectangles instantiated in a program. Similarly, classes of an ontology serve as templates for instances thereof.
There is, however, a distinction between C++ classes and ontology classes. In programming, classes are templates and they are instantiated to create programming objects. In ontology, classes document common structure but the instances exist in the real world and are not created through the class.
Ontology provides a vocabulary for speaking about instances, even before the instances themselves are identified. A class Book is used to say that an instance “is a Book.” A property Author allows one to create clauses “author of” about an instance. A property Siblings allows one to create statements “are siblings” about instances. Inheritance is used to say, for example, that “every Book is a PublishedWork”. Thus all vocabulary appropriate to PublishedWork can be used for Book.
Once an ontology model is available to provide a vocabulary for talking about instances, the instances themselves can be fit into the vocabulary. For each class symbol, C, all instances which satisfy “is a C” are taken to be the set of instances of C, and this set is denoted B(C). Sets of instances are consistent with inheritance, so that B(C1)⊂B(C2) whenever C1 is a subclass of C2. Property symbols with source C1 and target C2 correspond to properties with source B(C1) and target B(C2). It is noted that if class C1 inherits from class C2 then every instance of C1 is also an instance of C2, and it is therefore known already at the ontology stage that the vocabulary of C2 is applicable to C1.
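The consistency of instance sets with inheritance can be illustrated with a short sketch. It is not taken from the source: the tables `SUPERCLASS` and `DIRECT_INSTANCES`, the functions `is_subclass` and `instances`, and the sample instance "Annual Report 2001" are all hypothetical, and single inheritance is assumed for brevity.

```python
# Hypothetical direct superclass links (child -> parent); None marks the root.
SUPERCLASS = {"Book": "PublishedWork", "PublishedWork": "Being", "Being": None}

# Hypothetical "is a" assertions, each recorded against the most specific class.
DIRECT_INSTANCES = {
    "Book": {"Gone with the Wind"},
    "PublishedWork": {"Annual Report 2001"},
    "Being": set(),
}


def is_subclass(c1, c2):
    """True if c1 equals c2 or inherits from it, directly or indirectly."""
    while c1 is not None:
        if c1 == c2:
            return True
        c1 = SUPERCLASS[c1]
    return False


def instances(c):
    """B(c): every instance of c, including instances of its subclasses."""
    result = set()
    for cls, members in DIRECT_INSTANCES.items():
        if is_subclass(cls, c):
            result |= members
    return result


# B(Book) is contained in B(PublishedWork), which is contained in B(Being).
print(instances("Book") <= instances("PublishedWork") <= instances("Being"))
```

Because `instances` collects members from every subclass, the containment of instance sets follows from the inheritance links and does not have to be asserted separately for each instance.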
Ontology enables creation of a model of multiple classes and a graph of properties therebetween. When a class is defined, its properties are described using handles to related classes. These can in turn be used to look up properties of the related classes, and thus properties of properties can be accessed to any depth.
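Following handles from class to class can be sketched as a walk over a graph of properties. The example below is an illustrative assumption rather than the source's implementation: the `PROPERTIES` table, the `resolve` function and the sample class and property names are hypothetical.

```python
# Hypothetical property graph: class -> {property name: target class}.
PROPERTIES = {
    "Book": {"author": "Person", "publisher": "Company"},
    "Person": {"name": "String", "employer": "Company"},
    "Company": {"name": "String"},
    "String": {},
}


def resolve(start_class, path):
    """Follow a chain of property names from start_class.

    Returns the class reached at the end of the chain, and raises
    KeyError if some step names a property the current class lacks.
    """
    current = start_class
    for prop in path:
        try:
            current = PROPERTIES[current][prop]
        except KeyError:
            raise KeyError(f"class {current!r} has no property {prop!r}") from None
    return current


# "The name of the employer of the author of a book" is three steps deep.
print(resolve("Book", ["author", "employer", "name"]))
```

Each step uses only the target class of the previous step, so a chain of any length can be resolved with the same lookup.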
Provision is made for both classes, also referred to as “simple” classes, and “complex” classes. Generally, complex classes are built up from simpler classes using tags for symbols such as intersection, Cartesian product, set, list and bag. The “intersection” tag is followed by a list of classes or complex classes. The “Cartesian product” tag is also followed by a list of classes or complex classes. The set symbol is used for describing a class comprising subsets of a class, and is followed by a single class or complex class. The list symbol is used for describing a class comprising ordered subsets of a class; namely, finite sequences, and is followed by a single class or complex class. The bag symbol is used for describing unordered finite sequences of a class, namely, subsets that can contain repeated elements, and is followed by a single class or complex class. Thus set[C] describes the class of sets of instances of a class C, list[C] describes the class of lists of instances of class C, and bag[C] describes the class of bags of instances of class C.
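One way to picture complex classes is as tagged expressions built over simple classes. The sketch below is an assumption made for illustration: the types `Simple` and `Complex`, the helpers `make` and `render`, the tag name "product" and the class Person are hypothetical and are not defined by the source.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Simple:
    name: str


@dataclass(frozen=True)
class Complex:
    tag: str  # "intersection", "product", "set", "list" or "bag"
    operands: Tuple["ClassExpr", ...]


ClassExpr = Union[Simple, Complex]

SINGLE_OPERAND_TAGS = {"set", "list", "bag"}       # followed by one class
MULTI_OPERAND_TAGS = {"intersection", "product"}   # followed by a list of classes


def make(tag, *operands):
    """Build a complex class, checking the operand count for the tag."""
    if tag in SINGLE_OPERAND_TAGS:
        if len(operands) != 1:
            raise ValueError(f"{tag} takes exactly one class")
    elif tag in MULTI_OPERAND_TAGS:
        if not operands:
            raise ValueError(f"{tag} takes at least one class")
    else:
        raise ValueError(f"unknown tag {tag!r}")
    return Complex(tag, tuple(operands))


def render(expr):
    """Write a class expression in the set[C] / list[C] / bag[C] notation."""
    if isinstance(expr, Simple):
        return expr.name
    if expr.tag in SINGLE_OPERAND_TAGS:
        return f"{expr.tag}[{render(expr.operands[0])}]"
    separator = " × " if expr.tag == "product" else " ∩ "
    return "(" + separator.join(render(o) for o in expr.operands) + ")"


person = Simple("Person")
print(render(make("set", person)))                    # set[Person]
print(render(make("list", make("bag", person))))      # list[bag[Person]]
print(render(make("product", person, person)))        # (Person × Person)
```

Since operands may themselves be complex classes, expressions nest to any depth.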
In terms of formal mathematics, for a set S, set[S] is $P(S)$, the power set of S; bag[S] is $N^S$, where N is the set of non-negative integers; and list[S] is $\bigcup_{n=0}^{\infty} S^n$. There are natural mappings

$$\mathrm{list}[S] \xrightarrow{\ \phi\ } \mathrm{bag}[S] \xrightarrow{\ \psi\ } \mathrm{set}[S]. \tag{1}$$

Specifically, for a sequence $(s_1, s_2, \ldots, s_n) \in \mathrm{list}[S]$, $\phi(s_1, s_2, \ldots, s_n)$ is the element $f \in \mathrm{bag}[S]$ that is the “frequency histogram” defined by $f(s) = \#\{1 \le i \le n : s_i = s\}$; and for $f \in \mathrm{bag}[S]$, $\psi(f) \in \mathrm{set}[S]$ is the subset of S given by the support of f, namely, $\mathrm{supp}(f) = \{s \in S : f(s) > 0\}$. It is noted that the composite mapping $\psi \circ \phi$ maps the sequence $(s_1, s_2, \ldots, s_n)$ into the set of its elements $\{s_1, s_2, \ldots, s_n\}$. For finite sets S, set[S] is also finite, and bag[S] and list[S] are countably infinite.
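The two natural mappings can be illustrated concretely. This is a minimal sketch, assuming that a bag is represented by Python's `collections.Counter` and a list by a tuple; the function names `phi` and `psi` simply mirror the symbols used above.

```python
from collections import Counter


def phi(sequence):
    """list[S] -> bag[S]: the frequency histogram of the sequence."""
    return Counter(sequence)


def psi(bag):
    """bag[S] -> set[S]: the support of the histogram."""
    return {s for s, count in bag.items() if count > 0}


sequence = ("a", "b", "a", "c", "a")

histogram = phi(sequence)
print(histogram["a"], histogram["b"], histogram["c"])  # 3 1 1

# Applying phi and then psi forgets both order and multiplicity.
print(psi(histogram) == set(sequence))                 # True
```

Each mapping discards information: `phi` forgets the order of the elements, and `psi` forgets how many times each element occurs.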
Provision is also made for one-to-one, or unary, properties, and for one-to-many properties. The target of a one-to-one property is a simple class. Generally, the target of a one-to-many property is a complex class. For example, a one-to-many property named “children” may have a class Person as its source and a complex class set[Person] as its target, and a one-to-many property named “parents” may have a class Person as its source and a complex class Person×Person as its target.
A general reference on ontology systems is Sowa, John F., “Knowledge Representation,” Brooks/Cole, Pacific Grove, Calif., 2000.
Relational database schema (RDBS) are used to define templates for organizing data into tables and fields. SQL queries are used to populate tables from existing tables, generally by using table join operations. Extensible markup language (XML) schema are used to describe documents for organizing data into a hierarchy of elements and attributes. XSLT scripts are used to generate XML documents from existing documents, generally by importing data between tags in the existing documents. XSLT was originally developed in order to generate HTML pages from XML documents.
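Populating a table from existing tables by way of a join can be illustrated with an in-memory SQLite database. The sketch is hypothetical throughout: the tables `passengers`, `bookings` and `manifest`, their columns and the sample rows are invented for illustration and do not come from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Two existing tables (hypothetical schema).
    CREATE TABLE passengers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bookings (passenger_id INTEGER, flight TEXT);

    -- The table to be populated.
    CREATE TABLE manifest (flight TEXT, passenger_name TEXT);

    INSERT INTO passengers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO bookings VALUES (1, 'XY100'), (2, 'XY100'), (1, 'XY200');

    -- Populate manifest from the existing tables by joining them.
    INSERT INTO manifest (flight, passenger_name)
    SELECT b.flight, p.name
    FROM bookings AS b
    JOIN passengers AS p ON p.id = b.passenger_id;
    """
)

rows = conn.execute(
    "SELECT flight, passenger_name FROM manifest ORDER BY flight, passenger_name"
).fetchall()
for flight, passenger_name in rows:
    print(flight, passenger_name)
```

The join pairs each booking with the passenger whose identifier it carries, so the new table holds one row per booking.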
A general reference on relational databases and SQL is the document “Oracle 9i: SQL Reference,” available on-line at http://www.oracle.com. XML, XML schema, XPath and XSLT are standards of the World-Wide Web Consortium, and are available on-line at http://www.w3.org.
Often multiple schema exist for the same source of data, and as such the data cannot readily be imported or exported from one application to another. For example, two airline companies may each run applications that process relational databases, but if the relational databases used by the two companies conform to two different schema, then neither of the companies can readily use the databases of the other company. In order for the companies to share data, it is necessary to convert the databases from one schema to the other.
There is thus a need for a tool that can transform data conforming to a first schema into data that conforms to a second schema.