In computer science and information science, an ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. In brief, ontologies are controlled vocabularies for a specific knowledge domain, such as chemical reactions, gene functions, or animal species. The domains need not be confined to the real world. An ontology consists of concepts arranged in hierarchical classes linked by relationships such as “has members” and its inverse, “is a member of.” For example, the concept animal includes subclass concepts such as mammals, reptiles, and insects, each of which can have further subclass concepts, down to a leaf concept that has no further member classes. Any concept can be used to describe or generate one or more individuals, called instances, such as gene#1 and gene#2. Whereas ontologies define the structure of a domain, ontology annotations are statements made using the terms and relationships defined in the ontology. In addition to the hierarchical classification of domain knowledge, annotations define other relationships among these concepts. For example, an individual or class of insects can have a “symbiotic” relationship with an individual, a class, or several classes of animals. A relationship is often expressed in the form subject concept | predicate | object concept, e.g., bacterium A | helps digestion of | organism B. (Note that the ontology itself describes the “has members” relationship in the form class A | has members | class B.) Each concept can have multiple properties, called attributes, specified in the annotations; for animal species these might include geographical range, migratory pattern, hibernation, and life expectancy. When an ontology is coupled with description logic rules governing its relationships, it is possible to reason about the concepts it defines.
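The class hierarchy and the subject | predicate | object form described above can be illustrated with a minimal Python sketch. It is not drawn from any particular ontology toolkit: the dictionary layout, the function names, and the "primate" subclass are assumptions chosen for illustration only.

```python
# Hypothetical toy ontology: each class maps to its direct subclasses,
# i.e. "class A | has members | class B" in the notation used above.
HAS_MEMBERS = {
    "animal": ["mammal", "reptile", "insect"],
    "mammal": ["primate"],  # "primate" is an illustrative addition
    "reptile": [],
    "insect": [],
    "primate": [],
}

# Annotations stored as (subject, predicate, object) triples.
ANNOTATIONS = [
    ("bacterium A", "helps digestion of", "organism B"),
]


def descendants(cls, has_members=HAS_MEMBERS):
    """Return every class reachable from cls by following 'has members'."""
    found = []
    stack = list(has_members.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.append(current)
            stack.extend(has_members.get(current, []))
    return found


def is_member_of(child, ancestor, has_members=HAS_MEMBERS):
    """Inverse relationship: is child a direct or indirect member of ancestor?"""
    return child in descendants(ancestor, has_members)


def leaf_concepts(has_members=HAS_MEMBERS):
    """Leaf concepts are classes with no further member classes."""
    return sorted(c for c, subs in has_members.items() if not subs)


def objects_of(subject, predicate, triples=ANNOTATIONS):
    """Look up the objects linked to a subject by a given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

Here, following "has members" transitively is a very simple form of the reasoning mentioned above; a description logic reasoner would support far richer inference than this sketch does.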
Common components of ontologies include Individuals (which are instances or objects, that is, the basic or “ground level” objects) and Classes (which are sets, collections, concepts, classes in programming, types of objects, or kinds of things). The ontology expresses hierarchical relationships among classes and subclasses across two or more levels. Hereinafter, the term “concepts” is used interchangeably with the term “classes.” The ontology may also include, for each concept, zero or more Attributes (which are aspects, properties, features, characteristics, or parameters that individuals and classes can have). Annotations using one or more ontologies can be characterized as additional Attributes. Other common components include: Relationships (which are ways in which classes and individuals can be related to one another); Function terms (which are complex structures formed from certain relations that can be used in place of an individual term in a statement); Restrictions (which are formally stated descriptions of what must be true in order for some assertion to be accepted as input); Rules (which are statements in the form of an if-then or antecedent-consequent sentence that describe the logical inferences that can be drawn from an assertion in a particular form); Axioms (which are assertions, including rules, in a logical form that together comprise the overall theory that the ontology describes in its domain of application; this sense is distinct from that of an axiom in formal logic); and Events (which are changes of attributes or relations). The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. It is built upon the Resource Description Framework (RDF), a W3C standard for describing resources that is commonly serialized as XML.
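As a rough illustration of how Individuals, Attributes, Relationships, and Rules fit together, the following Python sketch applies a single if-then rule to a set of relationship triples. The `Individual` class, the `symmetric_rule` and `apply_rules` helpers, and the sample names (`ant#1`, `aphid#1`) are all hypothetical; the sketch does not use OWL or RDF tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """An instance of a concept, carrying zero or more attributes."""
    name: str
    concept: str
    attributes: dict = field(default_factory=dict)


def symmetric_rule(predicate):
    """Build an if-then rule: if (s, predicate, o) holds, then (o, predicate, s) holds."""
    def rule(triples):
        return {(o, p, s) for (s, p, o) in triples if p == predicate}
    return rule


def apply_rules(triples, rules):
    """Apply every rule repeatedly until no new triple can be inferred."""
    known = set(triples)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known) - known
        if not new:
            return known
        known |= new


# Illustrative data only; the names and values are made up.
ant = Individual("ant#1", "insect", {"hibernation": False})
facts = {("ant#1", "symbiotic with", "aphid#1")}
inferred = apply_rules(facts, [symmetric_rule("symbiotic with")])
```

Treating a rule as a function from known triples to implied triples keeps the sketch small. It covers only the antecedent-consequent form described above, not Restrictions, Function terms, or Events.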
Currently, many shared or collective knowledge bases are stored as ontologies, and various tools exist to maintain and use them. For example, BioPAX (Biological Pathway Exchange) is an RDF/OWL-based standard language for representing biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data. Pathway data captures our understanding of biological processes, but its rapid growth, driven by the ongoing discovery of new pathways and new pathway details, necessitates the development of databases and computational tools to aid interpretation. Before BioPAX, pathway information was fragmented across many databases with incompatible formats, which presented barriers to its effective use. BioPAX addresses this problem by making pathway data substantially easier to collect, index, interpret, and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions, and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions, organized into thousands of pathways across many organisms and drawn from a growing number of sources, are available. Thus, large amounts of pathway data are available in a computable form to support visualization, analysis, and biological discovery.