1. Field of the Invention
Embodiments of the present invention may relate to the organization of data for various applications. More particularly, embodiments of the invention may relate to the organization of objects or concepts that may be described by multi-dimensional data using ontological techniques.
2. Description of Related Art
Ontologies may be considered as being related to semantic networks in the field of artificial intelligence. Semantic networks and ontologies may be built based on concepts. A concept is a basic, unambiguous unit of knowledge.
In such structures, concepts may be connected by “links.” The most fundamental of these links may describe a generalization/specialization relationship between two concepts, and this relationship satisfies transitivity (“transitivity” refers to the well-known mathematical concept in which, for a binary relation R and elements a, b, and c, if aRb and bRc, then aRc). This link has been variously called IS-A, sub-concept, subclass, a-kind-of, etc. It may be used to indicate property inheritance, as in the following example.
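The transitivity of the IS-A link may be illustrated by a minimal Python sketch. The concept names and the helpers `ancestors` and `is_kind_of` are hypothetical and chosen only for illustration; they are not taken from any particular ontology system.

```python
# Hypothetical IS-A links: each concept maps to its direct generalizations.
IS_A = {
    "sports_car": ["car"],
    "car": ["vehicle"],
    "vehicle": ["thing"],
    "thing": [],
}


def ancestors(concept, links=IS_A):
    """Return every concept reachable from `concept` via one or more IS-A links."""
    found = set()
    stack = list(links.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(links.get(parent, []))
    return found


def is_kind_of(specific, general, links=IS_A):
    """True if `specific` IS-A `general`, directly or by transitivity."""
    return general in ancestors(specific, links)


# sports_car IS-A car and car IS-A vehicle, so sports_car IS-A vehicle.
print(is_kind_of("sports_car", "vehicle"))
```

In this sketch only the direct links are stored, and the transitive consequences are derived by traversal when needed.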
Humans have additional “local” information about concepts. For example, solid objects have color, size, etc. We call this kind of local information “attributes”, “properties” or “slots”. If a general concept has an attribute (vehicles have a weight), then a specific sub-concept will have the same property (cars have a weight). One can conceptualize inheritance as the propagation of a property from the general concept to the more specific concept against the direction of the IS-A link.
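The propagation of attributes from a general concept to a more specific one may be sketched as follows. For simplicity, this sketch assumes that each concept has at most one direct generalization; the attribute `number_of_doors` and the function `all_attributes` are hypothetical names introduced only for illustration.

```python
# Hypothetical single-parent IS-A links (None marks the most general concept).
PARENT = {"car": "vehicle", "vehicle": "thing", "thing": None}

# Attributes stated locally at each concept.
LOCAL_ATTRIBUTES = {
    "thing": set(),
    "vehicle": {"weight"},
    "car": {"number_of_doors"},
}


def all_attributes(concept):
    """Collect local attributes plus those inherited from more general concepts."""
    attributes = set()
    while concept is not None:
        attributes |= LOCAL_ATTRIBUTES.get(concept, set())
        concept = PARENT.get(concept)  # follow the IS-A link upwards
    return attributes


# "weight" is declared only for vehicles, yet cars have it too.
print(sorted(all_attributes("car")))
```

The lookup walks up the IS-A links, which has the same effect as propagating each attribute down from the general concept to its sub-concepts.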
Besides the IS-A links, ontologies may contain other links, e.g., likes, owns, connected-to, etc. These additional links may have no “built-in behavior”. These links are variously called associative relationships, roles, semantic relationships, etc., and may be labeled by their names. Such relationships are inherited downwards, against the direction of the IS-A links, from more general concepts to more specific concepts.
Because a concept cannot be more general than itself, and because of the transitivity of the IS-A links, there cannot be any cycles of IS-A links in a semantic network. Furthermore, it is practical to have one concept (often called THING) that is a generalization of every concept in an ontology. Thus, the concepts and IS-A links in an ontology form a hierarchy with a root. In other words, the hierarchy of an ontology may be thought of as a rooted directed acyclic graph (DAG), where the nodes represent the concepts and the links represent IS-A relationships.
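The two structural conditions described above (no cycles of IS-A links, and a single root that generalizes every concept) may be checked with a short Python sketch. The function `is_rooted_dag` is a hypothetical helper, and the sketch assumes that every concept, including the root, appears as a key of the input dictionary.

```python
def is_rooted_dag(is_a, root="THING"):
    """Check that IS-A links are acyclic and that every concept reaches `root`.

    `is_a` maps each concept to a list of its direct generalizations.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {concept: WHITE for concept in is_a}

    def has_cycle(node):
        color[node] = GRAY
        for parent in is_a[node]:
            if color[parent] == GRAY:  # link back onto the current path
                return True
            if color[parent] == WHITE and has_cycle(parent):
                return True
        color[node] = BLACK
        return False

    if any(color[c] == WHITE and has_cycle(c) for c in is_a):
        return False

    def reaches_root(node):
        # Terminates because the links were just shown to be acyclic.
        return node == root or any(reaches_root(p) for p in is_a[node])

    return root in is_a and all(reaches_root(c) for c in is_a)


good = {"THING": [], "vehicle": ["THING"], "car": ["vehicle"]}
cyclic = {"THING": [], "a": ["b"], "b": ["a"]}
orphan = {"THING": [], "vehicle": ["THING"], "widget": []}

print(is_rooted_dag(good), is_rooted_dag(cyclic), is_rooted_dag(orphan))
```

The recursive traversal is adequate for small illustrative graphs; a large ontology would likely call for an iterative traversal instead.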
The above gives rise to a representation of ontologies in the form of graphs. FIG. 1 shows an example of a graphical representation of an ontology. In this and later figures, every box stands for a concept. Bold arrows (typically pointing upwards) stand for IS-A relationships. Thin arrows will be used to stand for other relationships. The IS-A relationships in this example form a tree. Family terms, such as child, ancestor and descendant, may be used in describing ontologies. A number of other extensions exist for ontologies, such as, but not limited to, rules or axioms.
Thus, one may consider an ontology as follows. An ontology may be considered as a directed graph of nodes, which may be used to represent concepts, and edges, which may be used to represent IS-A and/or semantic relationships between pairs of nodes. Concepts may be labeled by unique terms. Concepts may have additional (name, value) pairs, called attributes, where each attribute name may be unique within its concept. The set of all concepts together with the set of all IS-A links forms a rooted, connected, directed acyclic subgraph of the ontology. This subgraph may be referred to as the taxonomy of the ontology. Both attributes and semantic relationships may be inherited downwards, against the direction of the IS-A links, from more general concepts to more specific concepts.
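The description above may be summarized as a small data structure. The following Python sketch is one possible representation under stated assumptions, not a definitive implementation: the class and method names, the edge label string "IS-A", the example concepts and attribute values, and the rule that a more specific concept's attribute value overrides an inherited one are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    term: str                                        # unique label
    attributes: dict = field(default_factory=dict)   # local (name, value) pairs


@dataclass
class Ontology:
    concepts: dict = field(default_factory=dict)     # term -> Concept
    edges: list = field(default_factory=list)        # (source, label, target)

    def add_concept(self, term, **attributes):
        if term in self.concepts:
            raise ValueError(f"duplicate term: {term}")
        self.concepts[term] = Concept(term, dict(attributes))

    def add_edge(self, source, label, target):
        self.edges.append((source, label, target))

    def taxonomy(self):
        """Return only the IS-A links, as (specific, general) pairs."""
        return [(s, t) for s, label, t in self.edges if label == "IS-A"]

    def generalizations(self, term):
        """Return `term` followed by all of its more general concepts."""
        order, queue = [], [term]
        while queue:
            current = queue.pop(0)
            if current not in order:
                order.append(current)
                queue.extend(t for s, t in self.taxonomy() if s == current)
        return order

    def inherited_attributes(self, term):
        """Merge attributes from general to specific (specific values win)."""
        merged = {}
        for concept in reversed(self.generalizations(term)):
            merged.update(self.concepts[concept].attributes)
        return merged

    def inherited_relationships(self, term):
        """Semantic relationships stated at `term` or any generalization."""
        sources = set(self.generalizations(term))
        return {(label, t) for s, label, t in self.edges
                if label != "IS-A" and s in sources}


# Hypothetical example ontology.
onto = Ontology()
onto.add_concept("THING")
onto.add_concept("vehicle", powered=True)
onto.add_concept("car", wheels=4)
onto.add_concept("person")
onto.add_concept("student")
onto.add_edge("vehicle", "IS-A", "THING")
onto.add_edge("car", "IS-A", "vehicle")
onto.add_edge("person", "IS-A", "THING")
onto.add_edge("student", "IS-A", "person")
onto.add_edge("person", "owns", "vehicle")

print(onto.inherited_attributes("car"))
print(onto.inherited_relationships("student"))
```

Here the taxonomy is obtained by filtering the edge list for IS-A links, while the semantic relationship "owns" stated at the concept "person" is also reported for its sub-concept "student".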
Problems of how to organize data in a succinct, useful manner exist in many fields. One example of this is in marketing. Suppose that there is a large database of customers, obtained, for example, by extracting information from the home pages of individual Web users. Such a database may contain demographic information and the interests of each customer, and may be created, for example, by mining interest data associated with each customer. The demographic and interest information may be processed with a data mining algorithm to derive association rules between classifications of customers and interests. However, the resulting data may be in a format that does not provide useful information for a marketing professional.
Similar problematic situations may arise in other fields, for example, but not limited to, bioinformatics, computer-aided diagnosis, environmental studies, analysis of census data, etc.