Broadly, knowledge representation is the activity of making abstract knowledge explicit, as concrete data structures, to support machine-based storage, management (e.g., information location and extraction), and reasoning systems. Conventional methods and systems exist for utilizing knowledge representations (KRs) constructed in accordance with various types of knowledge representation models, including structured controlled vocabularies such as taxonomies, thesauri and faceted classifications; formal specifications such as semantic networks and ontologies; and unstructured forms such as documents written in natural language.
A taxonomy is a KR structure that organizes categories into a hierarchical tree and associates categories with relevant objects such as physical items, documents or other digital content. Categories or concepts in taxonomies are typically organized in terms of inheritance relationships, also known as supertype-subtype relationships, generalization-specialization relationships, or parent-child relationships. In such relationships, the child category or concept has the same properties, behaviors and constraints as its parent plus one or more additional properties, behaviors or constraints. For example, the statement of knowledge, “a dog is a mammal,” can be encoded in a taxonomy by concepts/categories labeled “mammal” and “dog” linked by a parent-child hierarchical relationship. Such a representation encodes the knowledge that a dog (child concept) is a type of mammal (parent concept), but not every mammal is necessarily a dog.
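As an illustrative sketch only, the parent-child relationship described above could be encoded as a mapping from each category to its parent. The names `PARENT`, `ancestors` and `is_a`, and the extra categories "animal" and "cat", are assumptions introduced for this example and are not part of any particular conventional system.

```python
# Hypothetical taxonomy: each category maps to its parent (None marks the root).
# "animal" and "cat" are illustrative additions to the dog/mammal example.
PARENT = {
    "animal": None,
    "mammal": "animal",
    "dog": "mammal",
    "cat": "mammal",
}


def ancestors(category):
    """Return the chain of parent categories, nearest parent first."""
    chain = []
    current = PARENT[category]
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def is_a(child, parent):
    """True if `child` is `parent` itself or one of its descendants."""
    return child == parent or parent in ancestors(child)
```

Under these assumptions, `is_a("dog", "mammal")` holds while `is_a("mammal", "dog")` does not, which mirrors the one-directional nature of the inheritance relationship.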
A thesaurus is a KR representing terms such as search keys used for information retrieval, often encoded as single-word noun concepts. Links between terms/concepts in thesauri are typically divided into the following three types of relationships: hierarchical relationships, equivalency relationships and associative relationships. Hierarchical relationships are used to link terms that are narrower and broader in scope than each other, similar to the relationships between concepts in a taxonomy. To continue the previous example, “dog” and “mammal” are terms linked by a hierarchical relationship. Equivalency relationships link terms that can be substituted for each other as search terms, such as synonyms or near-synonyms. For example, the terms “dog” and “canine” could be linked through an equivalency relationship in some contexts. Associative relationships link related terms whose relationship is neither hierarchical nor equivalent. For example, a user searching for the term “dog” may also want to see items returned from a search for “breeder”, and an associative relationship could be encoded in the thesaurus data structure for that pair of terms.
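The three relationship types described above might be sketched as follows. This is a minimal illustration rather than a definitive thesaurus format; the class names `Term` and `Thesaurus` and the `expand` helper are hypothetical, and `expand` shows just one way a search system could use equivalency and associative links to broaden a query.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    broader: set = field(default_factory=set)     # hierarchical: wider scope
    narrower: set = field(default_factory=set)    # hierarchical: narrower scope
    equivalent: set = field(default_factory=set)  # synonyms or near-synonyms
    related: set = field(default_factory=set)     # associative links


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def _term(self, label):
        # Create the term on first use so links can be added in any order.
        return self.terms.setdefault(label, Term(label))

    def add_hierarchical(self, narrower, broader):
        self._term(narrower).broader.add(broader)
        self._term(broader).narrower.add(narrower)

    def add_equivalent(self, a, b):
        self._term(a).equivalent.add(b)
        self._term(b).equivalent.add(a)

    def add_associative(self, a, b):
        self._term(a).related.add(b)
        self._term(b).related.add(a)

    def expand(self, label):
        """Return the term plus its equivalent and associated terms, sorted."""
        term = self._term(label)
        return sorted({label} | term.equivalent | term.related)


thesaurus = Thesaurus()
thesaurus.add_hierarchical("dog", "mammal")
thesaurus.add_equivalent("dog", "canine")
thesaurus.add_associative("dog", "breeder")
```

With these three links in place, expanding the search key "dog" would also yield "canine" and "breeder", but not the broader term "mammal".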
Faceted classification is based on the principle that information has a multi-dimensional quality, and can be classified in many different ways. Subjects of an informational domain are subdivided into facets (or more simply, categories) to represent this dimensionality. The attributes of the domain are related in facet hierarchies. The objects within the domain are then described and classified based on these attributes. For example, a collection of clothing being offered for sale in a physical or web-based clothing store could be classified using a color facet, a material facet, a style facet, etc., with each facet having a number of hierarchical attributes representing different types of colors, materials, styles, etc. Faceted classification is often used in faceted search systems, for example to allow a user to search the collection of clothing by any desired ordering of facets, such as by color-then-style, by style-then-color, by material-then-color-then-style, or by any other desired prioritization of facets. Such faceted classification contrasts with classification through a taxonomy, in which the hierarchy of categories is fixed.
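The clothing example above can be sketched in a few lines. The item records and the `drill_down` function are hypothetical, and for brevity each facet holds flat values rather than the hierarchical attributes a fuller faceted classification would use.

```python
# Hypothetical clothing catalog; every item carries a value for each facet.
ITEMS = [
    {"name": "tee-1", "color": "blue", "material": "cotton", "style": "casual"},
    {"name": "shirt-2", "color": "blue", "material": "silk", "style": "formal"},
    {"name": "tee-3", "color": "red", "material": "cotton", "style": "casual"},
]


def drill_down(items, facet_order):
    """Nest items by the given facets, in the order the user chose."""
    if not facet_order:
        return [item["name"] for item in items]
    facet, rest = facet_order[0], facet_order[1:]
    groups = {}
    for item in items:
        groups.setdefault(item[facet], []).append(item)
    return {value: drill_down(group, rest) for value, group in groups.items()}


by_color_then_style = drill_down(ITEMS, ["color", "style"])
by_style_then_color = drill_down(ITEMS, ["style", "color"])
```

The same item records produce either nesting, which illustrates the contrast with a taxonomy: the ordering of facets is chosen at query time rather than fixed in the data structure.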
A semantic network is a KR that represents various types of semantic relationships between concepts using a network structure (or a data structure that encodes or instantiates a network structure). A semantic network is typically represented as a directed or undirected graph consisting of vertices representing concepts, and edges representing relationships linking pairs of concepts. An example of a semantic network is WordNet, a lexical database of the English language. Some common types of semantic relationships defined in WordNet are meronymy (A is part of B), hyponymy (A is a kind of B), synonymy (A denotes the same as B) and antonymy (A denotes the opposite of B). References to a semantic network or other KR as being represented by a graph should be understood as indicating that the semantic network or other KR may be encoded into a data structure in a computer-readable memory or file or similar organization, wherein the structure of the data storage or the tagging of data therein serves to identify for each datum its significance to other data (e.g., whether it is intended as the value of a node, an end point of an edge, or the weighting of an edge).
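One possible encoding of such a graph, offered only as a sketch, stores each edge as a (relation, target, weight) entry keyed by its source node. The `SemanticNetwork` class, the relation labels and the example concepts "paw", "hot" and "cold" are assumptions made for illustration; this is not the WordNet data format or API.

```python
from collections import defaultdict


class SemanticNetwork:
    def __init__(self):
        self.nodes = set()
        # source concept -> list of (relation, target concept, weight)
        self.edges = defaultdict(list)

    def add_edge(self, source, relation, target, weight=1.0, symmetric=False):
        """Add a labeled, weighted edge; symmetric relations are stored both ways."""
        self.nodes.update((source, target))
        self.edges[source].append((relation, target, weight))
        if symmetric:
            self.edges[target].append((relation, source, weight))

    def neighbors(self, source, relation):
        """Return concepts reachable from `source` via edges labeled `relation`."""
        return [t for r, t, _ in self.edges.get(source, []) if r == relation]


net = SemanticNetwork()
net.add_edge("dog", "kind_of", "mammal")                   # hyponymy
net.add_edge("paw", "part_of", "dog")                      # meronymy
net.add_edge("dog", "same_as", "canine", symmetric=True)   # synonymy
net.add_edge("hot", "opposite_of", "cold", symmetric=True) # antonymy
```

In this sketch the tuple position identifies each datum's role (relation label, edge end point, or edge weight), and directed relations such as hyponymy are stored in one direction only while symmetric ones are stored in both.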
An ontology is a KR structure encoding concepts and relationships between those concepts that is restricted to a particular domain of the real or virtual world that it is used to model. The concepts included in an ontology typically represent the particular meanings of terms as they apply to the domain being modeled or classified, and the included concept relationships typically represent the ways in which those concepts are related within the domain. For example, concepts corresponding to the word “card” could have different meanings in an ontology about the domain of poker and an ontology about the domain of computer hardware.
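The "card" example might be sketched by scoping concepts and relationships to a domain. The `ONTOLOGIES` structure, the concept definitions and the relation labels below are all hypothetical and chosen only to show that the same term resolves to different concepts in different domains.

```python
# Hypothetical domain-scoped ontologies; definitions and relations are illustrative.
ONTOLOGIES = {
    "poker": {
        "concepts": {
            "card": "a playing card dealt to a player",
            "deck": "the full set of playing cards",
        },
        "relations": [("card", "part_of", "deck")],
    },
    "computer hardware": {
        "concepts": {
            "card": "an expansion board that adds functionality to a computer",
            "motherboard": "the main circuit board of a computer",
        },
        "relations": [("card", "plugs_into", "motherboard")],
    },
}


def meaning(domain, term):
    """Look up what `term` means within a single domain."""
    return ONTOLOGIES[domain]["concepts"][term]


def relations_of(domain, term):
    """Return the relationships in `domain` that involve `term`."""
    return [r for r in ONTOLOGIES[domain]["relations"] if term in (r[0], r[2])]
```

Here `meaning("poker", "card")` and `meaning("computer hardware", "card")` return different definitions, and the relationships attached to "card" likewise differ by domain.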
In general, all of the above-discussed types of KRs, as well as other conventional examples, are tools for modeling human knowledge in terms of abstract concepts and the relationships between those concepts, and for making that knowledge accessible to machines such as computers for performing various knowledge-requiring tasks. As such, human users and software developers conventionally construct KR data structures using their human knowledge, and manually encode the completed KRs into machine-readable form to be stored in machine memory and accessed by various machine-executed functions.