The present invention provides computational methods and systems for providing terminological precision, with regard to existing knowledge management capabilities, to model the essential knowledge carried by human actors in a field of endeavor, line of business, activity or domain. The highly accurate methods and systems guarantee the coherence and completeness of a Semantic System of Intelligent Glossaries (SSIG) and enable the development of formal automatic interpretation of glossaries by a machine, so that these glossaries can be validated by answering formal questions. The systems and methods as disclosed herein focus on the essential knowledge carried by human actors in a domain, whose knowledge is ultimately only accessible through the specialized language used by these humans in their day-to-day work activities. By way of example, legal documents may contain numerous idioms that the average person may not understand. The systems and methods described herein address this problem by providing the user with unambiguous definitions for the key terms, identifiers and symbols that are known within the legal field, expressed in the traditional jargon or language of that field.
Several approaches to the concept of definition exist, based on mathematical, logical or data processing practices, such as:
    Providing a shortened but equivalent linguistic construction by mechanisms of abbreviation and acronym;
    Using a meaningful mathematical or logical symbol equivalent to a term, which guarantees 100% identification of that term in any text in a natural language;
    Enumerating the properties of an object or a concept to be defined;
    Separating two complementary aspects of an object or a concept to be defined:
        1. “Black Box” perspective (inputs, outputs); and
        2. “White Box” perspective (with exhaustive enumeration of the content of the Box);
    Studying linguistic relations like synonymy, antonymy, hyponymy and hyperonymy to build semantic nets of word meanings.
Existing technologies, including Description logics, Ontology, and Fuzzy logic, are used to automate terminology. Currently, these technologies are not generally accessible and remain the province of computational linguistics specialists and researchers.
Description logics and ontology are the current state of the art. The purpose of ontology is to create a representation of the world, as well as the rules for combining representation elements to define ontological extensions. Such methods use first-order logic and set theory for knowledge modeling. For all practical purposes, Description logics defines concepts and the relations between these concepts. In this approach, what is described is necessarily a set; set inclusion is the basic mechanism for inferring knowledge and ontological extensions; concepts are modeled as subsets of elements of the universe of discourse. An ontology classifies concepts in accordance with the inclusion relation, which is well adapted to the definition of the vocabulary of a hierarchical environment (nomenclature). An ontology is a centralized database for sharing knowledge, but there is no formal language for solving interpretation issues between two different ontology knowledge bases. As a result, Description logics is limited given the complexity of natural languages, which can refer to a variety of concepts and documents that are produced in a dynamic and decentralized way.
Fuzzy logic translates natural language sentences into a generalized language of numerical constraints. For example, in the sentence “almost all Swedes are tall”, the word “almost” means 80%, while the remainder of the sentence, “all Swedes are tall”, is a statement that can be formalized as a constraint. This theory is ambitious; it tackles real problems and covers a vast array of concepts, but it is still under development.
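By way of illustration only, the quantifier reading in the example above can be sketched as a numerical constraint on a proportion. The sketch below is a deliberate simplification: it uses a crisp cutoff for “tall” rather than a graded membership function, and the 180 cm cutoff, the sample heights and the function names are assumptions introduced here, not part of fuzzy logic theory or of the present invention. Only the 80% threshold comes from the example.

```python
from typing import Callable, Iterable

# Threshold for "almost all", taken from the 80% figure in the example.
ALMOST_ALL = 0.80


def proportion(items: Iterable[float], predicate: Callable[[float], bool]) -> float:
    """Return the fraction of items that satisfy the predicate."""
    values = list(items)
    if not values:
        raise ValueError("cannot compute a proportion over an empty collection")
    return sum(1 for v in values if predicate(v)) / len(values)


def almost_all(items: Iterable[float],
               predicate: Callable[[float], bool],
               threshold: float = ALMOST_ALL) -> bool:
    """Check the constraint 'at least `threshold` of the items satisfy the predicate'."""
    return proportion(items, predicate) >= threshold


def is_tall(height_cm: float) -> bool:
    """Hypothetical crisp cutoff; a fuzzy treatment would return a degree in [0, 1]."""
    return height_cm >= 180.0


# Hypothetical sample of heights in centimeters.
heights = [185, 190, 178, 182, 181, 192, 183, 175, 186, 188]

if __name__ == "__main__":
    print(proportion(heights, is_tall))
    print(almost_all(heights, is_tall))
```

In this sample, eight of the ten heights meet the cutoff, so the constraint is satisfied exactly at the threshold.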
Practices currently used to address terminology include lexicons, glossaries and dictionaries. A lexicon is a list of words, symbols or identifiers dedicated to a field of endeavor. A word, symbol or identifier listed in a lexicon is called a lexical element. A glossary is a document encompassing the definitions of the lexical elements included in a lexicon. Therefore, a glossary is not a dictionary, because it does not include the definitions of all the possible meanings of the same word; on the contrary, a glossary shows only the meaning agreed upon for a specific field.
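The distinction drawn above between a lexicon, a glossary and a dictionary can be pictured with a small data structure. This is a minimal sketch under stated assumptions: the class name `Glossary`, its attributes and the sample legal term are hypothetical and chosen only for illustration. The one property it is meant to show is that a glossary holds exactly one agreed meaning per lexical element, and that the lexicon is simply the list of those elements.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Glossary:
    """One agreed definition per lexical element, for a single field of endeavor."""
    field_of_endeavor: str
    definitions: Dict[str, str] = field(default_factory=dict)

    def define(self, lexical_element: str, definition: str) -> None:
        # Unlike a dictionary, a glossary records a single meaning per element,
        # so a second definition for the same element is rejected.
        if lexical_element in self.definitions:
            raise ValueError(f"{lexical_element!r} already has an agreed definition")
        self.definitions[lexical_element] = definition

    @property
    def lexicon(self) -> List[str]:
        """The lexicon is the list of lexical elements defined in the glossary."""
        return sorted(self.definitions)


if __name__ == "__main__":
    legal = Glossary("law")
    legal.define("tort", "a civil wrong that causes loss or harm to another party")
    print(legal.lexicon)
```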
Throughout the practice of generalizing and formalizing glossaries, the Essential Knowledge dilemma arises between the size of the lexicon, on the one hand, and the precision of words in natural language, on the other hand. In essence:
    If the concept of words with multiple meanings found in natural language is fully addressed, then much knowledge on a vast number of topics can be expressed. However, this requires a massive amount of documentation that remains vague and is therefore not usable by a machine;
    If word meaning is restricted and specified by using a formalized language, then very precise, focused knowledge can be expressed. However, the result is purely symbolic: it is machine readable, but it is understood only by experts in both the field and the formal language used; moreover, it no longer provides a useful global vision of the field.
Existing knowledge management methods and technologies do not address the Essential Knowledge dilemma and many questions arise: where to stop, given the combinatorial explosion of any terminology (to define a word, it is necessary to use other words)? What is really represented with each word? How is ambiguity eliminated in the meanings? The present invention was developed to help users solve these problems and the Essential Knowledge dilemma.
The present invention applies Laws of Form (LoF), a mathematical theory created by George Spencer Brown, to lexical semantics; LoF is both a mental calculus (the Calculation of Distinctions) and a formal planar system (the Calculus of Indications). The LoF Calculation of Distinctions constrains the knowledge manager to conduct a comprehensive up-front Distinction-Based Reasoning (DBR) before writing a definition in a glossary; the LoF Calculus of Indications is used for computing formal meaningful values, i.e. the value of meaning of formal sentences that embed words and other lexical elements and that result from the DBR analysis.
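For readers unfamiliar with the Calculus of Indications, the sketch below evaluates expressions of its primary arithmetic, which rests on two laws: the law of calling (two adjacent marks condense into one) and the law of crossing (a mark within a mark cancels to the unmarked state). The parenthesis notation, in which `()` stands for the mark and the empty string for the unmarked state, and the function name `evaluate` are assumptions made for this illustration; the sketch does not cover lexical elements and is not the implementation of the present invention.

```python
def evaluate(expression: str) -> bool:
    """Reduce a primary-arithmetic expression to its value.

    Returns True for the marked state and False for the unmarked state.
    "()" denotes a mark (cross); juxtaposition places marks in the same space.
    """
    # Each stack entry records whether the space currently being read
    # already contains a cross whose value is marked.
    stack = [False]
    for ch in expression:
        if ch == "(":
            stack.append(False)          # enter the space inside a new cross
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced expression: unexpected ')'")
            inner_marked = stack.pop()
            # A cross is marked exactly when its content is unmarked (law of crossing).
            cross_value = not inner_marked
            # A space is marked as soon as one of its crosses is marked (law of calling).
            stack[-1] = stack[-1] or cross_value
        elif ch.isspace():
            continue
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced expression: missing ')'")
    return stack[0]


if __name__ == "__main__":
    for expr in ["", "()", "()()", "(())", "(()())", "((()))"]:
        print(repr(expr), "marked" if evaluate(expr) else "unmarked")
```

A single left-to-right pass with a stack suffices because the value of each space depends only on the values of the crosses it directly contains.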
The present invention formalizes the glossary practice up to the point at which the capability of self-reference, i.e. the capability of formal self-definition, is reached; the present computer-implemented method can then be re-used to formalize:
    the syntax of the alphabet, formulas and instructions authorized;
    the application of a set of instructions to a formula;
    the interpretation of the application of a set of instructions; and
    the computation of the answer to a question, by interpretation of the application of a set of instructions to the question considered as a formula.
The present invention treats words as first class citizens, i.e. as numbers or other mathematical entities, which addresses the previously described issues of:
    1. the limit of ontology for describing non-numerical meaning, while being fully consistent with existing definitions of numbers, sets and first order logical languages;
    2. the basics of terminological precision, by producing intelligent glossaries, which are formal glossaries certified in accordance with the semantic interpretation of the process itself using the self-reference capability;
    3. the size of the lexicon and the precision of words, by eliminating all computable meanings (compound words, opposite words, union of words, word hyponymy, . . . ).
The present invention allows automatic generation of a Minimal Lexicon from an Intelligent Glossary; such a lexicon is the smallest set of words for delimiting the field of endeavor of that glossary, which solves the Essential Knowledge dilemma.
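The idea of eliminating computable meanings can be pictured with a simple filter: if each glossary entry records whether its meaning is computed from other entries (for example as a compound or an opposite), the entries that remain after removing the computed ones form a candidate minimal lexicon. The sketch below is only an illustration of that filtering step under stated assumptions: the representation of an entry as an optional `(kind, sources)` pair, the function name `minimal_lexicon` and the sample terms are hypothetical, derivations are assumed to be acyclic, and the actual generation procedure of the present invention is not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

# None marks a primitive term; otherwise (kind of derivation, source terms).
Derivation = Optional[Tuple[str, Tuple[str, ...]]]


def minimal_lexicon(glossary: Dict[str, Derivation]) -> List[str]:
    """Return the terms whose meaning is not computed from other terms."""
    for term, derivation in glossary.items():
        if derivation is None:
            continue
        _kind, sources = derivation
        # Every computed term must rest on terms that the glossary itself defines.
        missing = [s for s in sources if s not in glossary]
        if missing:
            raise ValueError(f"{term!r} is derived from undefined terms: {missing}")
    return sorted(term for term, derivation in glossary.items() if derivation is None)


# Hypothetical glossary fragment for a legal field.
glossary: Dict[str, Derivation] = {
    "contract": None,
    "party": None,
    "valid": None,
    "invalid": ("opposite", ("valid",)),
    "contract party": ("compound", ("contract", "party")),
}

if __name__ == "__main__":
    print(minimal_lexicon(glossary))
```

Here “invalid” and “contract party” are dropped because their meanings are computable from “valid”, “contract” and “party”, which remain.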