See attached FORM PTO-1449.
Not Applicable.
Models or representations of general-purpose intentions (motivations for activity) have rarely emerged from commercial computer systems; in prior art, the existence of intentions in commercial computer systems has generally been extremely limited. Typically, computer systems contain specific intentions which are limited in scope, such as the intention of responding to direct commands, and running scheduled background tasks, such as disk storage backup programs and polling of input buffers. Consequently, most of the processing power of computer systems is wasted because either there is no direct command to perform or there is no scheduled task to perform.
This idle processor power could be utilized if a general-purpose computational intention could be defined to be performed at all times by a new type of computer system. In such a system, direct commands and scheduled tasks could still be performed at an appropriate level of priority, but idle processor power could be used to perform tasks supporting the general-purpose intention. The present invention describes this new type of computer system using a model of intentionality which seeks to semantically analyze conversational input, and to optimize that analysis for maximum semantic efficiency. The efficiency and speed of that analysis is designed to mimic human capabilities to process information.
Human capabilities to process information include the rapid recognition of semantic meaning amid streams of noisy speech input, as well as rapid responses to speech input which permit humans to conduct conversations which touch many levels of abstraction in just a few phrases. For instance, the following conversation (from Pinker page 227) shows the significance of deeply abstract meanings in normal conversation:
Woman: I'm leaving you.
Man: Who is he?
Normal human conversation contains so many levels of abstraction that prior art in human-computer interfaces has devised extremely specific information filters, such as graphical user interfaces, to simplify these abstractions to the point where they can be reliably mapped by a non-semantic computer program. Even so, the difficulty of mapping between computer systems and the complex semantics of human conversation cannot be avoided since meanings contained in the semantics of human conversation are often fundamental to the driving motivations for creating computer systems; the specification process of computer systems design takes place primarily through person-to-person semantic discussions of systems requirements. In most systems, the translation from these semantic discussions to actual non-semantic computer code loses a great deal of meaning; to make up for this lost meaning a profusion of comments are imbedded deep in the computer code and volumes of documentation are written to help users of the computer system. If the semantic intentions of the designers could be directly represented in computer systems, semantic computer systems would subsume the traditional comments and documentation, providing a unified and more meaningful interface to users of the system.
Semantic architectures for computer systems have been researched for many years, but they have not been able to supplant traditional non-semantic computer architectures, except in areas which are specifically semantic in nature such as speech recognition and natural language text translation. Meanings in semantic networks are represented by a myriad of links, which are difficult to interpret without the new structuring principles introduced by the present invention. Traditional non-semantic computer architectures can be structured by efficient hierarchies such as data dictionaries or function invocation trees. The efficiency and succinctness of these non-semantic computer architectures are what make them more useful than prior semantic architectures. One of the purposes of the current invention is to identify and promote efficiency within semantic networks, by analyzing and converging topological characteristics of semantic networks toward a most efficient ideal form.
Ideal relationships between meaning and efficiency have long been observed. As far back as 1956, Miller published The magical number seven, plus or minus two: Some limits on our capacity for processing information. In that paper, he touched on themes which would resound in paper after paper in the following decades: that the amount of information which can be efficiently handled in short term memory is seven plus or minus two chunks. People understand information more easily when it is grouped into seven plus or minus two chunks. Information which is not in such a grouping-format can be re-grouped into groupings of seven plus or minus two chunks, through re-coding or other means such as re-classification. This re-grouping also helps people to understand information more easily. One of the principles of the present invention is to optimize semantic network inheritance links according to Miller's capacity number, so that preference is given to representations consisting of nodes with a nearly constant number of direct inheritors, such as seven, or as described later in this invention, five direct inheritors. A smaller number of direct inheritors, such as five, allows other attributes such as inherited nodes and the name of the node itself to fit more comfortably within its group.
While traversing a semantic network, there are two extreme topologies which can cause that traversal to be tedious. In one extreme, nodes have few branches, spreading the population of nodes across a narrow and deep hierarchy of nodes. In the other extreme, nodes have too many branches, and although the population of nodes is clustered near the top, it is difficult to choose which branch to follow. By applying topological transformations, so that all nodes in a representation have nearly five direct inheritor branches, a balance can be maintained which avoids the two extremes. After transformation, the typical inheritance level traversed abstractward covers five times as many inheritor nodes as the previous level; excessive hierarchic depth is prevented for any population of inheritors by the efficiency of this coverage. Since the typical transformed node has no more than five branches, traversal logic can deal with just five possible choices at a time, making that logic quick and simple. The result of the transformations is a semantic network which is efficient to traverse and thus efficient relative to actual use.
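The trade-off between the two extremes can be illustrated with a minimal Python sketch. It assumes an idealized tree in which every node has exactly the same number of direct inheritors, and it represents a network as a plain dictionary from node name to its list of direct inheritors. The function names `levels_needed` and `branching_deviation` are hypothetical and are not taken from the invention.

```python
def levels_needed(population: int, branching: int) -> int:
    """Levels below the root needed before one level holds `population` nodes,
    assuming every node has exactly `branching` direct inheritors."""
    if population < 1 or branching < 2:
        raise ValueError("population must be >= 1 and branching >= 2")
    levels, covered = 0, 1
    while covered < population:
        covered *= branching  # each level multiplies coverage by the branching factor
        levels += 1
    return levels


def branching_deviation(tree: dict, preferred: int = 5) -> float:
    """Mean absolute difference between each non-leaf node's count of direct
    inheritors and the preferred count (five is the figure used in the text)."""
    counts = [len(children) for children in tree.values() if children]
    if not counts:
        return 0.0
    return sum(abs(count - preferred) for count in counts) / len(counts)


# A narrow tree needs far more levels than a five-way tree for the same population.
narrow_depth = levels_needed(3125, 2)
balanced_depth = levels_needed(3125, 5)

# A tiny network whose only branching node has two inheritors instead of five.
small_tree = {"a": ["b", "c"], "b": [], "c": []}
small_tree_deviation = branching_deviation(small_tree)
```

Under these assumptions, a population of 3125 inheritors sits five levels deep with five branches per node but twelve levels deep with two branches per node, which is the kind of excessive depth the transformations are meant to avoid.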
When traversing a semantic network, another problem is nodes which have directly linked inheritors of differing levels of abstraction; it is difficult to compare such siblings. For example, a node ‘plant’ might have as direct inheritors the siblings ‘tree’ (very abstract) and ‘poison ivy in my backyard’ (very concrete). These siblings don't really belong at the same level in a semantic network; their differing abstractness makes them unsuitable for most semantic comparisons. The present invention tracks the differences in sibling abstractness, averaging those deviations across each subtree, so that alternative subtrees with fewer sibling deviations in abstractness can be selected. These subtrees are more efficient as well, since their leaf nodes are all at the same level, rather than spread at different levels of abstraction.
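One way to picture the averaging of sibling deviations is the following Python sketch. It rests on an assumption that is not stated in this section: the abstractness of a node is taken to be the number of inheritors beneath it, so a leaf has abstractness zero. The function names and the two candidate networks are hypothetical illustrations.

```python
from statistics import mean


def abstractness(tree: dict, node: str) -> int:
    """Assumed measure: total number of inheritors beneath `node`."""
    return sum(1 + abstractness(tree, child) for child in tree.get(node, []))


def sibling_deviation(tree: dict, node: str) -> float:
    """Mean absolute deviation of abstractness among the direct inheritors of `node`."""
    values = [abstractness(tree, child) for child in tree.get(node, [])]
    if not values:
        return 0.0
    centre = mean(values)
    return mean([abs(value - centre) for value in values])


def average_sibling_deviation(tree: dict, root: str) -> float:
    """Average the sibling deviation over every branching node under `root`."""
    deviations, stack = [], [root]
    while stack:
        node = stack.pop()
        children = tree.get(node, [])
        if children:
            deviations.append(sibling_deviation(tree, node))
            stack.extend(children)
    return mean(deviations) if deviations else 0.0


# Two alternative representations of similar knowledge about plants.
uneven = {
    "plant": ["tree", "poison ivy in my backyard"],
    "tree": ["oak", "pine", "maple"],
}
even = {
    "plant": ["tree", "vine"],
    "tree": ["oak", "pine", "maple"],
    "vine": ["grape", "kudzu", "poison ivy in my backyard"],
}
candidates = {"uneven": uneven, "even": even}

# Select the alternative whose siblings are closest in abstractness.
preferred = min(
    candidates,
    key=lambda name: average_sibling_deviation(candidates[name], "plant"),
)
```

In this sketch the representation that places the concrete node beside siblings of similar abstractness scores a lower average deviation and is therefore the one selected.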
The present invention uses both the preferred number of inheritor branches (such as five) and the least deviations in sibling abstractness, to guide choices between alternative semantic representations, thus favoring the representations with the most efficient overall topology. In this way, the advantage of efficiency which exists in well-designed non-semantic systems can be created in semantic systems as well. The analysis process of systems design can be described as a progression toward ever more efficient semantic representations, leading to a system with acceptable efficiency, called the completed design. By representing the on-going analysis process itself, semantic networks can deliver not only a working design based on a most efficient representation to-date, but also accommodate changes in design requirements, even at abstract levels, by creating or rearranging high level semantics. In contrast, non-semantic systems contain only completed design specifics; they rarely handle changes in requirements and when they do handle them, they are limited to handling them in concrete ways such as the use of plug-ins and dynamic-link libraries to augment existing computer applications. From the viewpoint of semantic systems, non-semantic systems are rigid summaries of semantic design which at one time were preferred, but have been semantically divorced from their designed intention; non-semantic systems do not have the design information needed for guiding adaptations to changing requirements.
Prior semantic architectures have been inefficient in another area relative to traditional architectures. This other area is indexing, an area where traditional relational database architecture is efficient. The indexing of a semantic network involves the linking of ideas at many different levels of abstraction, something which is done manually after much design consideration in relational databases. In relational databases, a theory of normalization was developed to guide the design of tables, so that each table contains information at a consistent level of abstraction. In semantic networks, information also needs to be processed at consistent levels of abstraction, particularly when inferring unspoken meanings in conversations such as the one quoted earlier (from Pinker) between the woman and man.
To rapidly access information on a particular level of abstraction, a variety of semantic network indexing systems have been developed, but prior art suffers from two types of problems: semantic indices tend to be either too specific or too general. The specific indices work in very narrow domains, so narrow that a vast number of indices have to be maintained to cover general purpose conversation. For instance, six specific indexes might track the degree to which specific people are in specific meetings and the degree to which each person works as a diplomat, politician, journalist, scientist and writer. These indices would be useful for answering questions about whether specific people have met people in specific occupations, and how many such meetings have occurred. But these indices would not describe whether they learned anything from those meetings, nor the specific topics of those meetings. To index such specifics, other indices would have to be maintained. To index all the meanings which occur in general conversation, a vast number of specific indices would have to be maintained. Since creating and maintaining indices can be prohibitively laborious, as well as taking up scarce storage space, researchers in text scanning applications have sought to reduce the number of manual indices needed, by creating a few general-purpose covering indices.
However, if indices are too general, they often cannot be applied to specific application areas. For instance, in the Universal Index Frame developed by Schank (described in Kolodner pages 221-245), intention is the subject of the index and it should cover all application areas which serve some intention. However, Schank's index describes intention as the composition of goals, plans, contexts and emotional responses relative to the fulfillment of goals and plans. Unfortunately, Schank's Universal Index Frame is impractical for indexing reasoning about more concrete factors, such as objects, devices, visual and sensory input. In order to do this, it would have to map every possible goal to every possible sensory input, as well as every possible reasoning about that input. Since his Universal Index Frame was designed with a small fixed set of high level abstractions, forming a very narrow hierarchy at the top, that mapping, were it ever to cover all sensory inputs, would be deeply hierarchic, resulting in a very inefficient semantic structure requiring excessively long traversals up and down its hierarchy.
In order for a universal semantic index to succeed, it has to accommodate any number of high level abstractions to make the semantic network semantically efficient in the sense earlier described. For efficiency, a general semantic index with unlimited numbers of high level abstractions has to be constructed independently of any specific set of abstractions which would cause excessively deep hierarchy. In order to achieve freedom from specific sets of abstractions, the present invention employs a purely topological algorithmic method for universal indexing. Thus any number of new efficient abstractions can be added and older, less efficient abstractions can be deleted, without changing the algorithm of the index. This indexing algorithm is designed for use in actual conversations containing a current ‘focus set’ of contexts which must be mapped to an incoming stream of symbols.
In such conversations, the purpose of a universal index is to identify the most concrete of the suitable meanings connected to any incoming symbol relative to a finite set of contexts already within the focus of the conversation. The present invention provides this capability, by indexing the inheritance distances and average abstractness of inheritance paths in the semantic network, so that on an elemental level, the least abstract path between two contexts (one in the focus set, one in the input stream) can be quickly located, within the set of paths bounded by common inherited abstractions. The technique is a generalization of a much more limited prior technique known as searching for the ‘most specific common abstraction’ (Kolodner, pages 346-347).
When using the prior technique of most specific common abstraction, the two symbols to be connected must contain a common abstraction for the technique to work. The most specific abstraction inherited by both symbols is the abstract peak node in the path defining the meaning connecting the symbols. This prior technique has two major limitations. The first is that often there is no common inherited abstraction; in that case the technique fails. The second limitation is that the most specific common abstraction is often not specific enough. For instance, if all symbols inherit from a symbol called ‘generic symbol’, a path of meaning connected through ‘generic symbol’ would be so diluted as to be practically meaningless. This dilution of meaning would occur for any path linking through excessively abstract symbols.
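Both limitations of the prior technique can be seen in a short Python sketch. It assumes that the network is stored as a dictionary from each symbol to the list of symbols it directly inherits from, and that "most specific" means the common abstraction with the smallest combined inheritance distance from the two symbols. The function names and the sample network are hypothetical.

```python
from collections import deque


def inherited_distances(parents: dict, start: str) -> dict:
    """Breadth-first distances from `start` up to every node it inherits from."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in distances:
                distances[parent] = distances[node] + 1
                queue.append(parent)
    return distances


def most_specific_common_abstraction(parents: dict, a: str, b: str):
    """Return the closest abstraction inherited by both symbols, or None."""
    from_a = inherited_distances(parents, a)
    from_b = inherited_distances(parents, b)
    common = set(from_a) & set(from_b)
    if not common:
        return None  # first limitation: no common inherited abstraction
    # Ties are broken alphabetically so the result is deterministic.
    return min(common, key=lambda node: (from_a[node] + from_b[node], node))


parents = {
    "oak": ["tree"],
    "ivy": ["vine"],
    "tree": ["plant"],
    "vine": ["plant"],
    "plant": ["generic symbol"],
    "idea": ["generic symbol"],
}

close_pair = most_specific_common_abstraction(parents, "oak", "ivy")
diluted_pair = most_specific_common_abstraction(parents, "oak", "idea")
unconnected_pair = most_specific_common_abstraction(parents, "oak", "sleep")
```

Here the pair oak and ivy meets at a usefully specific peak, the pair oak and idea meets only at the diluted ‘generic symbol’ (the second limitation), and the pair oak and sleep has no common abstraction at all (the first limitation).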
For example, Pinker (page 88) quotes from Chomsky an example of a sentence with clear grammatical meaning but elusive poetic meaning:
Colorless green ideas sleep furiously.
The grammatical manner of interpreting the sentence does not yield coherent meaning. However it is possible to understand the sentence by identifying the paradoxes it contains, and resolving those paradoxes. The following pairs of words represent paradoxes within the sentence: colorless versus green, ideas versus sleep, sleep versus furiously. By indexing for abstractions to cover these paradoxes, the semantic network could find abstract meanings which resolve the paradoxes in the above sentence, connecting colorless/green to bland/new, ideas/sleep to ideas/dormant, sleep/furiously to dormant/single-mindedly. This would permit a conversational system to answer with a poetic paraphrase: Bland new ideas are single-mindedly dormant. This poetic paraphrase thus distills the abstract meaning lurking behind the paradoxes, which serves an important purpose in conversation: the use of metaphor and analogy often necessarily involves a sense of paradox. A sense of paradox, in turn, requires an ability to seek connections between ideas which conflict in a specific sense but also converge in an abstract sense. For instance, a convergence was identified by hopping across sibling links such as colorless/bland and green/new. The current invention provides a method to hop across abstract sibling links between different abstraction hierarchies, such as new and bland, proposing to marry those hierarchies to cover specific poetic meanings.
In topological terms, the hopping across abstract sibling links is precisely described as a search for a most-specific available path between two symbols even when they do not share a mutual inherited node, by traversing from an abstract inherited node of one symbol to an abstract inherited node of the other symbol, via a mutual inheritor node. The ability to traverse symbols which do not yet have a mutual abstraction is a crucial ability in conversation, where possible new inheritance links must constantly be inferred. By searching for the most specific candidates related in a sibling sense to the current focus, the present invention makes that inference as cautious as possible. In the present invention, the coverage of this new general index algorithm is roughly as broad as the human ability to poetically relate any two symbols, thus providing an elegant elemental algorithm useful as a foundation for modeling common sense reasoning.
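The hop across sibling links can be sketched in Python as follows. The sketch assumes the same dictionary representation (symbol to directly inherited symbols), assumes that the most specific path is the one with the smallest total inheritance distance, and uses a hypothetical mutual inheritor named "untried recipe" that inherits from both ‘bland’ and ‘new’. None of these names or choices come from the invention itself.

```python
from collections import deque


def inherited_distances(parents: dict, start: str) -> dict:
    """Breadth-first distances from `start` up to every node it inherits from."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in distances:
                distances[parent] = distances[node] + 1
                queue.append(parent)
    return distances


def hop_via_mutual_inheritor(parents: dict, a: str, b: str):
    """Find the shortest path a -> x -> m -> y -> b, where x is inherited by a,
    y is inherited by b, and m is a mutual inheritor of both x and y.
    Returns (total distance, path) or None when no such path exists."""
    above_a = {n: d for n, d in inherited_distances(parents, a).items() if n != a}
    above_b = {n: d for n, d in inherited_distances(parents, b).items() if n != b}
    nodes = set(parents) | {p for inherited in parents.values() for p in inherited}
    best = None
    for m in sorted(nodes - {a, b}):
        above_m = inherited_distances(parents, m)
        for x in (n for n in above_m if n in above_a and n != m):
            for y in (n for n in above_m if n in above_b and n != m):
                cost = above_a[x] + above_m[x] + above_m[y] + above_b[y]
                candidate = (cost, [a, x, m, y, b])
                if best is None or candidate < best:
                    best = candidate
    return best


parents = {
    "colorless": ["bland"],
    "green": ["new"],
    "untried recipe": ["bland", "new"],  # hypothetical mutual inheritor
}
hop = hop_via_mutual_inheritor(parents, "colorless", "green")
```

Although ‘colorless’ and ‘green’ share no inherited node in this toy network, the search still returns a path linking them through ‘bland’ and ‘new’, which is the situation where the most specific common abstraction technique returns nothing.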
The present invention also addresses problems in linguistics, speech recognition, image recognition and program design. All of these areas are affected by the cognitive limitations suggested by Miller's concept of seven plus or minus two. By representing knowledge in these areas by using semantic networks optimized for efficiency, many problems in prior art can be solved.
For example, in linguistics, a grammar-syntax analysis of words often is insufficient for identifying the meaning intended for a phrase. In Pinker, page 119, the following newspaper headlines show that common sense is needed to choose the intended meaning from syntactic alternatives:
New Housing for Elderly Not Yet Dead
12 on Their Way to Cruise Among Dead in Plane Crash
N.J. Judge to Rule on Nude Beach
Chou Remains Cremated
Chinese Apeman Dated
Hershey Bars Protest
Reagan Wins on Budget, But More Lies Ahead.
In the present invention, syntactic ambiguities from the above headlines can be removed by proposing semantic representation alternatives, such as the two meanings for ‘Dated’ in ‘Chinese Apeman Dated’. ‘Dated’ could mean the social verb of going out on a date, or the archeological verb of classifying the age of an object. By choosing the meaning which results in a more efficient semantic analysis, ambiguities in the above headlines can be removed.
In speech recognition, the use of probability trees describing expected phonological sequences has overcome some but not all of the problems in recognizing words amid the noise in human speech. As Pinker (pages 187-188) describes:
. . . the probability value for a word is adjusted depending on which word precedes it; this is the only top-down information the program uses. All this knowledge allows the program to calculate which word is most likely to have come out of the mouth of the speaker given the input sound. Even then, DragonDictate relies more on expectancies than an able-eared human does. In the demonstration I saw, the program had to be coaxed into recognizing word and worm, even when they were pronounced clear as a bell, because it kept playing the odds and guessing higher-frequency were instead.
In the present invention, rather than mapping phonemes to probability trees to deal with noisy input, phonemes are mapped into competing semantic tree representations. The most efficient representation is chosen, which despite noise will generally also be the most meaningful and therefore desirable representation, because of redundant semantic information encoded in speech.
In image recognition research, there have been attempts to search images for specific semantic content. For instance, photographs have been scanned for images of faces, semantically defined as images containing eyes, nose and mouth features in a face-like orientation relative to each other. This approach successfully identified some human faces and some inhuman faces such as the front grill of an Edsel automobile and a smiley-face button. On the other hand, it missed human faces in three-fourths profile because of the foreshortening of the noses, and it picked up things which really don't look like faces at all, such as a public toilet with its seat in the upright position, by mistaking the horse-shoe shape of the seat for the mouth, the flusher handle for the nose, and the round rubber seat bumpers for the eyes. It made this mistake even though that picture included images such as a toilet stall wall, privacy door, tiling and plumbing. Similar attempts to recognize animals occurring in natural camouflage, such as striped tigers in dense foliage, resulted in even odder selections.
The current invention deals with the ambiguity and noise inherent in images by mapping them feature-by-feature to semantic trees, choosing the tree with greatest overall semantic efficiency. Thus, in the case of faces in three-fourths profile, the absence of a nose feature can be traded-off for the presence of an ear. In the case of the toilet seat, the presence of eyes, nose and mouth might be outweighed by the presence of background images such as a toilet stall wall, privacy door, tiling and plumbing. In general, redundant semantic information encoded in images can also be used to increase accuracy in image recognition applications.
The ability to deal with images, particularly images of people, would be most advantageous in conversational computer interfaces, in order to detect or confirm the emotional content of the conversation. For instance, some research has already been done to detect happiness versus sadness from the upward or downward curvature of images of mouths. However, this visual information needs to be correlated with emotional content within the text of conversation, which is also a kind of poetic information. This will identify what abstract motives are causing the emotional state of the conversation.
Researchers of cognition have already proposed taxonomies to describe emotions as intensity vectors within a multi-dimensional emotional space. There have also been proposals to describe emotional states hedonistically. In Wright's paper Reinforcement Learning and Animat Emotions, he proposes to quantify emotions as degrees of fulfillment (which he calls ‘credit assignment’) relative to internally defined motives. Wright defines emotional trade-offs in terms of exchanges of these credit assignments. He quotes research which describes grief in terms of motives “which have been postponed or rejected but nevertheless keeps resurfacing to disrupt attentive processing”.
Wright also posits an important research question: how can a system adapt to disturbances such as grief? He proposes that credit assignments can define adaptations to reduce the negative effects of grief. In his words, the optimization of credit assignments “implements a type of adaptation. This is not inductive learning of new hypothesis about a domain, but an ordering and reordering of the utility of control sub-states to dispositionally determine behavior. In a [semantic] classifier system, the circulation of value adaptively changes the ability of classifiers to buy processor power.” In the present invention, optimization for semantic efficiency helps to “order and reorder” all structures in order to “buy processor power” by optimizing for efficiency.
In the present invention, for the purpose of describing system fluidity (the ability to adapt), the hedonistic motive of semantic efficiency is balanced with the hedonistic motives of conversational curiosity and retention of spare capacity (free memory). By combining these motives into a single measure of credit assignment, called fluidity, the present invention provides a uniform basis for emotional trade-offs. At the same time, in contrast to prior art, the present invention avoids the complexity of deeply hierarchic semantics which emerge under small closed sets of abstract semantic motives. Instead, the present invention permits an open set of any number of abstract motives, accommodating new abstract motives in a topological measure of fluidity. Fluidity, as a purely topological concept, has equal relevance no matter what set of abstractions have been semantically defined, and thus can arbitrate any necessary changes to any abstractions defined by semantic networks.
Fluidity is also a useful quantity for defining emotional states. When fluidity is combined with other quantities, which will be discussed later, the combination results in a multi-dimensional vector which can be used as a basis for defining fundamental emotional states.
In prior art, researchers of cognition have postulated a number of theories defining emotions as ‘appraisal’ vectors in multi-dimensional ‘situation’ spaces. Kaiser and Wehrle wrote in their paper Emotion research and AI: Some theoretical and technical issues:
. . . appraisal theories of emotion make concrete predictions concerning what kind of convergence are important between the different antecedents of emotions. There exists a considerable degree of convergence between the different appraisal theories, especially with respect to the central dimensions postulated in different approaches. Some of the appraisal dimensions that are mentioned by most of the theories are e.g. novelty, pleasantness, desirability, controllability, power, expectedness, suddenness and intentionality.
Seeking a topological basis for appraisal dimensions, the present invention uses changes in fluidity as an appraisal dimension, and defines additional appraisal dimensions for specificity, extent, and outlook. These purely topological dimensions of specificity, extent, outlook, and change in fluidity together provide a four-dimensional vector space describing conversational emotion with equal relevance to any set of semantic abstractions. Using this universal emotional vector space, the present invention can characterize emotional shifts in terms of paths through that vector space, enabling it to detect emotional conditions such as empathy.
This ability to characterize empathy is used in the present invention to verify that conversation between independent semantic systems has succeeded in communicating awareness. People seek similar emotional clues when conversing with other people, clues which show that the other person empathizes with what they have said, based on some deeper mutual understanding. By providing a computable method for validating empathy, the present invention provides an efficient verification mechanism to be employed by independent semantic systems conversing across networks such as the Internet. This kind of verification is crucial for organizing large-scale interactions between independent computer systems, a socially significant need that Wright, in Reinforcement Learning and Animat Emotions, calls for as “autonomous agent architectures that integrate circulation of value learning mechanisms with more complex forms of motive management”.
In such endeavors, in order to verify that awareness has been accurately transmitted from one system to the other, the incremental changes in emotion for the receiving system can be compared to the incremental changes in emotion for the sending system, to verify that both systems exhibit similar shifts. A shift to sadness may occur, if for instance, a conversation activates a prior conversation which earlier went dormant because of its inefficient semantic structure. By bringing that conversation and its inefficient semantics into active conversation, the emotional state of the system loses semantic efficiency.
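The comparison of incremental emotional changes can be sketched in Python. The sketch assumes that each emotional state is a four-component vector ordered as (specificity, extent, outlook, change in fluidity), and it assumes that two shifts count as similar when the cosine of the angle between them meets a threshold of 0.8. The similarity measure, the threshold and the function names are illustrative choices, not details given in this section.

```python
import math


def shift(before, after):
    """Incremental change between two emotional state vectors."""
    return [later - earlier for earlier, later in zip(before, after)]


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two shifts; 0.0 if either shift is zero."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def shows_empathy(sender_before, sender_after,
                  receiver_before, receiver_after,
                  threshold: float = 0.8) -> bool:
    """True when both systems shifted in a similar direction (assumed test)."""
    sender_shift = shift(sender_before, sender_after)
    receiver_shift = shift(receiver_before, receiver_after)
    return cosine_similarity(sender_shift, receiver_shift) >= threshold


# Vectors are (specificity, extent, outlook, change in fluidity).
neutral = [0.0, 0.0, 0.0, 0.0]
sender_sad = [0.0, 0.0, -1.0, 0.0]        # sender's outlook drops
receiver_sad = [0.0, 0.0, -0.5, 0.0]      # receiver shifts the same way, more weakly
receiver_pleased = [0.0, 0.0, 0.5, 0.0]   # receiver shifts the opposite way

empathic = shows_empathy(neutral, sender_sad, neutral, receiver_sad)
mismatched = shows_empathy(neutral, sender_sad, neutral, receiver_pleased)
```

A mismatch of this kind is the signal that would prompt the sending system to investigate and keep transmitting, whereas a match suggests that awareness was communicated.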
Continuing with this example, if a computer system sent a story about an accident which was semantically inefficient, it would expect the receiver of the story to sense that inefficiency and to respond with some kind of negative emotion. However, if an unexpected positive emotion resulted, the system which sent the story could investigate why the unexpected emotion occurred. The transcript of this conversation, with emotions in parentheses, could look like this:
Sending system: In the last ten years, 542 people died in Boeing-747 crashes. Today another 98 died. (negative emotion)
Receiving system: With this information, I can now report reasonable statistics on Boeing versus Lockheed aircraft reliability. (positive emotion)
Sending system: You don't understand. Real people suffered in those crashes. (negative emotion)
Receiving system: What kind of suffering was caused? (slight negative emotion)
Sending system: Loss of loved ones and loss of property. (negative emotion)
Receiving system: Were these losses temporary? (slight negative emotion)
Sending system: They were permanent and highly significant. (negative emotion)
Receiving system: How can I help? (negative emotion)
This ability to verify the transmission of understanding and to continue transmitting until understanding has been emotionally verified provides a reliable way to forward complex story-based information across communities of systems. By transmitting copies of stories, knowledge can be distributed quickly, to eliminate bottlenecks caused by centralization.
For instance, when the Mars Pathfinder landed, digital copies of its pictures were immediately sent from NASA Internet nodes to mirror sites across the Internet. In this way, millions of people could download the pictures from a variety of nodes, rather than from the few NASA Internet nodes.
In a similar fashion, stories could be distributed, to eliminate the bottlenecks caused by centralized story-information at any one site on the Internet. The present invention provides a conversational method for distributing emotionally-verified semantic-meaning of stories across the Internet and other such networks.
The main object of the present invention is to support, in computer systems, conversational communications capabilities which humans commonly have. In the prior art of commercial computer systems, these capabilities have not been supported. These capabilities include:
1) the ability to understand the meaning of text which contains significant grammatical flaws,
2) the ability to acquire any number of new abstract concepts, integrating them into the body of prior acquired concepts,
3) the ability to identify the meaning which connects any two word symbols relative to the context of a conversation,
4) the ability to identify objects within a picture using information about the semantic context of a conversation,
5) the ability to identify the emotional content of a conversation, and
6) the ability to detect empathic emotional activity within a conversation.
The present invention defines new methods for supporting the above capabilities. These new methods include:
1) a method for topologically defining the efficiency of semantic inheritance trees,
2) a method for choosing the preferred semantic inheritance tree from a set of candidate inheritance trees, based on the efficiency of the semantic inheritance tree topologies,
3) a method for topologically defining the preferred path of meaning linking any two semantic nodes, relative to a conversational context,
4) a method for topologically defining primary emotional states for conversations active in a semantic network computer system, and
5) a method for detecting conversational emotional empathy in terms of similar conversational-emotional state-changes.
By applying the above new methods, a model of general-purpose intentions (motivations for activity) can be defined for commercial computer systems. By defining a set of general-purpose intentions, which motivate activity regardless of whether user input is available, the idle processor power of computers can be put to good use during the time computers are waiting for user inputs. This set of general-purpose intentions can motivate such tasks as optimizing the semantic network for efficiency, and archiving seldom-used semantic nodes to reclaim memory space.
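The relationship between user-driven work and the general-purpose intentions can be pictured with a small Python sketch. It assumes three priority levels (direct commands, scheduled tasks, idle work) and a scheduler that falls back to the general-purpose tasks named above whenever nothing else is pending. The class name `IntentionScheduler` and its methods are hypothetical.

```python
import heapq
import itertools


class IntentionScheduler:
    """Serves commands and scheduled tasks first; otherwise does idle-time work."""

    COMMAND, SCHEDULED = 0, 1  # lower number means higher priority

    def __init__(self):
        self._pending = []
        self._order = itertools.count()  # preserves arrival order within a priority
        # Idle-time tasks motivated by the general-purpose intentions.
        self._idle_tasks = itertools.cycle([
            "optimize semantic network for efficiency",
            "archive seldom-used semantic nodes",
        ])

    def submit(self, priority: int, name: str) -> None:
        heapq.heappush(self._pending, (priority, next(self._order), name))

    def next_task(self) -> str:
        if self._pending:
            return heapq.heappop(self._pending)[2]
        # No command or scheduled task is waiting, so use the idle processor time.
        return next(self._idle_tasks)


scheduler = IntentionScheduler()
scheduler.submit(IntentionScheduler.SCHEDULED, "disk storage backup")
scheduler.submit(IntentionScheduler.COMMAND, "respond to user command")
served = [scheduler.next_task() for _ in range(4)]
```

In this arrangement the processor is never left without work: the command and the backup are served first, and the remaining requests are answered with idle-time tasks.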