The present invention relates generally to computer database systems and more particularly to a user interface which facilitates natural language communication between a user and a database by providing an easy way for a user to remove ambiguities from a natural language query of the database.
Computer database systems are being used to store and manage more and more different kinds of data. Airline reservation systems, bank account information systems, and computerized library card indices are but a few examples of such systems. Many of these systems are now becoming available for use by the general public. For example, until recently airline reservations could only be booked by trained reservation clerks, but now anyone having a home computer and a modem can have direct access to airline reservation computers. Similarly, many public libraries are placing computer terminals about the library for use by library patrons in finding books and other references.
A common problem in learning to use computers is that computers require strict compliance with a set of precise grammatical rules governing communication between the computer and the user. The SQL database language is an example of such a set of rules. This kind of precise communication is very unnatural for humans because most people are accustomed to speaking and writing in "natural language", that is, everyday spoken English which is often ungrammatical and ambiguous but which nevertheless is easily understandable by other humans.
As long as computers were used only by a relatively small number of specialists, this need to use precise rules in communicating with computers was not perceived as a major problem. Humans are perfectly capable of learning to use languages such as SQL to communicate with computers. Persons such as airline and travel agency reservation clerks were trained to use such languages to communicate with computers, and these trained specialists then took care of user-computer communication for everyone else.
As computer database systems become more widely accessible, more people wish to use them. In addition, many people are finding that they no longer have any choice--for example, libraries are abolishing the traditional card index in favor of the computerized index, and patrons of such libraries can no longer locate books except by using the computerized index system.
Most people have neither the time nor the desire to learn computer languages. Accordingly, a need has arisen for computer database systems that are usable by persons having no formal training in computer database languages. Computers, in other words, must be "user friendly" to all comers, and to this end much effort is being devoted to designing computer systems that can communicate with people in "natural language", that is, in ordinary human speech.
One of the problems of designing a computer system that can communicate in natural language is interpreting ambiguous queries. Consider the following natural language dialogue between a user and a computer database system:
USER: Which sales staff report to the Head Office?
COMPUTER: Alecia Andrews, Bill Bronson, Carl Clemson.
USER: Which sales staff visited Pinewood in April?
COMPUTER: Marcia Martin, Nancy Novarro, Oscar Ottoman.
USER: Which of these live in London?
The word "these" in the third query is ambiguous. Does "these" refer to persons who report to the head office, or to those who visited Pinewood, or to both? Using the word "these" creates a referential ambiguity, that is, an ambiguous reference to a previous item of information.
A referential ambiguity can be resolved by asking the user for more information, for example as follows:
USER: Which sales staff report to the Head Office?
COMPUTER: Alecia Andrews, Bill Bronson, Carl Clemson.
USER: Which sales staff visited Pinewood in April?
COMPUTER: Marcia Martin, Nancy Novarro, Oscar Ottoman.
USER: Which of these live in London?
COMPUTER: I cannot understand the question. "These" is ambiguous. Please rephrase your question or press "?" for help in using the system.
USER: Which of those who visited Pinewood?
COMPUTER: I cannot understand the question. "Those" is ambiguous. What do you want to know about "those who visited Pinewood"? Please rephrase your question or press "?" for help in using the system.
USER: (exasperated) Which of the sales staff who visited Pinewood in April live in London?
COMPUTER: Nancy Novarro and Oscar Ottoman.
Such a method of resolving ambiguities can be highly irritating to a user and often results in much wasted time as the user tries to formulate unambiguous questions. To avoid this problem, various attempts have been made to develop natural language processing technologies that can automatically resolve such ambiguities. However, these methods generally involve complex reasoning about the user and the information in the database and have not adequately solved the problem.
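The referential ambiguity in the dialogue above can be made concrete with a minimal sketch. It is illustrative only and is not a method described in this document: the data structures and function names are hypothetical, the two result sets are taken from the dialogue, and the assumption that Alecia Andrews lives in London is added solely so that the possible readings of "these" yield different answers.

```python
from itertools import combinations

# Result sets of the two earlier queries, as given in the dialogue.
history = {
    "report to the Head Office": {"Alecia Andrews", "Bill Bronson", "Carl Clemson"},
    "visited Pinewood in April": {"Marcia Martin", "Nancy Novarro", "Oscar Ottoman"},
}

# The dialogue confirms Novarro and Ottoman live in London.
# ASSUMPTION for illustration only: Andrews is added so the readings differ.
lives_in_london = {"Nancy Novarro", "Oscar Ottoman", "Alecia Andrews"}


def candidate_referents(history):
    """Return every reading of "these": each earlier result set and each union of them."""
    readings = {}
    labels = list(history)
    for size in range(1, len(labels) + 1):
        for combo in combinations(labels, size):
            people = set().union(*(history[label] for label in combo))
            readings[" + ".join(combo)] = people
    return readings


def answer_under_each_reading(history, matching):
    """Answer the follow-up query once per possible referent of "these"."""
    return {
        label: sorted(people & matching)
        for label, people in candidate_referents(history).items()
    }


answers = answer_under_each_reading(history, lives_in_london)
for label, people in answers.items():
    print(f"these = {label}: {people}")
```

Under these assumptions the same question has three different answers, one per reading, which is why the system cannot answer "Which of these live in London?" without first learning which earlier result the user means.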
It will be apparent from the foregoing that there is a need for a user-friendly way to resolve a referential ambiguity in a natural language query of a computer database system.