This application relates to database searches using Boolean queries and, more particularly, to a system allowing a user to examine and manipulate a Boolean query resulting from a natural language query.
Data processing systems are often used to search large collections of data known as databases. Such databases are stored, for example, on magnetic tape or disk and may be organized in many different ways. For example, some databases, known as structured databases, are organized into records, each record containing information elements or "attributes," such as a name, address, phone number, etc. In structured databases, each attribute has one of several possible values. For example, in a given record of the type discussed above, a "name" attribute may have a value of "Smith" and an "age" attribute may have a value of "32." An attribute value can also be "related" to other possible values of the attribute. For example, in the record whose "name" attribute has the value "Smith," the value of the "age" attribute, "32," is also less than the value "1000."
Other databases, known as textual databases, are designed to store and retrieve unstructured natural language texts, such as abstracts, mail messages, judicial opinions, or journal articles.
Commonly, a user accesses a textual database by typing a "Boolean query" having a format such as: ("COPYING" AND "BACKUP" AND "SAVESET" AND ("V5.1" OR "V5")).
Such a query is called a Boolean query because the terms in the query are related by a Boolean relationship. Software connected with the database uses this Boolean query to search the database for all items in the database that satisfy (or "match") the query. In the above example, a database retrieval system performing a search on this query would return all articles containing all of the words "COPYING," "BACKUP," and "SAVESET," and either the word "V5.1" or the word "V5."
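The matching behavior described above can be illustrated with a minimal sketch. It is not the retrieval method of any particular system: the nested-tuple query representation, the tokenizer, and the names QUERY, tokenize, matches, and search are all assumptions introduced here for illustration only.

```python
import re

# Hypothetical nested-tuple form of the example query:
# ("COPYING" AND "BACKUP" AND "SAVESET" AND ("V5.1" OR "V5"))
QUERY = ("AND", "COPYING", "BACKUP", "SAVESET", ("OR", "V5.1", "V5"))


def tokenize(text):
    """Return the set of upper-cased words in text.

    Assumption: a word is a run of letters and digits, optionally joined
    by inner dots, so that a version string such as "V5.1" stays whole.
    """
    return set(re.findall(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*", text.upper()))


def matches(query, words):
    """Evaluate a Boolean query against the set of words in one article."""
    if isinstance(query, str):
        # A bare term matches when the word occurs in the article.
        return query.upper() in words
    op, *operands = query
    if op == "AND":
        return all(matches(q, words) for q in operands)
    if op == "OR":
        return any(matches(q, words) for q in operands)
    if op == "NOT":
        return not matches(operands[0], words)
    raise ValueError(f"unknown Boolean operator: {op!r}")


def search(query, articles):
    """Return the titles of all articles that satisfy the query."""
    return [title for title, body in articles.items()
            if matches(query, tokenize(body))]
```

Under these assumptions, an article reading "Copying a backup saveset under V5.1." satisfies the query, whereas an article that mentions only "V4," or that omits the word "COPYING," does not.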
As the amount of storage capacity available for databases has increased, the method used to access a database and extract information has become increasingly complex. The increasing complexity of a Boolean query often forces a user to reformulate the query before retrieving the desired information.
In addition, the greater accessibility of databases has meant that less sophisticated users are now seeking access to database information. To accommodate such users, software developers have generated user interfaces that attempt to shield the user from the complexities of formulating a Boolean query. Systems currently in existence allow a user to frame queries in a form more akin to English language statements. Such English-like queries are called "natural language queries" and are automatically translated into Boolean queries by the system.
Although natural language queries allow unsophisticated users to access a database without having to formulate complex Boolean queries, problems sometimes arise when the system does not translate the natural language query into a useful Boolean query. Most natural language query systems perform this translation invisibly to the user. A user inputs a natural language query, which the system uses to select items from the database. Because the user is unaware of the method used to convert the natural language query into a Boolean query, he does not always know whether the conversion was performed correctly, completely, or adequately. In fact, the user of such systems is usually unaware of the Boolean query resulting from the original natural language query.
It would be desirable to allow a user to view how a data processing system has translated a natural language query into a Boolean query and to allow the user to reformat the Boolean query to reflect his true intentions. Such a system would still relieve the user from having to formulate a Boolean query "from scratch," and would allow the user to fine-tune a system-generated query.
It would also be desirable to display an automatically-generated Boolean query in a way that made its purpose intuitively obvious to even an unsophisticated user. Such a display might omit Boolean terms such as "AND," "OR," and "NOT" and, instead, indicate a relationship between the terms of the query graphically.