The present invention relates to computerized information-retrieval (IR) systems and methods, and more particularly to computerized information-retrieval systems and methods used with textual databases.
In prior art IR systems and methods, the user often has to proceed iteratively, usually using a special query language: beginning with an initial query, scanning the results of that query (that is, the documents that were retrieved in response to that query), modifying the query if the results are unsatisfactory, and repeating these steps until satisfactory results are obtained. The user is responsible for modifying or refining the query and typically receives little or no help from the system in query construction or refinement.
Prior art systems cannot locate or highlight phrases within documents if the phrases are not found in the input query. For example, in response to a user query regarding what film pits Humphrey Bogart against gangsters in the Florida Keys, prior art IR systems would accept from the user a query containing words such as "film," "Bogart," "gangster," and "Florida Keys," and would search for the co-occurrence of these words within single documents. It would be left to the user to piece together the various documents thus retrieved to determine the correct answer.
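The co-occurrence search described above can be illustrated with a minimal sketch. The function name `cooccurrence_search`, the three-document toy corpus, and the use of case-insensitive substring matching are all assumptions made for illustration; they are not drawn from any particular prior art system.

```python
def cooccurrence_search(documents, terms):
    """Return ids of documents whose text contains every query term.

    Matching is a case-insensitive substring test, which is an
    illustrative simplification, not a feature of any specific system.
    """
    lowered_terms = [term.lower() for term in terms]
    hits = []
    for doc_id, text in documents.items():
        lowered_text = text.lower()
        # A document matches only if all query terms co-occur in it.
        if all(term in lowered_text for term in lowered_terms):
            hits.append(doc_id)
    return hits


# Hypothetical toy corpus, invented for this example.
documents = {
    "doc1": "A film review: Bogart confronts a gangster at a hotel in the Florida Keys.",
    "doc2": "Bogart appeared in more than one gangster film.",
    "doc3": "The Florida Keys are a chain of islands.",
}

query_terms = ["film", "Bogart", "gangster", "Florida Keys"]
matches = cooccurrence_search(documents, query_terms)
print(matches)
```

In this sketch only `doc1` contains all four terms, so it is the only document returned. Nothing in the result identifies a film title or any other phrase absent from the query; reading the retrieved text to find the answer remains the user's task.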
Attempts have been made in the prior art to allow users to formulate queries in natural language. One such attempt is found in systems called question-answering systems. A question-answering system can respond to certain natural-language queries, but only because the data in its database are heavily preprocessed to accommodate such queries. For example, a question-answering system designed to respond to user queries about moon rocks will typically employ a database configured as a two-dimensional array, with moon-rock sample numbers arrayed along one dimension and fact categories, such as ore content of the rocks, arrayed along the other dimension. The system responds to user queries simply by accessing the appropriate cell of the array.
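The array-based lookup just described might be sketched as follows. The sample numbers, the fact categories other than ore content, and the cell values are placeholders invented for illustration, and the function name `answer` is hypothetical.

```python
# Hypothetical sample numbers along one dimension of the array.
SAMPLES = ["S-001", "S-002"]

# Fact categories along the other dimension ("mass" is an assumed example).
CATEGORIES = ["ore content", "mass"]

# Preprocessed two-dimensional array; every cell value is a placeholder.
TABLE = [
    ["ore-value-1", "mass-value-1"],
    ["ore-value-2", "mass-value-2"],
]


def answer(sample, category):
    """Answer a query by reading the cell for (sample, category)."""
    row = SAMPLES.index(sample)        # raises ValueError for an unknown sample
    col = CATEGORIES.index(category)   # raises ValueError for an unknown category
    return TABLE[row][col]


print(answer("S-002", "ore content"))
```

The sketch makes the limitation concrete: a question can be answered only if it maps onto a row and a column that were prepared in advance, so no answer is available for anything outside the preprocessed array.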
Prior art IR systems and methods do not automatically perform a sequence of queries designed to include the optimal query or queries needed to answer a user's natural-language question. Furthermore, prior art IR systems and methods neither guess at the answer to the user's question nor attempt to manipulate or process queries based on such a guess. They simply match text.