1. Field of the Invention
The invention concerns information retrieval generally. More specifically, the invention concerns the use of a knowledge base to integrate multiple sources of information into one uniform view.
2. Description of the Prior Art
People have been collecting information for a long time. Even the ancient world had its libraries and archives. In our day, we have astronomical amounts of information stored in libraries, archives, and data base systems around the world. In the relatively recent past, we have begun to connect the data base systems to communications networks, so that a user at a workstation anywhere in the world can quickly access information in a data base system anywhere else in the world.
Ever since people began collecting information, they have had two problems with their collections:
Keeping track of the physical locations of items of information in the collection; and
Imposing some kind of conceptual organization on the information.
In the context of a traditional library, the conceptual organization is provided by the cataloging system; for example, in the Dewey decimal cataloging system, each subject area has a number and all of the books about the subject area have that number. Keeping track of the physical locations is done by giving each number in the cataloging system a place on the shelves and putting the books having the number in that place. Maps and labels on the shelves tell users of the library where to look for a book. Finding a book thus involves going to the card catalog, looking up the subject category to find the catalog number, and then using the map to find the shelf where books having that catalog number are stored.
As the size of a collection of information increases, it becomes more and more difficult to get from a concept to the physical location of the information. For example, many very large traditional libraries do not permit ordinary users to go to the shelves and get a book. Instead, the user looks the book up in the card catalog and writes the title, author, and catalog number on a request slip. A specialist in finding books on the shelves then goes and gets the book for the user. A major disadvantage of such a system is that it does not permit the user to look up one book on a subject in the card catalog and then go to the shelf and browse to see what else is there.
While data base systems and networks have enormously increased the accessibility of information, they have made the problems of keeping track of the physical location and imposing a conceptual organization even more difficult. Keeping track of the physical location now involves not only knowing which of the enormous number of interconnected collections of information contains the information the user wants, but also knowing what sequences of commands (or protocols) are required to access the information over the network. Imposition of a conceptual organization has also become more difficult. Unlike human librarians, computers cannot deal directly with concepts. For example, a computer is helpless with a request like "tell me everything you know about Napoleon's youth", since it has no idea either that Napoleon is a historical person or what period of time could reasonably be termed his "youth". Before the computer can do anything, the request must be broken down so that the computer searches for the right Napoleon in a historical data base instead of a cooking data base and searches over the span of time which makes up the first 21 years of that Napoleon's life.
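The breakdown described above can be illustrated with a minimal sketch. It assumes the computer has been given, in advance, a record for each person (the data base the person belongs to and a birth year) and a fixed rule that "youth" means the first 21 years of life. The names `Person`, `KNOWN_PEOPLE`, `YOUTH_YEARS`, and `break_down_youth_request` are hypothetical and appear only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str        # full name used when searching
    data_base: str   # which data base holds information about this person
    birth_year: int


# Hypothetical background facts; without them the request cannot be resolved.
KNOWN_PEOPLE = {
    "Napoleon": Person("Napoleon Bonaparte", "history", 1769),
}

# Assumption taken from the text: "youth" is the first 21 years of life.
YOUTH_YEARS = 21


def break_down_youth_request(name):
    """Turn "everything about <name>'s youth" into a concrete search plan."""
    person = KNOWN_PEOPLE[name]
    return {
        "data_base": person.data_base,   # historical, not cooking
        "subject": person.name,
        "from_year": person.birth_year,
        "to_year": person.birth_year + YOUTH_YEARS,
    }


plan = break_down_youth_request("Napoleon")
print(plan)
```

The point of the sketch is that every field of the plan comes from knowledge supplied to the computer beforehand; none of it is present in the words of the request itself.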
One technique which is now being used to impose an organization is to interpose a knowledge base system between the user and the data base systems which contain the information. In this technique, the conceptual organization of the information is provided by the knowledge base. Queries involving concepts are made to the knowledge base, which translates them into the commands needed to reference the data base system. See for example European Patent Application 0 542 430 A2, Alexander Borgida and Ronald Brachman, Information Access Apparatus and Methods, published May 19, 1993.
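As a rough illustration of this technique, the following sketch shows a very small knowledge base that maps a concept onto the command needed to reference a data base. It is not the method of the cited application; the concept name, table name, column names, and the function `translate` are all hypothetical, and SQL is assumed as the command language only for concreteness.

```python
# Hypothetical knowledge base: each concept records where its instances are
# stored and what condition picks them out of that table.
CONCEPTS = {
    "historical_person": {
        "table": "persons",
        "condition": "kind = 'historical'",
    },
}


def translate(concept, name):
    """Translate a query about a concept into a parameterized SQL command."""
    entry = CONCEPTS[concept]
    sql = (
        f"SELECT * FROM {entry['table']} "
        f"WHERE {entry['condition']} AND name = ?"
    )
    # The name is passed as a parameter rather than spliced into the command.
    return sql, (name,)


command, params = translate("historical_person", "Napoleon Bonaparte")
print(command)
print(params)
```

The user works only with the concept; the table and condition that realize it stay inside the knowledge base and can change without affecting the user's queries.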
Attempts are also being made to build information retrieval systems which employ knowledge bases not only to impose a conceptual organization, but also to access information across a network. One example of such a system is that being built by the SIMS project, described in Yigal Arens and Craig A. Knoblock, "Planning and Reformulating Queries for Semantically-modeled Multidatabase Systems", in: Proceedings of the First International Conference on Information and Knowledge Management, Baltimore, Md., 1992. Problems left unsolved by these attempts include efficient location of the relevant information sources and the manner in which the system represents its knowledge about the location of the information. It is an object of the present invention to solve these and other problems and thereby to provide more efficient and usable information access methods and apparatus.