The invention relates to computer database systems and more specifically to distributed computer database systems.
A large variety of computer software tools have been developed to assist people in their day-to-day activities, such as word processors, spreadsheets, and scheduling software. Most of these software tools require direct interaction with a person to activate their functionality. A new class of software tool is being developed that assists people without requiring direct interaction. One example is software that reminds an individual about an appointment that is about to take place. Another is software that examines a variety of news sources and notifies the user when a relevant article becomes available.
A more sophisticated example would be software that not only examines news sources, but also extracts the important features in each article. For example, such software might be watching for announcements of changes in an interest rate. Based on the relevant extracted features, the software would automatically negotiate with other software that is following the stock prices of companies in which the user has an investment, possibly resulting in a stock trade. The user would only be notified after the actions have taken place.
Software of the kind described above is called “agent software” or simply an “agent”. An agent is a computer executable program or software artifact whose purpose is to act as a surrogate (or assistant) to a human or another software artifact. More precisely, a software agent is a software-based computer system with the following properties:
1. Autonomy. An agent operates in the “background” as an independent process with control over its actions and state.
2. Social ability. An agent can interact with other agents (and humans).
3. Reactivity. An agent examines its “environment” and responds to changes that occur in it.
4. Pro-activeness. An agent can engage in behavior directed to a goal.
One example of a software agent is an autonomous query for documents on the World Wide Web (WWW), which “crawls” around the Internet looking for documents that satisfy the query. When a document satisfying the query is found, the user is notified and the document is presented.
Agents are currently implemented as independent software systems, each of which is fully responsible for all of its functionality, such as examining its environment, detecting relevant changes, and finding and establishing communication with other agents. Such a design does not scale well to millions of agents, all of which would be independently examining the environment. For example, in a military system, one might assign an agent to represent every entity of interest in the entire theatre of operations: soldiers, weapons, targets, etc. The number of agents in such a system will be very large, yet the response time requirements of the system are severe.
An agent's environment typically is composed of a stream of information objects, such as images, sound or video streams, as well as traditional data objects such as text files and structured documents contained, for example, in facilities located on a network. As is known in the art, agents can be launched over the network, e.g., the Internet or an intranet, by a search engine to examine the network's environment. The search engine provides each agent with a defined “interest”, i.e., a set of objects, typically specified by a user's query. The agent attempts to match its interest with the objects in the environment. The agents return any matched environment objects to the search engine, which then determines the relevance of the returned environment objects with respect to the queries. The environment that the agents are examining can be static, in which case the environment constitutes a library of preexisting content, or dynamic, in which case the content is a steadily changing object stream, e.g., a news feed.
Further information can be had regarding some of the concepts discussed herein with reference to the following publications:
1 L. Aiello, J. Doyle, and S. Shapiro, editors. Proc. Fifth Intern. Conf. on Principles of Knowledge Representation and Reasoning. Morgan Kaufmann Publishers, San Mateo, Calif., 1996.
2 K. Baclawski. Distributed computer database system and method, December 1997. U.S. Pat. No. 5,694,593. Assigned to Northeastern University, Boston, Mass.
3 A. Del Bimbo, editor. The Ninth International Conference on Image Analysis and Processing, volume 1311. Springer, September 1997.
4 N. Fridman Noy. Knowledge Representation for Intelligent Information Retrieval in Experimental Sciences. PhD thesis, College of Computer Science, Northeastern University, Boston, Mass., 1997.
5 R. Jain. Content-centric computing in visual systems. In The Ninth International Conference on Image Analysis and Processing, Volume II, pages 1-13, September 1997.
6 Y. Ohta. Knowledge-Based Interpretation of Outdoor Natural Color Scenes. Pitman, Boston, Mass., 1985.
7 G. Salton. Automatic Text Processing. Addison-Wesley, Reading, Mass., 1989.
8 G. Salton, J. Allen, and C. Buckley. Automatic structuring and retrieval of large text files. Comm. ACM, 37(2):97-108, February 1994.
9 A. Tversky. Features of similarity. Psychological Review, 84(4):327-352, July 1977.
10 M. Wooldridge and N. Jennings. Intelligent agents: Theory and practice. Knowledge Engineering Review, 10(2):115-152, 1995.
The disclosures of the publications referenced in this “Background of the Invention” are incorporated herein by reference.
While conventional search engines employing agents are generally suited to their intended purposes, certain drawbacks and limitations on their operation may constrain their use in certain applications. For example, networks such as the Internet are experiencing significant rates of growth in terms of both number of sites and number of users. For search engines to operate successfully in the near future, they will need to scale to accommodate that growth, perhaps employing tens of millions of agents or more. If all such agents are visiting Internet sites, it can be readily appreciated that latencies will increase and perhaps reach unacceptable levels. The same may become true for large private networks. It would be desirable to provide a search system that scales to accommodate such growth without an appreciable drop in performance.
The invention resides in a distributed computer database system connected to a network, e.g., the Internet or an intranet, which indexes the interests of agents that have registered with the system, examines information objects, for example, those that reside on the network, and, responsive to a match with the registered agents' interests, specifies to the agents the relevant information objects. Consequently, the distributed computer database system performs searches on the network, instead of the agents individually performing the searches themselves as in prior art approaches. As a result, the present invention can support very large numbers of agents seeking information from the environment (i.e., the network). Each agent must register with the system by specifying a query that determines the changes in the environment in which the agent has an interest, and the system provides the agent with information relevant to that query.
More specifically, the invention can be implemented as a distributed computer database system connected to a network, which includes an examination node for examining the network's environment, an index node for indexing agents' interests, and an agent node for storing agents or their locations (e.g., URLs), which collectively form a processing engine for software agents. A software agent is a computer software system that acts autonomously, perceives its environment, interacts with other agents and can engage in behavior directed toward the fulfillment of a goal. Each agent registers its interest with the processing engine. An agent's interest is an object in the same format as the objects that are examined by the processing engine. The processing engine examines an object in its environment with examination nodes. To examine an object, the examination node begins by extracting features from the object, fragmenting the features, and hashing these fragments. Each hashed fragment is transmitted to one index node on the network. Each index node on the network that receives a hashed fragment uses the hashed fragment of the object to perform a search on its respective partition of the database. The results of the searches of the local databases are gathered by the examination node. The gathered results are used to identify those agents for which the object is relevant to the interest of the agent. If it is determined that the object is relevant to an agent, the agent is notified and given access to the object. An agent can locate other agents regarding an object in the search results or for another reason by sending an agent communication message to an examination node. The agents that are located by the examination node are then notified and can communicate with the requesting agent. In this way agents can establish communication with one or more other agents.
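The extract, fragment, hash and route steps can be illustrated with a minimal Python sketch. It is not the claimed implementation: the word-based feature extractor, the three-character fragments, the use of SHA-256, the cluster size NUM_INDEX_NODES and all function names are assumptions made only for illustration. The one element taken from the description is that a portion of each hashed fragment selects the index node that receives it.

```python
import hashlib

NUM_INDEX_NODES = 4  # assumed cluster size; the source does not specify one


def extract_features(text):
    """Toy feature extractor (assumption): each lowercase word is a feature."""
    return text.lower().split()


def fragment(feature, size=3):
    """Split a feature into overlapping character fragments (assumed scheme)."""
    if len(feature) <= size:
        return [feature]
    return [feature[i:i + size] for i in range(len(feature) - size + 1)]


def hash_fragment(frag):
    """Hash a fragment to a 64-bit integer; SHA-256 is an assumed choice."""
    digest = hashlib.sha256(frag.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def index_node_for(hashed):
    """Use a portion of the hash as the addressing index of an index node."""
    return hashed % NUM_INDEX_NODES


def route_object(text):
    """Return, for each index node, the hashed fragments it would receive."""
    routed = {node: set() for node in range(NUM_INDEX_NODES)}
    for feature in extract_features(text):
        for frag in fragment(feature):
            hashed = hash_fragment(frag)
            routed[index_node_for(hashed)].add(hashed)
    return routed
```

Because the target node is derived from the hash alone, an agent's registered interest and a later examined object that share a fragment are always sent to the same index node, which is what lets each node search only its own partition.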
In another aspect of the invention, a distributed computer database system includes one or more computer nodes, including examination, indexing and agent nodes, interconnected by a network, which operate as a routing search engine. A user wishing to register an agent with the routing search engine sends a request to an examination node. The examination node assigns an agent identifier (AID) to the agent. The request for registration includes an information object that determines which objects encountered in the environment are of interest to the agent. In a manner similar to that described above, the agent's information object is fragmented, hashed and transmitted to the index nodes, which store data relating the hashed fragment to the AID of the agent. The examination node also transmits any additional agent information, such as its location (e.g., its URL), to one of the agent nodes as determined by the AID of the agent.
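The registration flow can be sketched in Python as follows. This is a simplified, single-process illustration under stated assumptions: the in-memory dictionaries standing in for index nodes and agent nodes, the node counts, the sequential AID counter, the one-hash-per-word helper hashed_fragments and the function name register_agent are all hypothetical and do not come from the source.

```python
import hashlib
from itertools import count

NUM_INDEX_NODES = 4  # assumed; not specified in the source
NUM_AGENT_NODES = 2  # assumed; not specified in the source

# Each index node maps a hashed fragment to the AIDs that registered it.
index_nodes = [dict() for _ in range(NUM_INDEX_NODES)]
# Each agent node maps an AID to additional agent information.
agent_nodes = [dict() for _ in range(NUM_AGENT_NODES)]

_next_aid = count(1)  # assumed AID scheme: sequential integers


def hashed_fragments(interest):
    """Toy stand-in for extraction, fragmentation and hashing: one hash per word."""
    return {
        int.from_bytes(hashlib.sha256(word.encode("utf-8")).digest()[:8], "big")
        for word in interest.lower().split()
    }


def register_agent(interest, location):
    """Assign an AID, index the interest, and store the agent's location."""
    aid = next(_next_aid)
    for hashed in hashed_fragments(interest):
        # A portion of the hash selects the index node that stores the relation.
        node = index_nodes[hashed % NUM_INDEX_NODES]
        node.setdefault(hashed, set()).add(aid)
    # The agent node is determined by the AID of the agent.
    agent_nodes[aid % NUM_AGENT_NODES][aid] = {"location": location}
    return aid
```

For example, registering two agents whose interests both contain the word "rate" leaves a single index node holding both AIDs under the same hashed fragment, so one lookup later retrieves every interested agent.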
For purposes of performing a search, the routing search engine according to this aspect of the invention employs the examination nodes to examine the environment. Examination nodes can include, e.g., Web “crawlers,” database scanners, agent registration servers and agent communication servers. When an examination node examines an object in the environment, it extracts the features of the object. Each object feature is then fragmented into feature fragments and these are hashed. A portion of each hashed feature fragment is used by the examination node as an addressing index for identifying one of the index nodes on the network to which the examination node transmits the hashed object feature fragment. Each index node on the network that receives a hashed object feature fragment uses the hashed object feature fragment to perform a search on its respective database. Index nodes finding data corresponding to the hashed object feature fragment return the AIDs of the agents that have registered an interest in this feature with the routing search engine. Such AIDs are then gathered by the examination node and a similarity function is computed based on the features that are in common with the object as well as the features that are in the object but not registered by the agent. The similarity function is used to determine whether an agent is to be notified. The AIDs of the agents to be notified are transmitted by the examination node to the agent nodes. A portion of each AID is used by the examination node as an addressing index for identifying one of the agent nodes on the network to which the examination node transmits the AID and object information. Each agent node on the network that receives an AID and object information uses the AID to perform a search on its respective database. Each agent node that finds data in its database corresponding to the AID uses this data to notify the agent that an object has been encountered.
The agent node also transmits the object information to the agent. The agent may perform processing, e.g., formatting, of the information and may notify the user who “owns” the agent, e.g., who submitted the query.
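The gathering and notification decision can be sketched in Python as shown here. The source does not give a formula for the similarity function, so the ratio below is an assumption: it rewards features in common and penalizes features that are in the object but not registered by the agent, weighted by a hypothetical parameter alpha. The threshold, the merged index dictionary (standing in for the results returned by all index nodes) and the function names are likewise illustrative only.

```python
from collections import Counter


def gather_aids(index, object_hashes):
    """Count, per AID, how many of the object's hashed fragments the agent registered."""
    common = Counter()
    for hashed in object_hashes:
        for aid in index.get(hashed, ()):
            common[aid] += 1
    return common


def similarity(common, object_total, alpha=0.5):
    """Assumed ratio: common features versus object features the agent did not register."""
    object_only = object_total - common
    return common / (common + alpha * object_only)


def agents_to_notify(index, object_hashes, threshold=0.5, alpha=0.5):
    """Return the sorted AIDs whose similarity to the object meets the threshold."""
    total = len(object_hashes)
    common = gather_aids(index, object_hashes)
    return sorted(
        aid for aid, shared in common.items()
        if similarity(shared, total, alpha) >= threshold
    )


def group_by_agent_node(aids, num_agent_nodes=2):
    """Use a portion of each AID to pick the agent node that will notify the agent."""
    groups = {}
    for aid in aids:
        groups.setdefault(aid % num_agent_nodes, []).append(aid)
    return groups
```

Only agents that share at least one hashed fragment with the object ever appear in the gathered results, so the similarity computation is limited to candidate agents rather than to every registered agent.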