1. Field of Invention
This invention is related to searching for information, specifically to an architecture for searching for information contained within a computer system or within a plurality of interconnected computers. The present invention is embodied in a network architecture, a system, a method and a computer program product for searching for information contained within a plurality of interconnected computers.
2. Description of Related Art
As usage of networked computers continues to increase rapidly, the volume of accessible information grows at an ever-increasing pace. However, such an information explosion makes it ever harder for users to find what they are looking for. In particular, certain fundamental problems exist in the current Web that make the information retrieval process lengthy, inflexible and human-dependent:
Only the computers at the network edges participate in content storage and user interactions. Information retrieval paths (and thus delays) are usually long (e.g., a user has to go through several possibly remote search engines before reaching the site of interest).
Because documents generally have no semantic structures, information retrieval processes are hard to automate. Users must manually sift through the search results, many of which are irrelevant.
The commercial operational model of the search engines and portal sites dictates the lack of an open, standard interface among the competitors. This means that it is virtually impossible for users to specify customized searches (e.g., searches which aim at specific categories with special requirements on category related fields).
Given the “crawling” operation model of the search engines, the coverage of any single engine is greatly limited. It has been reported in the literature that less than thirty percent of the material available on the Web has been accessed by search engines. Further, it is extremely time-consuming, if not impossible, for these engines to respond to information updates, resulting in a large percentage of stale links in their indices.
FIG. 1 illustrates the current search operational model. By way of example, the current search operational model treats the Internet as a packet routing black box and puts most of the intelligence outside the network. A client explicitly requests an information item in the form of a Uniform Resource Locator (hereinafter “URL”).
As shown in FIG. 1, multiple client stations are linked across the Web via routers (not shown) to various servers. The servers contain information that is desired by the client stations. However, a search engine is usually necessary for the client stations to discover the information that is contained on the servers.
Search engines and portals such as AltaVista®, Yahoo®, and Infoseek® are used as the major means to find information on the Web. These search engines work as indexing databases that are built through exhaustive probing of content providers. Such an operation model, although it successfully helped the Internet reach its current status, has proven inadequate for discovering information. As discussed above, a more desirable search operational model should have the necessary structure and mechanisms for information routing.
Referring to FIG. 2, a conventional routing table is built upon reachability information that is relatively stable and manageable. As a data packet 10 traverses the network, the routing table at each router is consulted. When the packet reaches the first router 11, the router reads the packet header and consults its routing table 16 for the appropriate link. This process is repeated at the second router 13. The data packet 10 provides no information other than its destination, and the router nodes take no action other than to route the data packet to its desired destination.
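By way of illustration only, the destination-based forwarding described above might be sketched in Python as follows. The names `Packet`, `Router`, `forward` and `max_hops`, and the string labels used for nodes, are assumptions introduced for this sketch and do not appear in the figures.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # The destination is the only information the routers use.
    destination: str
    payload: bytes = b""


class Router:
    def __init__(self, name, routing_table):
        self.name = name
        # Maps a destination address to the name of the next hop.
        self.routing_table = routing_table

    def next_hop(self, packet):
        # Read the packet header and look up the appropriate link.
        return self.routing_table[packet.destination]


def forward(packet, routers, start, max_hops=16):
    """Follow routing tables hop by hop and return the path taken."""
    path = [start]
    current = start
    # Stop when the current node is not a router (the packet was delivered)
    # or when the hop limit is reached (guards against routing loops).
    while current in routers and len(path) <= max_hops:
        current = routers[current].next_hop(packet)
        path.append(current)
    return path
```

For example, with two routers whose tables both know only how to reach a hypothetical `"server-A"`, `forward(Packet("server-A"), routers, "router-11")` walks from the first router to the second and then to the server. Nothing in the packet describes what information is wanted, which is the limitation the passage above points out.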
Generally, search engines operate by constructing a large database containing links to documents or pages present on servers connected to the Internet. To reach a particular page or document, one must follow the links from a particular starting point. The path to a particular page or document is lost if a link in the path is inoperable or otherwise removed. Building a table of reachable information at each node is impossible not only in terms of space, but also in terms of time: by the time such a table is constructed, the information may already have become obsolete.
If an information provider wants to inject information into the Internet, the location of that information must be registered with a search engine in order for the search engine to return that information. The search engine, however, may overlook that particular information due to the sheer size of the Internet, or the search engine may refuse to register the information in its database for some reason.
A more flexible search operational model utilizing more effective and more efficient information discovery schemes is desired. Ideally, a new search operational model operates as follows:
Users specify their interests to the system at different levels of complexity and possibly in domain-specific detail.
Such specifications are then automatically directed to the relevant information locations.
The retrieved information may optionally be processed by user supplied operations such as comparison, combination or other more complicated decision making methods.
Finally, the results are sent back to the user for display.
The essence of this ideal model is a concept called “information routing.” In short, “information routing” is an extension to the traditional network routing functions with information discovery mechanisms. An “information routing” capable network thus can route not only packets with specific destination addresses, but those that only contain the specifications of their interests.
The present invention has been made in view of the above circumstances and has an object to overcome the above problems and limitations of the prior art, and has a further object to provide the capability to search a plurality of interconnected computers and the files stored thereon, and to provide a search response tailored to the search request.
Additional objects and advantages of the present invention will be set forth in part in the description that follows and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
It is a further object of the invention to provide a method of information routing, the method comprising a step of providing a plurality of information provider nodes, wherein each node has a plurality of stored information thereon, and providing a plurality of information requestor nodes. The method further provides a plurality of active network nodes that are connected to a subset of the information provider nodes, a subset of the information requestor nodes and a subset of the active network nodes. The method further includes creating content ontology instance trees and query ontology instance trees on each of the active network nodes. The method includes injecting information content packets that contain content data and content routing data to the stored information located on the information provider node that injected the information content packet. The method further includes modifying content ontology instance trees at active network nodes traversed by information content packets by inserting content data and content routing data contained within the information content packets into a branch of the content ontology instance trees.
Next, the method further includes injecting information query packets containing query data and query routing data to the information requestor node that injected the information query packet, and the method includes modifying query ontology instance trees at active network nodes traversed by information query packets by inserting query data and query routing data contained within the information query packets into a branch of the query ontology instance trees. The method also includes establishing an information route between an information requestor node and an information provider node, the information route being created after an information query packet searches a content ontology instance tree at an active network node, thereby reaching an information provider node, or after an information content packet searches a query ontology instance tree at an active network node, thereby reaching an information requestor node.
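By way of illustration only, an ontology instance tree into whose branches data and routing data are inserted, and which a packet of the opposite kind later searches, might be sketched in Python as follows. The representation of a branch as a tuple of category labels, the class name `OntologyInstanceTree`, and the methods `insert`, `search` and `collect` are assumptions made for this sketch; the summary above does not prescribe a tree layout or a matching rule.

```python
class OntologyInstanceTree:
    """A tree whose branches are category labels (an assumed layout)."""

    def __init__(self):
        self.children = {}  # label -> subtree
        self.entries = []   # (data, routing data) pairs stored at this branch

    def insert(self, path, data, route):
        # Walk down the branch named by `path`, creating nodes as needed,
        # then record the data together with its routing data.
        node = self
        for label in path:
            node = node.children.setdefault(label, OntologyInstanceTree())
        node.entries.append((data, route))

    def collect(self):
        # Gather every entry stored at or below this branch.
        found = list(self.entries)
        for child in self.children.values():
            found.extend(child.collect())
        return found

    def search(self, path):
        # Return all entries under the branch named by `path`,
        # or an empty list when no such branch exists.
        node = self
        for label in path:
            node = node.children.get(label)
            if node is None:
                return []
        return node.collect()
```

In this sketch, a content packet traversing a node would call `insert` on the node's content tree, and a later query packet would call `search` on that same tree; the routing data returned is what allows a route toward the provider to be established. The query tree works the same way in the opposite direction.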
It is still a further object of the invention to provide a method of creating an information route in a plurality of active network nodes, wherein each active node contains content ontology instance trees and query ontology instance trees. The method includes receiving an information content packet at the active network nodes, and the information content packet includes residual content data. The method further includes updating the content ontology instance trees at the active network nodes with the residual data from the information content packet, and searching the query ontology instance trees to determine the next hop for the information content packet.
The method further includes receiving an information query packet at the active network nodes, wherein the information query packet includes residual query data, and updating query ontology instance trees at active network nodes with query data from the information query packets. The method includes searching the content ontology instance trees in order to determine the next hop for the information query packet.
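By way of illustration only, the per-packet handling described above (update the tree of the packet's own kind, then search the tree of the opposite kind to determine the next hop) might be sketched in Python as follows. For brevity each tree is flattened into a dictionary keyed by a topic path; the names `ActiveNode`, `receive_content`, `receive_query` and `arrived_from`, and the prefix-matching rule, are assumptions of this sketch. The notion of residual data is not modelled here.

```python
from collections import defaultdict


class ActiveNode:
    def __init__(self, name):
        self.name = name
        # Topic path -> neighbors from which matching packets arrived.
        self.content_tree = defaultdict(set)
        self.query_tree = defaultdict(set)

    @staticmethod
    def _matches(tree, topic):
        # Assumed rule: two topic paths match when one is a prefix
        # of the other (e.g. ("cars",) matches ("cars", "sedan")).
        hops = set()
        for stored, stored_hops in tree.items():
            n = min(len(stored), len(topic))
            if stored[:n] == topic[:n]:
                hops |= stored_hops
        return sorted(hops)

    def receive_content(self, topic, arrived_from):
        # Update the content tree, then search the query tree
        # to find next hops toward interested requestors.
        self.content_tree[topic].add(arrived_from)
        return self._matches(self.query_tree, topic)

    def receive_query(self, topic, arrived_from):
        # Update the query tree, then search the content tree
        # to find next hops toward matching providers.
        self.query_tree[topic].add(arrived_from)
        return self._matches(self.content_tree, topic)
```

The symmetry is the point of the sketch: whichever packet arrives second finds the trace left by the first, so a route can form regardless of whether the content or the query was injected earlier.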
The above objects are further achieved by a computer system adapted to create an information route amongst active network nodes, each active node containing content ontology instance trees and query ontology instance trees. The computer system includes a processor and a memory including software instructions adapted to enable the computer system to perform certain steps.
Those steps include receiving an information content packet at the active network nodes, and the information content packet includes residual content data. The steps further include updating the content ontology instance trees at the active network nodes with the residual data from the information content packet, and searching the query ontology instance trees to determine the next hop for the information content packet.
The steps further include receiving an information query packet at the active network nodes, wherein the information query packet includes residual query data, and updating query ontology instance trees at active network nodes with query data from the information query packets. The steps include searching the content ontology instance trees in order to determine the next hop for the information query packet.
The above objects are further achieved by a computer program product for enabling a computer to create an information route amongst active network nodes, each active node containing content ontology instance trees and query ontology instance trees. The computer program product includes software instructions for enabling the computer to perform predetermined operations, and a computer readable medium bearing the software instructions.
The predetermined operations include receiving an information content packet at the active network nodes, and the information content packet includes residual content data. The operations further include updating the content ontology instance trees at the active network nodes with the residual data from the information content packet, and searching the query ontology instance trees to determine the next hop for the information content packet.
The operations further include receiving an information query packet at the active network nodes, wherein the information query packet includes residual query data, and updating query ontology instance trees at active network nodes with query data from the information query packets. The operations include searching the content ontology instance trees in order to determine the next hop for the information query packet.
It is still a further object of the invention to provide a network of nodes for information searching. The network includes information provider nodes, each having a plurality of stored information, and information requestor nodes. The network further includes active network nodes, each active network node connected to a subset of the information provider nodes, a subset of the information requestor nodes and a subset of the active network nodes. The network further includes ontology instance trees, each one of the active network nodes having content ontology instance trees and query ontology instance trees, wherein each content ontology instance tree contains content data from information content packets and each query ontology instance tree contains query data from information query packets.
The network further includes an active node service package at each one of the active network nodes for processing commands embedded within the information query packets and the information content packets.
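By way of illustration only, an active node service package that processes commands embedded within packets might be sketched in Python as a simple dispatch table. The class name `ActiveNodeServicePackage`, the packet layout (a dictionary with a `"commands"` list of command and argument pairs), and the command names used in the example are hypothetical; the summary above does not define a command set.

```python
class ActiveNodeServicePackage:
    """Dispatches commands embedded in a packet to registered handlers."""

    def __init__(self):
        self._handlers = {}  # command name -> callable taking one argument

    def register(self, command, handler):
        self._handlers[command] = handler

    def process(self, packet):
        # Run each embedded command in order and collect the results.
        # Commands with no registered handler are reported, not executed.
        results = []
        for command, argument in packet["commands"]:
            handler = self._handlers.get(command)
            if handler is None:
                results.append((command, "unsupported"))
            else:
                results.append((command, handler(argument)))
        return results
```

A node might, for instance, register a hypothetical `"STORE"` handler that records content data and a hypothetical `"LOOKUP"` handler that searches a tree; a packet carrying both commands would then be processed in a single call to `process`.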