1. Field of the Invention
This invention relates generally to network client and server devices and programs. More particularly, the invention relates to a web browser and server-side systems that provide automated multilevel-search-related functions.
2. Description of the Related Art
The Internet enables companies and individuals to post information using an interconnected set of web pages. A first web page is connected to a second web page using a hyperlink. The HTTP protocol is used to transfer data across the Internet from one machine to another so that when a user clicks on a hyperlink in a web page, the web page referenced by the hyperlink is accessed. The action of a user clicking on a sequence of links to move from one web page to another is known as “surfing the web” or “web browsing.” Typically a client software system known as a “web browser” is used to this end. Search engines are provided to help a user find web pages related to a specific topic. Typically a search engine maintains a database of web pages and performs searches based on keywords found in the web pages. In most cases the search engine is implemented within a web server as opposed to a client device.
While web browsers are very useful, prior art web browsers are in many ways limited. For example, most browsers contain a search mechanism known as “find in page.” In Microsoft Internet Explorer, the “find in page” feature works by having a user activate the “edit” menu and then the “Find (on this page)” submenu. Alternatively the user may press CTRL+F in order to activate the “find in page” menu. When the user selects “find in page” a dialog box appears. A user is able to enter a word into this dialog box. If the word is in the markup file currently loaded into the web browser, the word will be highlighted to the user.
While the “find in page” feature found in prior art web browsers is useful, it lacks desirable functionality. For example, a current web page loaded into a browser usually includes one or more hyperlinks. If the desired word is on a page pointed to by any of the hyperlinks of the currently loaded page, but not on the currently loaded page itself, the “find in page” feature would turn up a negative result. This forces the user to select each hyperlink and then perform a separate “find in page” search on each page referenced by the current page. In some cases this can be most burdensome to the user, for example when the current page contains very many hyperlinks or has a richly link-nested structure. In such cases the user is forced to spend undue time searching for the desired word. It would be desirable to have a browser or a browser plug-in that would facilitate word and concept searches through multilevel document structures. Such a browser would increase the productivity of network users. It would also be desirable to allow a user to print pages linked to a given base level document or perform other page-specific functions on a nested structure of linked pages without needing to manually access and specify operations for each page.
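By way of illustration only, a multilevel “find in page” of the kind called for above might be sketched as a breadth-first traversal of the hyperlinks reachable from the current page. The page table, the function name multilevel_find, and the max_depth parameter are hypothetical and are introduced solely for this sketch; a practical browser would fetch each page over HTTP rather than read it from memory.

```python
from collections import deque

# Hypothetical in-memory site: URL -> (page text, outgoing hyperlinks).
PAGES = {
    "/index": ("Welcome to the product site", ["/specs", "/about"]),
    "/specs": ("The widget uses a titanium housing", ["/specs/materials"]),
    "/about": ("Company history and contacts", ["/index"]),
    "/specs/materials": ("Titanium and steel data sheets", []),
}


def multilevel_find(start, word, max_depth, pages=PAGES):
    """Return URLs within max_depth links of start whose text contains word."""
    needle = word.lower()
    seen = {start}                    # guards against link cycles
    queue = deque([(start, 0)])       # (url, number of links followed so far)
    hits = []
    while queue:
        url, depth = queue.popleft()
        text, links = pages.get(url, ("", []))
        if needle in text.lower():
            hits.append(url)
        if depth < max_depth:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return hits
```

With max_depth set to zero the sketch behaves like a conventional single-page search; larger values extend the same search to pages one or more links away from the current page.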
Another area where a more powerful web browser is needed arises from poor server-side search systems. For example, when visiting a certain company's web site a user may be presented with a search engine specific to that web site. In some cases these search engines produce many results that are irrelevant. In many cases the keywords requested in the search do not even appear in the returned web pages. For example a user may run a search and receive 450 results, but only three of these are of interest. To process such an output, the user must spend undue time hunting for the three documents of interest among the 450 results. Suppose these 450 results are displayed 20 per results page. The user then views each results page, clicks on each link that may be of interest, and views the contents thereof. When each page is accessed, the user uses a “find in page” search to determine whether the keyword is actually found in the document and if so, the context of its use. It is also problematic that the user has to select a “next” link to get another page of 20 page titles to view. Sometimes it can take tens of seconds or more to access the next set of titles. It would be desirable to have a web browser with built-in intelligence that could automate the process of wading through irrelevant search results. Other related client-side search acceleration techniques are also needed to allow a user control over searching instead of being limited by the abilities of a distant server's search facilities.
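By way of illustration only, the manual procedure described above (stepping through each results page, opening each result, and checking whether the keyword actually appears) might be automated roughly as follows. The document table, the keyword, and the function names paginate and relevant_results are hypothetical; the fetch_text argument stands in for whatever mechanism retrieves a document's text.

```python
def paginate(results, per_page):
    """Yield successive results pages, as a site search engine might serve them."""
    for i in range(0, len(results), per_page):
        yield results[i:i + per_page]


def relevant_results(result_pages, fetch_text, keyword, context=20):
    """Keep only results whose text contains keyword, with surrounding context."""
    needle = keyword.lower()
    kept = []
    for page in result_pages:             # replaces clicking the "next" link
        for url in page:                  # replaces clicking each result
            text = fetch_text(url)
            pos = text.lower().find(needle)
            if pos >= 0:                  # replaces the per-document "find in page"
                start = max(0, pos - context)
                kept.append((url, text[start:pos + len(needle) + context]))
    return kept


# Hypothetical data: 450 returned documents, only three of which are of interest.
DOCS = {f"/doc{i}": "unrelated text" for i in range(450)}
for i in (7, 133, 402):
    DOCS[f"/doc{i}"] = "notes on the flux capacitor design"

result_pages = list(paginate(list(DOCS), 20))
hits = relevant_results(result_pages, DOCS.get, "capacitor")
```

In this sketch the 450 results occupy 23 results pages of at most 20 entries each, and only the three documents that actually contain the keyword remain in hits, each paired with a short excerpt showing the context of use.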
The concept of metadata aids in the computer manipulation of web-based information such as web sites. XML (extensible markup language) uses metadata to describe the content of resources such as web pages. One XML-based metadata language is RDF (resource description framework). RDF supports the notion of a pointer so that graphs may be represented in a computer readable format. For example, a web site may be described using an RDF description. The RDF description can be written to contain a set of searchable properties to describe each web page in the web site hierarchy. The properties of each RDF-described web page can include a pointer to each page pointed to by that page. In this way a computer manipulatable site map representation may be maintained by the server. When this information is made available to another computer, the other computer can then read the site map and perform operations thereon. This can improve the ability of a search engine to search a site, for example.
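By way of illustration only, a site map of the kind described above might be modeled as a list of (subject, property, value) triples, loosely following the RDF data model. The sketch below does not use actual RDF syntax or any RDF library; the property names title, linksTo, and format, and all of the URLs, are hypothetical.

```python
# Each triple is (subject, property, value). A "linksTo" value acts as the
# pointer from one page to a page it references.
TRIPLES = [
    ("/index", "title", "Home"),
    ("/index", "linksTo", "/products"),
    ("/index", "linksTo", "/docs/manual.pdf"),
    ("/products", "title", "Products"),
    ("/products", "linksTo", "/docs/datasheet.pdf"),
    ("/docs/manual.pdf", "format", "application/pdf"),
    ("/docs/datasheet.pdf", "format", "application/pdf"),
]


def objects(triples, subject, prop):
    """Return every value recorded for the given subject and property."""
    return [o for s, p, o in triples if s == subject and p == prop]


def reachable(triples, root):
    """Return every page reachable from root by following linksTo pointers."""
    seen, stack = [], [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.append(node)
        stack.extend(objects(triples, node, "linksTo"))
    return seen


def pages_with(triples, root, prop, value):
    """Return, in sorted order, reachable pages whose property has the value."""
    return sorted(
        page for page in reachable(triples, root)
        if value in objects(triples, page, prop)
    )
```

A computer that receives such a description can answer structural queries without fetching any page. For example, pages_with(TRIPLES, "/index", "format", "application/pdf") lists the two PDF documents reachable from the hypothetical home page.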
Another technology of interest is remote object technology. Remote object technology is defined for use with object-oriented programming environments. One example of an object-oriented programming environment is Java™ from Sun Microsystems Inc. Java™ defines objects that execute over a “virtual machine.” The virtual machine is a software model of a generic machine. A Java™ object is compiled into a virtual machine language known as “bytecodes.” The bytecodes are then translated by the virtual machine (or a just-in-time compiler) to run on the native processor of a target platform. An example of a distributed object technology is RMI™ (remote method invocation). RMI™ is a Java™ based distributed object technology. Another example of a distributed object technology is CORBA (common object request broker architecture). The aforementioned distributed object technologies are known in the art.
Distributed object technology allows object-oriented classes to be defined that include a server-side remote object and a client-side object stub, also called a “proxy.” The server-side remote object implements one or more services and a communication protocol to communicate with the client-side stub. The client-side stub provides the client with an API (“interface” in object-oriented programming terminology) to call functions (i.e., “invoke methods” in object oriented programming terminology). When a method is invoked on the client-side stub, a remote procedure call and a set of parameters are marshaled into a message and sent across a communication channel to the server-side remote object. The server-side remote object then receives the message, unmarshals the parameters, invokes the corresponding method on behalf of the client, marshals a set of results into a message, and sends the message back to the client.
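By way of illustration only, the stub and remote object exchange described above might be sketched as follows. The sketch is not RMI™ or CORBA: JSON text stands in for the marshaled message format, a direct in-process call stands in for the communication channel, and the class names, the count_word service, and the EXPOSED set are hypothetical.

```python
import json


class RemoteSearchObject:
    """Server-side remote object implementing one service."""

    EXPOSED = {"count_word"}          # methods a client is permitted to invoke

    def count_word(self, text, word):
        return text.lower().split().count(word.lower())

    def handle(self, message):
        request = json.loads(message)               # unmarshal the call
        name = request["method"]
        if name not in self.EXPOSED:
            return json.dumps({"error": "unknown method: " + name})
        result = getattr(self, name)(*request["params"])
        return json.dumps({"result": result})       # marshal the results


class SearchStub:
    """Client-side stub (proxy) offering the same interface as the remote object."""

    def __init__(self, channel):
        self._channel = channel       # callable: request message -> reply message

    def count_word(self, text, word):
        message = json.dumps({"method": "count_word", "params": [text, word]})
        reply = json.loads(self._channel(message))
        if "error" in reply:
            raise RuntimeError(reply["error"])
        return reply["result"]


# The in-process call below stands in for a network connection.
server = RemoteSearchObject()
stub = SearchStub(server.handle)
```

From the client's point of view, stub.count_word("The cat saw the other cat", "cat") reads like an ordinary local method call, while the counting itself is carried out by the server-side object.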
Remote objects can be used to specify remote agents. A remote agent is a program that runs on a remote server node on behalf of a client. A remote agent can be defined as a static remote agent or a mobile agent. A mobile agent is a process that migrates from a first machine to a second machine to a third machine. For example, the first machine passes an object to the second machine and the object is executed in the second machine. After some processing, the mobile agent migrates to the third machine. In some cases the mobile agent clones itself or otherwise spawns off more agent objects to execute on the third machine. Mobile agents can be used to specify a query or to carry information, for example. A remote agent object is static if it only runs on the second machine but never migrates or clones itself to run on a third machine.
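By way of illustration only, the migration of a mobile agent from machine to machine might be sketched as follows. This is a simplification under stated assumptions: only the agent's state (its query, remaining itinerary, and accumulated findings) is serialized and moved, whereas a practical mobile agent system may also transfer the agent's code. The class KeywordAgent, the HOSTS table, and the dispatch function are hypothetical.

```python
import json


class KeywordAgent:
    """Agent that carries a keyword query and collects matches at each host."""

    def __init__(self, keyword, itinerary, found=None):
        self.keyword = keyword
        self.itinerary = list(itinerary)    # hosts still to be visited
        self.found = list(found or [])      # information carried between hosts

    def to_message(self):
        """Serialize the agent's state for transfer to the next machine."""
        return json.dumps(
            {"keyword": self.keyword, "itinerary": self.itinerary, "found": self.found}
        )

    @classmethod
    def from_message(cls, message):
        """Rebuild the agent on the receiving machine."""
        return cls(**json.loads(message))

    def run_on(self, host_name, documents):
        """Execute the query against the documents held by one host."""
        for name, text in documents.items():
            if self.keyword.lower() in text.lower():
                self.found.append(host_name + ":" + name)


# Hypothetical machines and the documents each one holds.
HOSTS = {
    "second": {"a.txt": "RDF site map", "b.txt": "nothing relevant"},
    "third": {"c.txt": "notes about rdf"},
}


def dispatch(agent, hosts=HOSTS):
    """Move the agent along its itinerary and return it with its findings."""
    message = agent.to_message()                     # agent leaves the first machine
    while True:
        agent = KeywordAgent.from_message(message)   # agent arrives
        if not agent.itinerary:
            return agent
        host = agent.itinerary.pop(0)
        agent.run_on(host, hosts[host])
        message = agent.to_message()                 # agent migrates onward
```

An agent dispatched with an itinerary of a single host corresponds to the static remote agent described above, and an itinerary of two or more hosts corresponds to a mobile agent. Cloning, in which an agent spawns further agents, is not modeled in this sketch.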
It would be desirable to have a smart browser technology that could allow a client to control multilevel operations on a remote server site. For example it would be desirable for a client to specify an operation such as generating a list of all PDF files downloadable from a given company's web site. It would be desirable for a smart browser to specify a search for all documents on a web site that contain a keyword or meet a set of criteria. It would be desirable to have fast methods to go through a set of results returned by a search engine. It would be desirable to provide a technique whereby a smart browser could interact with a server to customize, manipulate, and/or prune a set of returned results to more rapidly highlight and locate desired information. It would be desirable to have a distributed object framework whereby a smart browser could identify a set of multilevel operations and have these multilevel operations carried out on behalf of the client on a remote server. It would be desirable to have a system whereby a client could specify a mobile agent to carry out client-defined filtering and/or multilevel operations. It would be desirable for the client-created mobile agent to execute remotely on one or more network servers to carry out the tasks as specified by the client.