The information network known as the World Wide Web (WWW), which is a subset of the well-known Internet, is arguably the most complete source of publicly accessible information available. Anyone with a suitable Internet appliance, such as a personal computer with a standard Internet connection, may go on-line and navigate to Uniform Resource Locators (URLs), also termed information pages or WEB pages, stored on Internet-connected servers for the purpose of garnering information and initiating transactions with hosts of such servers and pages.
Many companies offer various subscription services accessible via the Internet. For example, many people now do their banking, stock trading, shopping, and so forth from the comfort of their own homes via Internet access. Typically, a user, through subscription, has access to personalized and secure WEB pages for such functions. By typing in a user name and a password or other personal identification code, a user may obtain information, initiate transactions, buy stock, and accomplish a myriad of other tasks.
One problem encountered by an individual who has several or many such subscriptions to Internet-brokered services is that there are invariably many passwords and/or log-in codes to be used. Often the same password or code cannot be used for every service, as the password or code may already be taken by another user. A user may not wish to supply a code unique to the user, such as a social security number, because of security issues, including quality of security, that may vary from service to service. Additionally, many users may, of their own volition, choose different passwords for different sites so as to have increased security, which also increases the number of passwords a user must keep track of.
Another issue that can plague a user who has many password-protected subscriptions is that he or she must bookmark many WEB pages in a browser so that the various services can be found and accessed quickly. For example, in order to reserve and pay for airline travel, a user must connect to the Internet, go to his/her bookmarks file, and select an airline page. The user then has to enter a user name and password, and follow on-screen instructions once the page is delivered. If the user wishes to purchase tickets from the WEB site, and wishes to transfer funds from an on-line banking service, the user must also look for and select the personal bank or account page to initiate a funds transfer for the tickets. Different user names and passwords may be required to access these other pages, and the process quickly becomes complicated.
Although the preceding example is merely illustrative, it is generally known that much work related to finding WEB pages, logging in with passwords, and the like is required to successfully do business on the WEB.
A WEB service known to the inventor, and described in the related case listed in the cross-reference to related documents section, allows a user to store all of his or her password-protected pages in one location such that browsing and garnering information from them is much simplified. A feature of this service allows a user to program certain tasks into the system such that requested tasks are executed by a software agent based on user instruction. The service stores user password and log-in information and uses the information to log in to the user's sites, thus enabling the user to navigate without having to manually input log-in or password codes to gain access to the links.
The above-described service uses a server to present a user-personalized application that may be displayed as an interactive home page containing all of the user's listed sites for easy navigation. The application lists the user's URLs in the form of hyperlinks such that a user may click on a hyperlink and navigate to the page, wherein log-in, if required, is automatic and transparent to the user.
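The arrangement described above, in which a portal keeps a user's listed sites together with stored log-in data, might be sketched as follows. This is only an illustrative sketch and not the implementation of the related case: the names `SiteEntry`, `Portal`, and `login_request`, the form field names `username` and `password`, and the example URLs are all assumptions, and a real service would protect stored credentials rather than hold them in plain text.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class SiteEntry:
    """One registered site: a label, its log-in URL, and stored credentials."""
    name: str
    login_url: str
    username: str
    password: str


class Portal:
    """Hypothetical personal portal holding a user's registered sites."""

    def __init__(self):
        self._sites = {}

    def register(self, entry):
        # Store the entry under its label so it can be listed and used later.
        self._sites[entry.name] = entry

    def hyperlinks(self):
        # Labels shown on the personalized home page, in alphabetical order.
        return sorted(self._sites)

    def login_request(self, name):
        # Build the URL and form body for an automatic log-in.
        # The field names are assumed; each real site defines its own.
        entry = self._sites[name]
        body = urlencode({"username": entry.username,
                          "password": entry.password})
        return entry.login_url, body


portal = Portal()
portal.register(SiteEntry("bank", "https://bank.example.com/login",
                          "alice", "s3cret"))
portal.register(SiteEntry("airline", "https://air.example.com/login",
                          "alice99", "fly-me"))

print(portal.hyperlinks())
print(portal.login_request("bank"))
```

The user selects a label from `hyperlinks()`, and the portal supplies the stored credentials on the user's behalf, so no manual entry of log-in codes is needed.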
The application described above also includes a software agent that may be programmed to perform scheduled tasks for the user, including returning specific summaries and updates about user-account pages. A search function is provided and adapted to cooperate with the software agent to search user-entered URLs for specific content if such pages are cached somewhere in their presentable form, such as at the portal server or on the client's machine.
An enhancement to the personalized system described above allows a software agent termed a gatherer agent (browser navigation control) to, in cooperation with a search function, navigate by proxy to any user-entered URL and return updated data to the user in the form of an HTML information page, which appears in the user's browser window. The enhancement is accomplished with the use of site-logic scripting based on pre-known information about the URL or URLs from which a user wishes to obtain data. In this way, current data specifically requested by a user may be found and retrieved for the user.
The process described above is initiated by a user query that is entered into a search function dialog box provided with a user's personal portal page. The query may be presented in natural language, adding a level of user friendliness to the process. Moreover, auto log-in to password-protected sites may be performed on behalf of users by virtue of the system's compilation and storage of user and WEB-site related data.
A limitation exists in the personalized system described above in that the search function may not search beyond the indexed or known URLs listed in the service database and attributed to a requesting user. That is to say, the search function cannot proceed beyond the first level of WEB site depth. Therefore, manual navigation must still be performed by a user who desires to obtain data referenced at a deeper level than an indexed or registered URL. Of course, a user practicing the above system may manually register any new URLs with the service such that they may be included in the search criteria, and site logic may be developed for obtaining summary data contained in the new URLs. However, when performing a general search, a user may not know the URL where the desired data is held.
In a general sense, as opposed to the personalized system described above, the current technology of searching URLs over a data-packet network (DPN) such as the Internet with a search engine involves entering a query into a search dialog box and submitting the query to a server hosted by the provider of the search function. The query is compared to data held in a database of cached URLs, any of which may contain data matching the query. Matching URLs (URLs with data content matching a query) are then returned to a user's browser window for browsing and selection, as is generally known in the art.
There are many different methods and criteria used by search engines for searching out data on the Internet. Most typical is the use of key words or phrases that are used to find matches in text contained on an indexed WEB page (URL). Other methods include searching by site, searching for video, searching for photographs, searching for audio, and so on.
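The keyword matching described above can be illustrated with a minimal sketch. It assumes a tiny in-memory stand-in for a search-engine database; the page texts, the example URLs, and the function names `build_index` and `search` are hypothetical, and real search engines use considerably more elaborate indexing and ranking.

```python
from collections import defaultdict

PUNCTUATION = ".,;:!?"


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word.strip(PUNCTUATION)].add(url)
    return index


def search(index, query):
    """Return the URLs whose text contains every keyword in the query."""
    words = [w.strip(PUNCTUATION) for w in query.lower().split()]
    if not words:
        return set()
    results = None
    for word in words:
        urls = set(index.get(word, ()))
        # Intersect so that only pages matching all keywords remain.
        results = urls if results is None else results & urls
    return results


# Hypothetical cached pages keyed by URL.
pages = {
    "http://example.com/": "Welcome to Example Airlines. Book flights online.",
    "http://example.org/bank": "Online banking and funds transfer.",
}
index = build_index(pages)

print(search(index, "book flights"))
print(search(index, "online"))
```

Only pages present in `pages` can ever be returned, which is the property the following paragraphs identify as a limitation.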
The above technology is limited, in a general sense, by the fact that not all pages containing information that may be desired by a user are listed in conventional search-engine databases. In fact, a vast number of URL resources maintained on the Internet are not listed in any search-engine database and therefore cannot be found through a query-type search method.
For example, a main page having a URL and hosted by an enterprise may contain several links to pages that contain additional information. However, only the main page of the site is typically indexed on any given search-engine list unless a host of the site or other entity submits the additional URLs to be included in a database held by the enterprise hosting the search engine. Therefore, in order to obtain the additional data from the un-indexed pages, a user must navigate to them from a "jump-off page" found during the original search.
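The relationship between an indexed main page and the un-indexed pages it links to can be sketched as follows. This is a hedged illustration using only the Python standard library; the HTML snippet, the example URLs, and the names `LinkCollector` and `unindexed_links` are assumptions made for the example and do not describe any particular search engine.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the anchor tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def unindexed_links(base_url, html, indexed_urls):
    """Return linked URLs that are absent from the search-engine index."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [url for url in collector.links if url not in indexed_urls]


# Hypothetical jump-off page and index contents.
main_url = "http://example.com/"
main_html = (
    '<a href="/products.html">Products</a> '
    '<a href="support/faq.html">FAQ</a> '
    '<a href="http://example.com/">Home</a>'
)
indexed = {"http://example.com/"}

print(unindexed_links(main_url, main_html, indexed))
```

In this sketch the two linked pages exist on the site but would never be returned by a query against `indexed`; a user would have to follow the links manually from the main page.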
Many enterprises, especially companies hosting many pages, provide a convenient search engine function embedded into a page at a main URL, which is indexed in the conventional search-engine databases. In this way, a user may search for the main site, invoke the returned link, and then use the provided private search engine to explore the additional pages or look for additional data related to the site as a whole. Such a WEB site may be a company or enterprise site comprising many related WEB pages.
When a user invokes a private search function on a main page, he or she must enter a new query into the private search engine to look for the additional data. The user is no longer using the original search engine to look for the data. Moreover, the private search engine provided at a site's main URL may function by different rules than the original search engine, requiring the user, in many cases, to restructure the original query.
The limitation described for general data-search systems also exists in the personalized system described further above, even though that system has pre-knowledge of the user and site logic describing how data is hosted at each service site marked by a known URL. That is, URLs cannot be found if they are not pre-known or indexed.
What is clearly needed is a method and apparatus that enables a search engine to find and obtain data from URLs that are not indexed by a search engine database or otherwise pre-known to a requesting user or search-hosting service. A method and apparatus such as this would allow a user to obtain data that would otherwise have to be obtained by further browser navigation.