1. Field of the Invention
The field of the invention is data processing, or, more specifically, methods, systems, and products for personalized indexing and searching for information in a distributed data processing system.
2. Description of Related Art
An example from current art of a large distributed data processing system is the World Wide Web. Search engines on the web are basically massive full-text indexes of millions of web pages. These search engines are specialized software programs that receive search query messages from users or from users' browsers, where the search query messages comprise keywords or search terms. Search engines formulate, or ‘parse,’ the query messages into database queries against web search databases comprising massive search indexes.
The web includes many web sites comprising many millions of web pages, each of which is a document structured in a markup language, such as, for example, HTML, WML, HDML, and so on, to support hyperlinking over a data communications protocol, such as, for example, HTTP, WAP, HDTP, and so on. The search indexes for the search engines are created by software robots called ‘spiders’ or ‘crawlers’ that survey the web and retrieve documents for indexing. The indexing itself is often carried out by another software engine that takes as its input the pages gathered by spiders, extracts keywords according to some algorithm, and creates index entries based upon the keywords and the URLs identifying the indexed documents.
That is, spiders gather documents into a documents database, identifying the documents to be gathered from a URL list in the documents database or through hyperlinks in the documents themselves or through other methods. Spiders take as their inputs the entire web and produce as outputs documents to be indexed. Indexing engines take as their inputs documents to be indexed and produce as their outputs search indexes. Search engines take as inputs search indexes and search request messages bearing search terms and produce as their outputs search result messages for return to requesting users' browsers.
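The three stages described above can be illustrated with a minimal sketch. It is an assumption-laden illustration, not a description of any actual search engine: the in-memory WEB dictionary, the example.com URLs, the function names spider, build_index, and search, and the choice of treating every word as a keyword are all hypothetical and stand in for the "some algorithm" left open in the text.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "web": URL -> (page text, outgoing hyperlinks).
WEB = {
    "http://example.com/a": ("Spiders crawl the web",
                             ["http://example.com/b"]),
    "http://example.com/b": ("Indexes map keywords to pages",
                             ["http://example.com/a", "http://example.com/c"]),
    "http://example.com/c": ("Search engines answer keyword queries", []),
}


def spider(seed_urls, web):
    """Gather documents from a URL list and from hyperlinks in the documents."""
    documents = {}
    pending = list(seed_urls)
    while pending:
        url = pending.pop()
        if url in documents or url not in web:
            continue  # already gathered, or not retrievable
        text, links = web[url]
        documents[url] = text
        pending.extend(links)
    return documents


def build_index(documents):
    """Extract keywords (assumed here: every word) and map each to its URLs."""
    index = defaultdict(set)
    for url, text in documents.items():
        for keyword in re.findall(r"[a-z0-9]+", text.lower()):
            index[keyword].add(url)
    return index


def search(index, query):
    """Parse the query into terms; return sorted URLs containing every term."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


documents = spider(["http://example.com/a"], WEB)
index = build_index(documents)
print(search(index, "keyword queries"))
```

Note that nothing in this sketch takes a user's identity, interests, or navigation history as input at any stage; every user issuing the same query receives the same result.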
In current art, spiders gather documents, index engines create search indexes, and search engines create responses to search queries from users, all with no regard for individual users' interests or history of web navigation. If searches could be performed with regard for individual users' interests or history of web navigation, searches could be better focused and search results could be more pertinent to users' purposes in searching for information. There are ongoing needs for improvement, therefore, in searching and indexing information in large distributed data processing systems like the web.