This disclosure relates generally to network searching systems, and more particularly to systems and methods for improved index structures for faster and better network searches.
One conventional tool for searching for Internet content is a “web crawler.” A web crawler is a semi-autonomous software program that “crawls” over the entirety of individual web sites of a network such as the Internet or a corporate intranet. The web crawler typically iterates over all resources of a site to determine the content of each page of the site. In this way, the web crawler is able to detect page changes and to trigger indexing of those changes. However, because the pages of a site change frequently and in large numbers, the index is almost never up to date, and having to perform such crawls can lead to performance issues.
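The crawl-and-reindex cycle described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a site is available as an in-memory mapping from URL to page content and that changes are detected by comparing content hashes; the names `crawl_and_reindex`, `known_hashes`, and `reindex` are hypothetical and do not come from any particular crawler.

```python
import hashlib
from typing import Callable, Dict, List


def crawl_and_reindex(
    site: Dict[str, str],
    known_hashes: Dict[str, str],
    reindex: Callable[[str, str], None],
) -> List[str]:
    """Visit every page of `site` and re-index the pages whose content changed.

    `site` maps URL -> page content (an assumption; a real crawler fetches pages).
    `known_hashes` maps URL -> hash recorded on the previous crawl and is updated in place.
    """
    changed = []
    for url, content in site.items():  # every resource is visited, changed or not
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if known_hashes.get(url) != digest:  # new page or modified content
            known_hashes[url] = digest
            reindex(url, content)  # trigger indexing of the change
            changed.append(url)
    return changed
```

Even in this simplified form, each pass touches every page regardless of whether it changed, and the index is stale for any page modified between two passes, which is the drawback noted above.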
Another approach is to index the complete resource structure of the content on a page. However, this approach does not index the same data that a user will view, which leads to mismatches between a search request and a search result. Further, a hypertext markup language (HTML) preview of such search results is not supported. An example of this approach can be found in the Web Page Composer tool of the SAP Enterprise Portal, provided by SAP AG of Walldorf, Germany. The Web Page Composer is a visual composer tool for creating web sites in an Enterprise Portal or other such interface.
Using the Web Page Composer tool, web sites are stored in a knowledge management repository, a particular type of relational database, as complex resource structures. From these complex resource structures, a display component of the Web Page Composer builds a page that can be viewed by a user. The index-relevant data is accessible only via the displayed page, which is also viewable by the user. All modifications of the page are applied to the complex resource structure, which is itself not “aware” of how and where the associated page is to be displayed.
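The gap between a stored resource structure and the page built from it can be illustrated with a small sketch. The `Resource` class, its `visible` flag, and both functions are hypothetical and are not part of the Web Page Composer; they only model the general idea that a display component decides what a user actually sees.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One node of a hypothetical resource structure stored in a repository."""
    name: str
    text: str = ""
    visible: bool = True  # assumed flag: whether the display component shows this node
    children: List["Resource"] = field(default_factory=list)


def render_page(root: Resource) -> str:
    """Build the text a user would see, skipping nodes that are not displayed."""
    parts: List[str] = []

    def visit(node: Resource) -> None:
        if not node.visible:
            return  # hidden subtree never reaches the displayed page
        if node.text:
            parts.append(node.text)
        for child in node.children:
            visit(child)

    visit(root)
    return " ".join(parts)


def raw_resource_text(root: Resource) -> str:
    """Collect text from every stored node, as indexing the raw structure would."""
    parts = [root.text] if root.text else []
    parts.extend(raw_resource_text(child) for child in root.children)
    return " ".join(p for p in parts if p)


page = Resource("page", children=[
    Resource("title", "Welcome"),
    Resource("draft", "Internal note", visible=False),
    Resource("body", "Hello world"),
])
```

Under these assumptions, `render_page(page)` yields only the displayed text, while `raw_resource_text(page)` also includes the hidden node, so an index built from the raw structure could return a hit for content the user never sees on the page.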