Presently, setting up a search system for network resources requires significant administrative effort on the part of whoever is preparing the system. The administrator must know in advance what types of resources are to be searched. The administrator must also configure a data storage mechanism (also known as a data store) to hold the information captured by the search mechanism. In addition, the administrator must write (or set up) software to index network resources and store the resulting information in the data store. Only after these mechanisms have been prepared can the administrator create an interface to the search system, often a web interface that executes SQL against data stored in a database.
Presently available search systems of this kind have inherent disadvantages. For example, setting up such a system involves substantial overhead. The designer of the system must know beforehand what types of resources are to be searched and which properties of those resources are to be indexed. Because standard data storage mechanisms (such as databases) tend to store data in static arrangements, changes to the structure of a data storage mechanism are generally costly and not easily accomplished.
Further, the administrator of such a search system must configure a mechanism to index available resources and store the information pertaining to those resources within the data store. This can be accomplished via certain scripting methodologies, or may require writing new software. In either case, work is involved in setting up the indexing mechanism and configuring it to extract the necessary properties of the different resources. Work is also involved in linking this indexing mechanism to the data store. In addition, the administrator must set up a way for users to access the search system, often through a mechanism which queries the database based on its static structure and returns results in the form of an HTML document.
Overall, the static nature of both the indexing and data storage mechanisms in presently available systems leaves them poorly equipped to handle the integration of multiple types of network resources, or of network resources available via different access methodologies. For example, while it might be reasonable to construct a search mechanism for HTML documents, or for Microsoft® Word documents, integrating the two becomes significantly more complex. To store the properties of HTML documents and Word documents which the two do not share, additional columns and tables would have to be added to the storage mechanism. The query mechanism would then have to be enhanced to utilize the properties of both types of documents, while still allowing for the fact that some documents might not have a requested property. One might desire to use the author property of Word documents to refine a search, but this leaves the question of how to handle HTML documents which do not possess a known author. Furthermore, the indexing mechanism would also be complicated because it would have to index in two separate locations and for two separate types of files. An alternative might be to set up multiple search systems, but this poses open questions regarding how to integrate the results of these systems into a unified system which is easily accessed by the end user.
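The missing-property problem described above can be illustrated with a minimal sketch. It assumes a single static table in an in-memory SQLite database; the table name, column names, file paths, and function names are all hypothetical and chosen only for illustration. The HTML row has no author, so the column is left NULL.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create a static table shared by Word and HTML documents (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "path TEXT PRIMARY KEY, kind TEXT NOT NULL, title TEXT, author TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?, ?)",
        [
            ("reports/q1.doc", "word", "Q1 Report", "Alice"),
            ("reports/q2.doc", "word", "Q2 Report", "Bob"),
            # HTML documents have no author property, so the column is NULL.
            ("site/index.html", "html", "Q1 Summary", None),
        ],
    )
    return conn


def search_by_author(conn: sqlite3.Connection, author: str) -> list:
    """Naive refinement: rows with a NULL author never match."""
    rows = conn.execute(
        "SELECT path FROM documents WHERE author = ? ORDER BY path", (author,)
    )
    return [row[0] for row in rows]


def search_by_author_or_unknown(conn: sqlite3.Connection, author: str) -> list:
    """Enhanced query: also keep documents that lack the author property."""
    rows = conn.execute(
        "SELECT path FROM documents WHERE author = ? OR author IS NULL ORDER BY path",
        (author,),
    )
    return [row[0] for row in rows]
```

In this sketch the naive query silently drops every HTML document, while the enhanced query returns them regardless of the requested author. Neither behavior is obviously correct, and every additional type-specific property would require a further schema change and a further special case in the query mechanism.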
Further, presently available systems do not offer a sufficient mechanism for special-case handling of search results based on their type or network access methodology. Most current search systems simply return results in the form of links through a web browser or other display mechanism. This leaves the access up to the web browser/operating system. While this works well for web documents, it limits possible results to those accessible by the web browser/operating system and those which can be expressed as a link. It also lacks the ability to specify different access methods for different results. For example, one result might be from a computer across a LAN, while another might be from an FTP site. Finally, it lacks the ability to utilize special functionality available to certain resource types. For example, a Word document can be merged with an address list, whereas an HTML document cannot. With presently available search systems, an entirely separate layer would need to be written on top of the above-described search system to accomplish such tasks.
Thus, there are many areas for improvement within the current systems. It is desirable to have a modular system which can handle different types of resources and different network access methodologies, and which has a generalized mechanism for searching and handling the different properties of different resources. The end goal of this system would be to create a single search system for end users which could return all results regardless of their type or location, and allow them to be accessed through a unified interface. Also, these “modules” would be reusable, which would considerably simplify the work involved in multiple deployments, and would mask the underlying issues of data storage, thereby simplifying the task of the administrator. All of these goals can be accomplished by designing a framework within which modular searchable objects can interact, and by building into the framework the necessary functionality to handle the other complexities of the system.
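One way such a modular arrangement might look is sketched below. This is an illustrative sketch only, not the framework itself: the class names (Result, WordModule, HtmlModule, SearchFramework), the property names, the locations, and the access strings are all assumptions made for the example. Each module supplies results with an open-ended set of properties and its own access method, and the framework searches across all registered modules through one interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Result:
    location: str
    kind: str
    properties: Dict[str, str]    # open-ended; differs per resource type
    access: Callable[[], str]     # module-specific access method


class WordModule:
    """Hypothetical module for Word documents reachable over a LAN."""
    kind = "word"

    def __init__(self, docs: Dict[str, Dict[str, str]]):
        self._docs = docs

    def results(self) -> List[Result]:
        return [
            Result(loc, self.kind, props, lambda loc=loc: f"copy {loc} from LAN share")
            for loc, props in self._docs.items()
        ]


class HtmlModule:
    """Hypothetical module for HTML documents reachable over HTTP."""
    kind = "html"

    def __init__(self, docs: Dict[str, Dict[str, str]]):
        self._docs = docs

    def results(self) -> List[Result]:
        return [
            Result(loc, self.kind, props, lambda loc=loc: f"fetch {loc} over HTTP")
            for loc, props in self._docs.items()
        ]


class SearchFramework:
    """Holds registered modules and searches all of them through one call."""

    def __init__(self) -> None:
        self._modules = []

    def register(self, module) -> None:
        self._modules.append(module)

    def search(self, **criteria: str) -> List[Result]:
        # A result matches when every requested property is present and equal.
        # Results lacking a requested property are skipped (a design choice
        # made for this sketch, not a requirement).
        return [
            result
            for module in self._modules
            for result in module.results()
            if all(result.properties.get(k) == v for k, v in criteria.items())
        ]


def build_demo() -> SearchFramework:
    framework = SearchFramework()
    framework.register(WordModule({"fileserver/q1.doc": {"title": "Q1", "author": "Alice"}}))
    framework.register(HtmlModule({"www.example.com/q1.html": {"title": "Q1"}}))
    return framework
```

In this sketch, adding a new resource type means registering another module rather than altering a storage schema, and each result carries its own access method rather than relying on a link handled by the web browser/operating system.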