To improve the performance of operations such as searching and sorting, it is often useful to create and maintain a search index data structure. A search index enables efficient matching of the tokens in a search query to the documents containing those tokens. For the contents of a document to be represented in a search index, the document must go through an indexing step, in which information describing the document contents is added to the index.
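As an illustration of the token-to-document matching described above, the following is a minimal sketch of an inverted index. The class name `SearchIndex`, its methods, and the whitespace tokenization are assumptions made for this example only; they are not taken from any of the products discussed here.

```python
from collections import defaultdict


class SearchIndex:
    """Toy inverted index mapping each token to the ids of documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def index_document(self, doc_id, text):
        # Indexing step: record every token of the document in the index.
        # Tokenization by lowercasing and splitting on whitespace is an assumption.
        for token in text.lower().split():
            self._postings[token].add(doc_id)

    def search(self, query):
        # Return the ids of documents that contain every token of the query.
        tokens = query.lower().split()
        if not tokens:
            return set()
        results = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            results &= self._postings.get(token, set())
        return results


index = SearchIndex()
index.index_document("msg-1", "Quarterly budget review")
index.index_document("msg-2", "Budget meeting moved to Friday")
```

A query is answered by intersecting the posting sets of its tokens, so no document is scanned at search time; the cost of reading each document is paid once, during the indexing step.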
As search services become foundational services provided on a desktop (e.g. Google™ Desktop Search), or as part of the underlying operating system itself (e.g. Microsoft® Windows®), it becomes natural for applications not to implement their own search features, but instead to index their data using a shared, system-wide index. For example, a messaging application that provides a searchable history of messages might not implement its own search index, but could instead be designed to push the messages it wants indexed into a shared, global index, or otherwise make them available to that index.
In order to support search indexes that are shared across multiple applications, providers of existing search technologies are publishing APIs (Application Programming Interfaces) that allow applications to push their data into the index. One example of such an approach is found in the Microsoft IFilter API. This API is used by the Microsoft Windows operating system to make various file types searchable by a service that is part of the operating system. To make files of a specific type searchable, applications must implement a specific interface, create an indexing filter, and register the indexing filter for a specific file extension. When the service detects a new file or a change in a file, it loads the indexing filter associated with the file type and uses it to index the content of the file.
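The register-then-pull pattern described above can be sketched as follows. This is a simplified illustration and not the Microsoft IFilter interface itself: `FilterRegistry`, `plain_text_filter`, and `on_file_changed` are hypothetical names, and the index is represented by a plain dictionary.

```python
import os


class FilterRegistry:
    """Hypothetical registry associating file extensions with indexing filters."""

    def __init__(self):
        self._filters = {}

    def register(self, extension, indexing_filter):
        # An application registers a filter for a specific file extension.
        self._filters[extension.lower()] = indexing_filter

    def filter_for(self, path):
        # Look up the filter by the (case-insensitive) extension of the file.
        extension = os.path.splitext(path)[1].lower()
        return self._filters.get(extension)


def plain_text_filter(raw_bytes):
    # Example filter: extract indexable text from the raw bytes of a text file.
    return raw_bytes.decode("utf-8", errors="replace")


def on_file_changed(registry, path, raw_bytes, index):
    # Called by the (hypothetical) indexing service when a file is new or changed.
    indexing_filter = registry.filter_for(path)
    if indexing_filter is None:
        return False  # no filter registered for this file type; nothing is indexed
    index[path] = indexing_filter(raw_bytes)
    return True
```

Note that the application's code is invoked only when a file is indexed; nothing in this pattern calls back into the application when a search is later performed.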
Another example of an existing search technology API is found in the Google Desktop SDK (Software Development Kit). This API has two flavors: 1) an API similar to the Microsoft IFilter API, through which applications register indexing filters for corresponding file types, and the indexer uses the specific indexing filter when a file of the corresponding type is indexed in a pull operation, and 2) an API that allows applications to push data directly into the index.
A significant shortcoming of these existing solutions is that they operate at indexing time only, and are accordingly limited in their ability to provide security. Specifically, these systems are inadequate when a centralized search index contains data on behalf of several different users. In that case, there is a need for a search service that processes search results so that a result is presented to a user only if that user has access to the corresponding data. Moreover, since security logic usually belongs to the application from which the data was originally indexed, existing systems cannot perform appropriate search result filtering, because they provide no mechanism for accessing each application's security logic at search time.
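The kind of search-time filtering that the existing solutions lack can be illustrated with a short sketch. Everything in it is hypothetical: the `Hit` record, the `filter_results` function, the per-application `access_checkers` mapping, and the example access control list are assumptions introduced only to make the requirement concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    app: str      # application from which the data was indexed
    doc_id: str   # identifier of the indexed item within that application


def filter_results(hits, user, access_checkers):
    """Keep only the hits that the owning application says the user may access."""
    visible = []
    for hit in hits:
        checker = access_checkers.get(hit.app)
        # Fail closed: a hit whose application supplied no security logic is hidden.
        if checker is not None and checker(user, hit.doc_id):
            visible.append(hit)
    return visible


# Hypothetical security logic belonging to a messaging application.
MESSAGE_ACL = {"msg-1": {"alice"}, "msg-2": {"alice", "bob"}}


def messaging_can_read(user, doc_id):
    return user in MESSAGE_ACL.get(doc_id, set())
```

The design point is that the access decision is delegated to the application that owns the data and is evaluated when results are produced, rather than being fixed when the data is indexed.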
Other existing technology has provided security with regard to a specific type of content. In IBM® WebSphere Portal, a security model based on Portal Access Control has been used for indexed Portal Pages. This type of approach, although it provides an index with secured data, is restricted to a single security mechanism (Portal Access Control) and a single content type (Portal Pages). A system-level search service should instead provide an extensible framework in which multiple applications can conveniently introduce new content types and new security mechanisms for those content types.
For the above reasons and others, it would be desirable to have a new system for securing application information in a shared, system-wide search service.