A growing number of Websites provide applications that access directories at the back end to carry out functions such as answering a client query, creating customized Webpages, or performing authentication. Directories are resource repositories organised so that records, called directory entries, can be located.
The Lightweight Directory Access Protocol (LDAP) specification provides a software protocol for accessing and managing remote and possibly distributed directories. LDAP provides a search operation which allows users to query the directory for entries satisfying a search filter. LDAP servers were initially used as gateways to provide a TCP/IP interface to the X.500 directory server. As LDAP gained in popularity, the LDAP server ceased to be a front-end to the X.500 directory server and the directory itself became a part of the LDAP server. The current LDAP specification is the Internet Engineering Task Force (IETF) Network Working Group's Request for Comments (RFC) 2251, “Lightweight Directory Access Protocol (v3)”, Wahl, Howes & Kille, December 1997. At the time of writing, the LDAP specification and the related specification RFC 2252, “Lightweight Directory Access Protocol (v3): Attribute Syntax Definitions”, Wahl, Coulbeck, Howes & Kille, December 1997, are available from Website ‘www.ietf.org’.
The data model of directories supports representation of heterogeneous real world entities in a single instance of the directory. LDAP directories are being used to store address books, contact information, customer profiles, network resource information, policies and other data files and resources.
To improve scalability and availability of directory based Web services, for the Internet and intranets, it is desirable to be able to cache results of LDAP directory queries and to use the cached results for answering future queries. However, unlike other Web content, individual resources (entries) within LDAP directories are not accessed directly but are instead retrieved by means of LDAP queries. Therefore, techniques used for Web page caching cannot be applied as they stand to LDAP resources. For an LDAP cache to answer a query, it needs to check whether the query is semantically contained in an earlier query, that is, whether every entry satisfying the incoming query also satisfies a query whose results are already cached.
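The containment check mentioned above can be illustrated with a minimal sketch. It is not taken from any of the cited works and makes several simplifying assumptions: a filter is modelled as a conjunction of one assertion per attribute, held in a Python dict; an assertion is either an exact value or a prefix pattern ending in `*`; and the search base, scope, requested attributes and matching rules (such as case folding) are ignored. The function names `assertion_implies`, `is_contained` and `find_covering` are hypothetical.

```python
from typing import Dict, Iterable, Optional

Filter = Dict[str, str]  # attribute name -> exact value or "prefix*"


def assertion_implies(q_pat: str, s_pat: str) -> bool:
    """Return True if every value matching q_pat also matches s_pat.

    Only exact values and trailing-wildcard prefixes are handled.
    """
    if s_pat.endswith("*"):
        prefix = s_pat[:-1]
        # An exact value or a longer prefix both satisfy the stored prefix.
        return q_pat.rstrip("*").startswith(prefix)
    # A stored exact value is only implied by the identical exact value.
    return q_pat == s_pat


def is_contained(query: Filter, stored: Filter) -> bool:
    """Return True if every entry satisfying `query` also satisfies `stored`.

    Each assertion of the stored filter must be implied by an assertion
    on the same attribute in the query. Extra assertions in the query
    only make it more restrictive, so they are allowed.
    """
    return all(
        attr in query and assertion_implies(query[attr], s_pat)
        for attr, s_pat in stored.items()
    )


def find_covering(query: Filter, cache: Iterable[Filter]) -> Optional[Filter]:
    """Linear scan for a cached filter that contains the query."""
    for stored in cache:
        if is_contained(query, stored):
            return stored
    return None


if __name__ == "__main__":
    cached = [{"objectClass": "person", "sn": "sm*"}]
    incoming = {"objectClass": "person", "sn": "smith", "l": "Dallas"}
    print(find_covering(incoming, cached))
```

In this sketch, a hit means only that the cached result set is a superset of the answer; the cache would still have to re-apply the incoming filter to the cached entries, which in turn requires that those entries were stored with the attributes the filter refers to.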
Active query caching in databases is disclosed in Qiong Luo, Jeffrey F. Naughton, Rajasekar Krishnamurthy, Pei Cao, and Yunrui Li, “Active Query Caching for Database Web Servers”, WebDB 2000: Third International Workshop on the Web and Databases, Dallas, Tex., May 18–19, 2000, in conjunction with ACM SIGMOD'2000.
However, active proxy caching of database queries for Web applications is typically implemented by application-specific servlets running at the proxy, and this makes the solution application dependent.
U.S. Pat. No. 6,347,312 issued to Byrne et al for “Lightweight Directory Access Protocol (LDAP) directory server cache mechanism and method” describes a method of performing caching on an entry basis rather than a query basis. The approach of U.S. Pat. No. 6,347,312 does not check query containment, and so an incoming query is evaluated against the origin directory even if the query is contained in a stored query.
Sophie Cluet, Olga Kapitskaia and Divesh Srivastava, in “Using LDAP directory caches”, Proceedings of the ACM Symposium on Principles of Database Systems (PODS), 1999, consider the problem of determining when a query can be soundly and completely answered. The authors consider various types of incoming queries and stored query templates and analyze the complexity of the problem of determining answerability. However, Cluet et al do not describe a solution for checking whether an incoming query contains or is contained within any of the stored queries.
P.-A. Larson and H. Z. Yang in “Computing queries from derived relations”, Procs. of 11th VLDB, 1987, consider answering of database queries from derived relations, both represented as project-select-join (PSJ) expressions. They discuss algorithms for testing coverage and derivability of a PSJ expression by another PSJ expression. However, the stored condition expressions associated with the derived relations are not classified or indexed for easy lookup to find coverage. When a query arrives, it is tested for coverage against all the derived relations. In applications where a large number of derived relations are stored in the cache, this approach is inefficient since an incoming query is tested against many derived relations which cannot possibly answer the query. Such a computation-intensive solution is not suitable for simpler LDAP queries.
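To make the cost of testing every stored expression concrete, the following sketch shows one possible way of classifying stored filters so that most of them are never examined. It is an illustrative assumption only, and is not the algorithm of Larson and Yang. Filters are again modelled as dicts mapping an attribute name to an asserted value, and they are grouped by the set of attributes they constrain; under that model a stored filter can contain an incoming query only if its attribute set is a subset of the query's. The class name `TemplateIndex` and its methods are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterator, List

Filter = Dict[str, str]  # attribute name -> asserted value or pattern


class TemplateIndex:
    """Groups stored filters by the set of attributes they constrain."""

    def __init__(self) -> None:
        self._by_attrs: Dict[FrozenSet[str], List[Filter]] = defaultdict(list)

    def add(self, stored: Filter) -> None:
        # The dict keys (attribute names) act as the classification key.
        self._by_attrs[frozenset(stored)].append(stored)

    def candidates(self, query: Filter) -> Iterator[Filter]:
        """Yield only stored filters that could possibly contain `query`.

        A group is skipped outright when it constrains an attribute the
        query does not mention. The full containment test still has to
        be run on each candidate that is yielded.
        """
        q_attrs = set(query)
        for attrs, group in self._by_attrs.items():
            if attrs <= q_attrs:
                yield from group


if __name__ == "__main__":
    index = TemplateIndex()
    index.add({"sn": "sm*"})
    index.add({"mail": "*"})
    index.add({"sn": "jones", "ou": "sales"})
    print(list(index.candidates({"sn": "smith", "l": "Dallas"})))
```

The lookup iterates over groups rather than over individual stored filters, so its cost depends on the number of distinct attribute sets, which is typically far smaller than the number of cached queries.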
Olga Kapitskaia, Raymond T. Ng and Divesh Srivastava in “Evolution and revolutions in LDAP directory caches”, published in Proceedings of the International Conference on Extending Database Technology (EDBT), pp. 202–216, 2000, consider the problem of improving the hit-ratio of an LDAP directory cache by performing a cost-benefit analysis of having a query template stored in the cache. The authors propose algorithms for determining whether it would be beneficial to have a query template in the cache. The authors, however, do not propose algorithms for determining whether an incoming query can be answered from stored queries.
There exists a need in the art for a caching solution for directory queries which can be used by different applications and directories. Independently, there also exists a need in the art for caching solutions which provide efficient checking of query containment when processing received queries.