1. Technical Field
The present disclosure relates to directories and, more specifically, to fast searching of directories.
2. Description of the Related Art
Web-based applications, for example, web services, are quickly transforming the way modern businesses interact and share information. Web services are software systems for providing particular functionality over a computer network, for example, the Internet. Web services can generally be identified by Uniform Resource Identifiers (URIs) in a fashion that may be analogous to the way websites may be identified by Uniform Resource Locators (URLs). Web services generally contain public interfaces and bindings that enable users and other software systems, such as other web services, to seamlessly utilize the functionality of the web services. In this way, web services enhance the way computers communicate with users and each other.
One particularly common web service is the web-based directory service. Web-based directory services allow for access to a directory across a computer network, for example, the Internet. A directory is a specialized database that is primarily used for allowing a large number of people to quickly look up information. A directory is not intended to be primarily used as a tool for the organization and storage of data and is therefore optimized for information retrieval and not necessarily information storage. Directories tend to be designed for particular purposes and are not commonly used for general purpose searches. For example, the types of searches that a directory will handle are usually known ahead of time.
A directory service is a computer application that allows for access to a directory. Directory services may conform to sets of standards such as, for example, the X.500 standard pertaining to electronic directory services. Users may interact with directories using standardized languages such as, for example, Directory Services Markup Language (DSML). DSML is a variant of Extensible Markup Language (XML), the human-readable communications language commonly used by web-based applications for exchanging information between computers without regard to the computers' platforms. DSML is specifically tailored for communicating directory information.
While some directory services are local and only allow for use on a closed computer network, other directory services are global and allow for general access over an open computer network such as the Internet.
Directory services may have redundant servers placed over a broad geographic area, all of which cooperate to provide directory service. Such directory services are known as distributed directory services. The Internet Domain Name System (DNS) is an example of a global distributed directory service. The DNS allows computers connected to the Internet to look up the numeric Internet address corresponding to an Internet domain name.
LDAP, or Lightweight Directory Access Protocol, is a protocol for quickly and easily accessing directory services from across a computer network. LDAP communicates using TCP/IP transfer services or similar transfer services, making LDAP well suited for use over the Internet or private company intranets.
An LDAP directory is often made up of objects. Each object may contain a number of attributes. Attributes may each be of a particular type and may each have one or more values conforming to that type. Examples of types of attributes include “cn” for common name, and “mail” for an email address. Other types of attributes include text, photos, URLs, pointers, binary data, etc. The correct syntax of the value is defined by the particular type of attribute. For example, an email address with the type “mail” may have a value of “bob@domainname.com”.
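The object and attribute model described above may be illustrated by the following minimal Python sketch. It is not part of any LDAP library; the class name DirectoryObject, its methods, and the second mail value are hypothetical and are introduced only to show that each attribute type may carry one or more values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DirectoryObject:
    """Hypothetical directory object: attribute types mapped to one or more values."""
    attributes: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, attr_type: str, value: str) -> None:
        # An attribute type may hold several values, so values are appended.
        self.attributes.setdefault(attr_type, []).append(value)

    def get(self, attr_type: str) -> List[str]:
        # Return a copy so that callers cannot mutate the stored values.
        return list(self.attributes.get(attr_type, []))


entry = DirectoryObject()
entry.add("cn", "Bob")
entry.add("mail", "bob@domainname.com")
entry.add("mail", "bob@example.org")  # assumed second value, for illustration only

print(entry.get("mail"))
```

This sketch does not enforce the syntax associated with each attribute type; a real directory would validate each value against the syntax defined for its type.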
Objects within LDAP directories can be hierarchically arranged for more efficient searching. For example, a hierarchical LDAP directory made up of domain name objects might begin with “.com”, “.org” and “.gov” objects at the top level of the hierarchy. Below each top level object may be a series of objects representing organizations, and within each of these organization objects may be a series of objects representing users. Hierarchical objects are commonly referred to as parent objects and child objects depending on their relationship to one another. For example, an object representing a printer may be the child of an object representing a computer in a hierarchical directory representing a computer network where the printer is connected to the computer. An object of any hierarchical generation may have one or more associated attributes. Attributes may be used to describe characteristics of the objects they are associated with. Each object and/or attribute may have one or more associated values.
FIG. 1 is a block diagram showing an example of a hierarchical directory structure. Here the top level object (root object) is “organization.” This object has a string type and a value of “Computer Associates.” The child objects of the root object are called “office.” These objects have a string type and values of “R&D,” “Sales,” “Legal,” and “Marketing.” The child objects of the office objects include “person” or “equipment” objects. These objects have a string type and have various values.
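A hierarchy of this kind may be sketched in Python as a simple tree. The Node class and its methods are hypothetical. The organization and office values are taken from the description above, whereas the person and equipment values are assumptions made for illustration, since the description states only that these objects have various values.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical directory node with an object type, a value, and child nodes."""
    obj_type: str
    value: str
    children: List["Node"] = field(default_factory=list)

    def add_child(self, obj_type: str, value: str) -> "Node":
        child = Node(obj_type, value)
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Node"]]:
        # Depth-first traversal yielding each node together with its depth.
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


root = Node("organization", "Computer Associates")
offices = {name: root.add_child("office", name)
           for name in ("R&D", "Sales", "Legal", "Marketing")}

# Assumed leaf objects; the figure itself may contain different values.
offices["R&D"].add_child("person", "Alice")
offices["R&D"].add_child("equipment", "photocopier")
offices["Sales"].add_child("person", "Alice")
offices["Sales"].add_child("equipment", "printer")
offices["Legal"].add_child("person", "Bob")
offices["Marketing"].add_child("equipment", "photocopier")

for depth, node in root.walk():
    print("  " * depth + f"{node.obj_type}={node.value}")
```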
LDAP directory services are commonly based on a client-server model. While one or more LDAP servers contain the LDAP data, a client may be launched by a person seeking to access LDAP directory data. The client may connect to the server and communicate the search criteria. The server may then communicate the search results to the client. The client may then communicate the search results to the user.
One common example of an LDAP directory service is a service that resolves email addresses from names. Such directory services are commonly accessed by email clients that connect to email servers. In this example, the user can enter a contact's name to resolve the contact's email address.
While directories, such as those utilizing LDAP, may be well suited for the quick execution of basic search queries, directories may not be able to handle some of the more complex search queries that may be commonly handled by general purpose relational databases. For example, many directories are unable to perform the common inner join (join) operation. The join operation retrieves all objects having two or more attributes, child objects and/or child object attributes being searched for. For example, given the example directory structure illustrated in FIG. 1, a join operation may be to retrieve a list of all offices that have a person named Alice and a photocopier.
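For illustration, the result that such a join operation would be expected to return may be sketched in Python over a small in-memory table. The function join and the table DIRECTORY are hypothetical, and the table contents other than the office names and the presence of Alice in R&D and Sales are assumed. The sketch shows only what a join computes, not how a directory could be made to support it.

```python
from typing import Dict, List, Tuple

Child = Tuple[str, str]  # (object type, value)

# Assumed contents of each office, for illustration only.
DIRECTORY: Dict[str, List[Child]] = {
    "R&D": [("person", "Alice"), ("equipment", "photocopier")],
    "Sales": [("person", "Alice"), ("equipment", "printer")],
    "Legal": [("person", "Bob")],
    "Marketing": [("equipment", "photocopier")],
}


def join(directory: Dict[str, List[Child]], *criteria: Child) -> List[str]:
    """Return the offices whose children satisfy every criterion."""
    wanted = set(criteria)
    return [office for office, children in directory.items()
            if wanted <= set(children)]


matches = join(DIRECTORY, ("person", "Alice"), ("equipment", "photocopier"))
print(matches)
```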
This information may still be retrieved from a directory without the use of a join operation; however, retrieval may require multiple searches. For example, a first search may be made to determine all offices with a person named Alice. With respect to the example directory structure of FIG. 1, the results would be R&D and Sales. A second search would then search R&D for a photocopier and then search Sales for a photocopier to determine whether any of the offices with an Alice also have a photocopier. This second search may be executed as multiple searches or may be combined into one long and complex search; in either case, the results would be comparable. This technique may be long and complicated, especially with a real-world directory, which might contain thousands of entries.
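The multiple-search technique may be sketched in Python as follows. The search function stands in for a single directory search and is hypothetical, as are the names OFFICES, search_log, and emulated_join. The equipment values are assumed, since the description does not state which offices have a photocopier. The log records each search issued, which makes visible that one initial search is followed by one further search per candidate office.

```python
from typing import Dict, List, Optional, Tuple

Child = Tuple[str, str]  # (object type, value)

# Assumed contents of each office, for illustration only.
OFFICES: Dict[str, List[Child]] = {
    "R&D": [("person", "Alice"), ("equipment", "photocopier")],
    "Sales": [("person", "Alice"), ("equipment", "printer")],
    "Legal": [("person", "Bob")],
    "Marketing": [("equipment", "photocopier")],
}

search_log: List[str] = []  # one entry per search issued


def search(obj_type: str, value: str, base: Optional[str] = None) -> List[str]:
    """Stand-in for one directory search, optionally limited to one office."""
    search_log.append(f"base={base or 'organization'} filter=({obj_type}={value})")
    scope = [base] if base is not None else list(OFFICES)
    return [office for office in scope if (obj_type, value) in OFFICES[office]]


def emulated_join(first: Child, second: Child) -> List[str]:
    """Emulate a join with one initial search plus one search per candidate."""
    candidates = search(*first)
    result: List[str] = []
    for office in candidates:
        result.extend(search(*second, base=office))
    return result


result = emulated_join(("person", "Alice"), ("equipment", "photocopier"))
print(result)
print(len(search_log), "searches issued")
```

Because the number of follow-up searches grows with the number of candidates returned by the first search, the cost of this approach may increase considerably in a directory containing thousands of entries.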
The need to conduct multiple searches to perform a common join operation may be a great shortcoming of directory technology. This shortcoming often frustrates newcomers to the field. Unfortunately, this shortcoming is largely intrinsic to directories. Redesign of directory technology to resolve this shortcoming in the general case may require major changes to the X.500/LDAP directory standards. Additionally, even if these standards were to be revised, most functioning directory architecture might not be capable of supporting the enhanced functionality.