1. Field
The present application relates to the operational performance of a database, particularly its performance when directory services are being provided.
2. Description of the Related Art
Data managers utilize electronic directories as a tool to integrate data from applications that are not interconnected and to simplify the management of that data. Some electronic directories are standardized, e.g., X.500 or LDAP, and provide various services, e.g., search services, to store and retrieve information quickly. An X.500 or LDAP search service is performed using arguments which indicate, if applicable, where to start the search (baseObject), the scope of the search (subset), the conditions to apply to the search (filter) and what information should be returned (selection). In addition, flags are passed, for example, timelimit, sizelimit, etc to indicate whether aliases, if any, should be dereferenced (searchAliases).
A filter may be empty, consist of a single filter item or may be a compound expression of filter items using connectives (e.g., AND, OR, NOT). The connectives can be used to form arbitrary complex filters. For example, a user may wish to interrogate a directory in order to locate (if any) titles of managers of an organization who have salaries above $60,000.00, and have a mobile phone listed in the database or who are not in certain offices (that begin with “Melb” or begin with “Syd”) of the organization.
Such a request could be carried out by way of a search query where:title=manager AND salary>60000 AND (mobilePhone present OR (NOT (locality=Melb*OR locality=Syd*)))  expression 100
The above expression 100 can also be expressed as a logical expression:A.B.(C+!(D+E))  expression 101where the filter items are represented by letters (A, B, C, D and E) and the connectives are represented by “.” for AND connectives, “+” for OR connectives and “!” for NOT connectives.
Where the database has a metadata design, the above expression 101 can be converted into an SQL statement involving self joins of the form:SELECT eid FROM table WHERE (a) AND eid IN (SELECT eid FROM table WHERE (b)) AND eid IN (SELECT eid FROM table WHERE (c) OR eid NOT IN (SELECT eid FROM table WHERE (d) OR (e)  statement 102where “eid” is used to represent the entry identifier returned by the search and (a), (b), (c), (d) and (e) represents the relevant SQL for each of the filter items.
It is generally desirable, at least from a user's point of view, to have a relatively quick response to queries. Certainly, in SQL based X.500 or LDAP applications running in association with relational database(s), where there may be millions of entries searchable, response time is important. The time taken to process an SQL statement generally increases rapidly with increasing complexity, as is illustrated in FIG. 1. This effect is most noticeable if the SQL includes the NOT connective. This increase in complexity also raises the risk that a result is not returned before a time limit or size limit is reached. As a result, developers of such applications tend to avoid relatively complex SQL statements.
Optimising Terms that Contain OR Connectives
Using logic principles, e.g., Boolean logic, an arbitrarily complex search filter can be expanded to a number of relatively simpler terms. Generally, this involves removing ‘brackets’ in the complex search filter, resulting in an expression which is an “OR of ANDs” which is also known as a “sum of terms”. Expanding a complex filter search can also involve distributing NOT connectives so that each NOT connective applies to a single filter item.
An example of expanding a NOT connective in the filter in expression 103 below:!(A.B)  expression 103results in the filter terms:!A+!B  expression 104where !A is one filter term and !B is the other filter term. Similarly, expanding the NOT connective in the filter of expression 105 below:A.B.(C+!(D+E))  expression 105results in the filter term:A.B.(C+(!D.!E))  expression 106This filter term can be further expanded by removing the brackets and the resulting filter terms are A.B.C. and A.B.!D.!E in expression 107 below:A.B.C+A.B.!D.!E  expression 107
The expansions above resulting in expression 107 show two filter terms OR'ed together. Each filter term of expressions 104 and 107 can be processed and the results summed by the application, ignoring duplicates. This has a number of advantages:
a. The generation of simpler SQL statement resulting in faster processing of the search;
b. More efficient SQL statements and enhanced NOT connective processing, that is described below and in more detail in U.S. application Ser. No. 09/427,266 which is incorporated herein in its entirety by reference; and
c. The ability to impose and check size and time limits as a database service query is performed, often leads to performance improvements.
Optimising Terms that Contain AND Connectives
After a complex filter search has been expanded into a number of simpler filter terms, the resulting filter terms, containing only AND connectives, can still result in relatively inefficient expressions when converted into an SQL statement. For example, the expression:A.B  expression 108can be converted to an SQL statement of the form:SELECT eid FROM table WHERE (a) AND eid IN (SELECT eid FROM table WHRE (b))  statement 109
However, a filter term can contain more than one AND connective, as exemplified by expression 107 above, and the more AND connectives in the expression, generally, the less efficient the resulting SQL statement may be especially when considering processing response time.
One method for reducing the complexity of an SQL statement is known as ‘flattening’. Nested SQL statements can be reduced by flattening the SQL statement. For example statement 109 can be flattened to produce an SQL statement of the form:SELECT t1.eid FROM table t1, table t2 WHERE t1.(a) AND t2.(b) AND t1.eid=t2.eid  statement 110
This statement 110 comprises a number of self-joins (one for each AND connective) and allows a query optimizer to process the stronger clauses first, thus reducing the overall processing time required to evaluate the query.
Expressions that involve NOT connectives, may result in flattened queries that include a LEFT JOIN. These joins are generally time consuming to process, and thus ‘flattening’ of such a query is considered to require improvement.
For example, the expression:A.!B  expression 111Which can be converted to a flattened SQL statement of the form:SELECT t1.eid FROM table t1 LEFT JOIN table t2 ON t1.eid=t2.eid AND t2.(b) WHERE t1.(a) AND t2.NORM IS NULL  statement 112
If the filter term involves more than one NOT filter item, e.g. A.!B.!C then the flattened query involves an LEFT JOIN for each NOT but also a join for each combination of NOT filter items. This can become very complex. There is thus still considered to be a need to avoid relatively complex SQL statements in a manner that improves service query performance.