Parent-child relationships are very common in relational database schemas. For example, a database schema might indicate a relationship between a manager (the “parent”) and one or more employees (the “children”) who are subordinate to that manager. For another example, a database schema might indicate a relationship between a purchase order (the “parent”) and one or more lines (the “children”) that are a part of that purchase order. Data that contains such parent-child relationships is called hierarchical data, since the parent-child relationships form a hierarchy.
Typically, the way that a parent-child relationship is modeled within a relational database table is to include, within the table that contains the “child” records, a column that indicates a row identifier for a “parent” record that is related to those child records in a parent-child relationship (in a relational database table, each row of the table may have a unique row identifier). For example, in an “employee” table, an employee “Alice” might be a child record of another employee “Mel;” Mel might be Alice's manager, for example. The employee table might contain a “manager ID” (i.e., the parent ID) column. In the employee table, the row containing Alice's record might contain, in the manager ID column, the row identifier for the row (in the employee table) that contains Mel's record. Taking the example further, if Mel is also the manager for employees “Vera” and “Flo,” then, in the employee table, the rows containing Vera's and Flo's records might also contain, in the manager ID column, the row identifier for the row that contains Mel's record. If Mel himself has a manager, then the value of the manager ID in Mel's record might also indicate the row identifier for the row that contains Mel's manager's record.
Structured Query Language (SQL) has operators and constructs that allow an interested user to formulate queries that, when executed, will (a) cause all of the children of a specified parent to be returned, (b) cause all of the descendants (i.e., the children of, the children of those children, etc.) of a specified parent to be returned, (c) cause the parent of a specified child to be returned, and/or (d) cause all of the ancestors (i.e., the parent of, the parent of that parent, etc.) of a specified child to be returned.
Using such an approach, determining the immediate parent or children of a specified entity may be accomplished fairly trivially using a simple index lookup. However, determining more extended relationships, such as all ancestors or all descendants of a specified entity, can be much more complicated. Although SQL contains a “CONNECT-BY” construct that permits such complicated operations to be expressed in a relatively compact manner, evaluating a query that contains such a construct involves an N-step algorithm in which each step involves a separate index scan. For example, if a query requests all of the descendants of a specified entity, and if there are N levels of descendants represented in the hierarchy, then evaluating the query will involve at least N separate index scans.
Where N is very large, evaluating such a query can be very costly in terms of both time and computing resources. Database system performance can suffer when such a query is evaluated. What is needed is a way of obtaining, from a database, hierarchical relationship information, such as all of the ancestors or descendants of a specified entity, without incurring the high costs that are associated with large numbers of index scans. The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.