For many business and other applications, there is a need to report, query, or otherwise analyze hierarchical groupings of data. For example, a business need in project performance reporting is to support hierarchical groupings of projects, and to support the rollup of financial information for those hierarchies. As shown in FIG. 1(a), a user at a computer 104 or workstation might request a report indicating revenue for a given project, the request being passed through a network 102 to a business applications or other such server 106 which is able to extract the financial information from at least one appropriate database 108 and generate a report to be transmitted via the network to the user computer. Each of the projects in the hierarchical project grouping itself can contain a task hierarchy, such as a strict tree structure consisting of a set of individual tasks, wherein each task has only one parent task. Further, the top task in a hierarchy only has one parent node, namely the respective project, which can have its own data as well as multiple parents. Each task also can be linked to another project, which in turn can contain another task hierarchy. The projects themselves also can be arranged in a hierarchy, which further can allow for a project to not only have multiple children but also to have multiple parents and siblings in the structure as known in the art. It is then necessary to figure out how to allocate the information for these tasks between the various associated projects and tasks. For example, a task reporting revenue may need to be rolled up into multiple different tasks in its project, and the project containing that task may further need to be rolled up into other projects. It is desirable to be able to report, for example, the revenue amounts against any project into which a project or task rolls up. Thus, the overall structure is a nested hierarchy, as the project hierarchy can contain multiple hierarchies of tasks. These tasks further need to be able to rollup appropriately across the entire hierarchy. As the need for improved performance increases, existing approaches to hierarchy-wide rollups are no longer sufficient.
The overall project data model thus can be construed as a “tree” structure of projects, wherein each project itself includes a tree structure of tasks. As shown in the exemplary hierarchy 150 of FIG. 1(b), a project at the top of the hierarchy, such as project P1 or P2, has no ancestors, but can have multiple descendents, such as projects P3 and P4. For each of the projects that has descendants, one business need is to report the data of all the projects and tasks that are descendents of that project. Each project can have an arbitrary number of tasks and an arbitrary number of descendents.
Typically, information about such hierarchies is stored in a table consisting of links, indicating which tasks are linked to which other tasks. For example, a table for the project hierarchy 150 depicted in FIG. 1(b) would indicate that task T3 in project P2 has a link to parent task T1 and a link to child task T4. The link from one project to another project can be stored as a link from a task in the first project directly to the second project. For example, a table for the links depicted in FIG. 1(b) would indicate that task T6 in project P2 has a link to project P4.
A prior approach to obtaining the necessary rollup information was to “normalize” the link information into another table such that a single scan or query on a task in the normalized table returns all of the descendants or ancestors for that task. For a normalized table of the links for the project hierarchy 150 of FIG. 1(b), a query on task T2 of project P4 (i.e., P4.T2) can return tasks P4.T1, P2.T6, P2.T4, P2.T3 and P2.T1. When doing a rollup, then, the amount of each task is obtained, and the task is joined to this table such that for each task and its descendants the amount that needs to be reported against that task can be determined. However, this normalized table is exponential in size with the total number of tasks, or depth of the hierarchy, as the normalized table will have to list all the ancestors of each task. It is also exponential in the number of “levels” in the project hierarchy, and thus does not scale. To obtain all the data of the descendents of a task thus would only take a single scan, but the single scan would be on a table that grows exponentially and thus does not provide a scalable project hierarchical reporting solution.
In another prior approach which does not utilize a normalized table, data from descendents is retrieved by going directly to the table that stores links between tasks of a tree in a flat list. A recursive query then is performed on the task links, using multiple joins to obtain all descendants of a task. The query thus can run as a loop until all data is obtained. For reporting purposes, the recursive query needs to be run for each task in each project in the hierarchy, which results in exponential run-time. Furthermore, recursive queries on large data volumes consume a lot of memory, thus not providing an efficient project hierarchical reporting solution.