This invention relates to a method for implementing an acyclic directed graph structure, or acyclic diagraph, using a relational database. Acyclic diagraph structures are used frequently to model real world hierarchy systems. Examples of such hierarchy systems where an acyclic diagraph structure may be used are family trees, Object-Oriented relationship models, network routing structure, etc.
Despite the prevalence of acyclic diagraph structures in the real world, few computer applications actually use the model. Application developers prefer instead to either use another similar model, the directed tree structure, to emulate acyclic diagraph systems, or alternatively generate xe2x80x9cviewsxe2x80x9d of an acyclic diagraph system and represent the views with a directed-tree structure.
The main reason why application programmers tend to avoid modelling the acyclic diagraph structure is due to the complexity of the structure, and the complexity of implementing and maintaining this structure.
A graph is a mathematical system consisting of nodes (or points or vertices) and edges (or links or arcs). The following is a mathematical definition of a graph taken from a book by Jean-Paul Tremblay and Paul G Sorenson (ISBN0-07-065157-4):
xe2x80x9cA graph G consist of a nonempty set V called the set of nodes (points, vertices) of the graph, a set E which is the set of edges of the graph, and a mapping from the set of edges E to a set of pairs of elements of V.
We can assume throughout that both sets V and E of a graph are finite. It is also convenient to write a graph as G=(V, E). Notice that the definition of a graph implies that to every edge of the graph G, we can associate a pair of nodes of the graph. If an edge, xxcex5E is thus associated with a pair of nodes (u, v) where u, vxcex5V then we say that the edge x connects or joins the nodes u and v. Any two nodes which are connected by an edge in a graph are called adjacent nodes.
In a graph G=(V, E) an edge which is directed from one node to another is called a directed edge, while an edge which has no specific direction is called an undirected edge. A graph in which every edge is directed is called a directed graph, or a diagraphxe2x80x9d.
In addition to the above definition, the same book gives the following additional definitions:
xe2x80x9cAny sequence of edges of a diagraph such that the terminal node of any edge in the sequence is the initial node of the edge, if any, appearing next in the sequence defines a path of the graph. A path is said to traverse through the nodes appearing in the sequence originating in the initial node of the first edge and ending in the terminal node of the last edge in the sequence.xe2x80x9d
xe2x80x9cA path which originates and ends in the same node is called a cycle (circuit).xe2x80x9d
xe2x80x9cA simple diagraph which does not have any cycles is called acyclic.xe2x80x9d
From the above definition, the properties of an acyclic diagraph can be summarised as:
1. The structure consists only of nodes and edges.
2. Edges are uni-directional.
3. The direct relationship between any two nodes is represented by a maximum of one edge.
4. Indirect relationships, or paths, between any two nodes may exist through edges with other nodes.
5. Each node may have one or more child nodes.
6. Each node may have one or more parent nodes.
7. There may exist ancestor nodes for any given node, which consist of all the successive parents of a particular node.
8. There may exist descendant nodes for any given node, which consist of all the successive children of a particular node.
9. An ancestor of a node cannot be a descendant of the same node.
10. There exists an inverse structure to any given acyclic diagraph structure. The inverse structure to any given acyclic diagraph structure is the same structure but with all the edge directions reversed.
The traditional method to store and manipulate an acyclic directed graph structure in a computer is done using third generation programming languages. Basically this involves using defined memory storage areas as nodes, and pointers as edges. An example using C code is thus:
struct Node
char nodename [10]
struct Node *nextnode[4]}
Where nodename would represent the actual node, and nextnode would represent an edge to another node. Operations on the structure could be programmed using functions or procedures. Examples of such functions may be xe2x80x9cInsert Nodexe2x80x9d, xe2x80x9cDelete Nodexe2x80x9d, xe2x80x9cAttach Nodexe2x80x9d, etc.
Although this method can efficiently implement an acyclic diagraph, there are several considerations to be made with regard to the complexity and reliability of such an implementation.
1. The maximum number of edges associated with a node has to be known in advance.
2. Difficult to find existence of paths between nodes, as all possible paths have to be traversed before a conclusion can be made.
3. No consideration is made on to the reliability of the structure with regards to database concurrency and integrity.
4. No node locking mechanism is available, thus limiting the use of the structure to one user or operation at a time.
5. May encounter problems when using third generation languages with regard to memory allocation or lost pointers which would cause the whole program to crash.
Although a skilled programmer may be able to get around all these limitations, addressing each and every one of them only adds further complexity to the programming to be done. In addition, since everything is done programmatically, there is a greater possibility of hidden bugs in the systems, and the system maintenance may thus prove very difficult and time consuming.
This invention offers an alternative method of storing and manipulating acyclic directed graph structures by using a relation database.
In accordance with the present invention, there is provided a method for generating an acyclic directed graph structure, or acyclic diagraph, utilising a computer operated relational database, the acyclic diagraph comprising a plurality of nodes directionally coupled to one another so that each node has at least one child node in a forward coupling direction and/or at least one parent node in a reverse coupling direction, comprising:
generating a node table structure indicating a relationship between each node in the acyclic diagraph and at least one node property;
generating an edge table structure indicating a relationship between each directly coupled pair of nodes in the acyclic diagraph; and
generating a path table structure indicating existence of a path between any two nodes in the acyclic diagraph.
The present invention also provides a computer stored data structure in a relational database system for representing an acyclic diagraph comprising a plurality of nodes directionally coupled to one another so that each node has at least one child node in a forward coupling direction and/or at least one parent node in a reverse coupling direction, the data structure comprising:
a node table structure indicating a relationship between each node in the acyclic diagraph and at least one node property, the node table structure including a record for each node in the acyclic diagraph, each node table structure record including a node identification field and at least one node property field;
an edge table structure indicating a relationship between each pair of coupled parent and child nodes in the acyclic diagraph, the edge table structure including a record for each pair of parent and child.coupled nodes, each edge table structure record including respective fields to identify the parent node and the child node; and
a path table structure indicating existence of a path between any two nodes in the acyclic diagraph, the path table structure including a record for each pair of parent and child coupled nodes and pair of ancestor and descendant coupled nodes wherein two nodes are coupled through an intermediate, each path table structure record including respective fields to identify each of the pair of nodes.
A method of maintaining a data structure as defined above is also provided by the present invention, including:
a node creation procedure for creating a new node in the acyclic diagraph by inserting a new record to the node table structure with a unique value in said node identification field;
a node linking procedure for linking a child node to a parent node in the acyclic diagraph, comprising the steps of:
inserting a new record in the edge table structure identifying the child node and the parent node and a coupling direction from the parent node to the child node;
inserting a new record in the path table structure identifying the child node and the parent node and a coupling direction from the parent node to the child node;
selecting from the path table structure each ancestor node coupled to the parent node and inserting a new record in the path table structure identifying the child node and each ancestor node and a coupling direction from the ancestor node to the child node; and
selecting from the path table structure each descendant node to which the child node is coupled and inserting a new record in the path table structure identifying the child node and each descendant node and a coupling direction from the child node to the descendant node;
a node unlinking procedure for unlinking a child node from a parent node in the acyclic diagraph, comprising the steps of:
removing from the edge table structure the record identifying the child node and the parent node;
removing from the path table structure each record identifying the child node; selecting and removing from the path table structure each record identifying an ancestor node of the parent node and a descendant node of the child node; and
systematically reviewing the edge table structure beginning with records identifying an ancestor node of the parent node to determine any path from an ancestor node of the parent node to a descendant node of the child node, reviewing the path table structure to if any such path is identified by a path table structure record and inserting a new record into the path table structure identifying that path if none exists in the path table structure; and
a remove node procedure for removing a node which has been unlinked from the acyclic diagraph, comprising removing from the node table structure the record identifying the unlinked node.