A directed graph has the following properties. First, the graph has one start node ns, i.e., a node without incoming edges, also called the initial node. Second, the graph has one termination node nt, i.e., a node without outgoing edges, also called the end or final node. Third, the graph fulfils the restriction that for all nodes n there exists a path from ns to n and a path from n to nt, i.e., all nodes are reachable from the start node and can reach the termination node. A model of a directed graph is used in several related technical fields. For example, it is common to represent the architecture of a computer program as a flowchart, which is an example of a directed graph. Likewise, business plans can be represented as directed graphs.
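The three properties above can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a collection of node names and a list of (source, target) edge pairs; the function names `reachable` and `is_flow_graph` are hypothetical and chosen only for illustration.

```python
from collections import deque


def reachable(adjacency, source):
    """Return the set of nodes reachable from source via the adjacency map."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def is_flow_graph(nodes, edges):
    """Check the three properties: one start node, one termination node,
    and every node lies on some path from the start to the termination node."""
    nodes = set(nodes)
    succs = {n: set() for n in nodes}
    preds = {n: set() for n in nodes}
    for source, target in edges:
        succs[source].add(target)
        preds[target].add(source)

    starts = [n for n in nodes if not preds[n]]  # nodes without incoming edges
    ends = [n for n in nodes if not succs[n]]    # nodes without outgoing edges
    if len(starts) != 1 or len(ends) != 1:
        return False

    ns, nt = starts[0], ends[0]
    # Forward search from ns, and backward search from nt over reversed edges.
    return reachable(succs, ns) == nodes and reachable(preds, nt) == nodes
```

The backward search over the predecessor map is one way to test "n can reach nt" for all n at once, instead of running one search per node.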
With the trend towards model-driven architecture and development, graphical modelling languages such as the Unified Modeling Language (UML) gain importance, as can be seen from the many new packages added to UML from its early days to its latest version, Version 2.0. The package for activity diagrams is one such extension, for dynamic behavior.
UML 2.0 activity diagrams, like many other graphical modelling languages for business processes and workflows, use one or another form of flow graphs (directed graphs) and model a behavior with nodes and edges, where nodes represent activities and edges model the potential continuations from one activity to other activities. An edge can be seen as a visual representation of a “goto” as available in some programming languages.
Flow graphs can become cyclic when certain activities are executed more than once in a single run of the behavioral model. These cycles are called unstructured to distinguish them from the well-structured cycles modelled as explicit “while” or “repeat” loops. If unstructured cycles in business process or workflow models are to be eliminated and replaced by structured constructs, cycle-removal algorithms can be applied.
One area where cycle-removal algorithms are used is the field of transformations from a source metamodel that allows unstructured cycles to a target metamodel that only allows structured constructs. An example of such a transformation is the compilation of UML 2.0 activity diagram models to the Business Process Execution Language for Web Services (BPEL4WS). Another area where cycle-removal algorithms are useful is the management of complexity, where they can help analyze large flow graphs in order to understand the intended behavior. As in structured programming, structured loops are easier to understand than unstructured cycles.
Different algorithms can be used to remove cycles in sequential flow graphs. Two are discussed in Hauser, R., Koehler, J. “Compiling Process Graphs into Executable Code”, Proc. 3rd International Conference on Generative Programming and Component Engineering, LNCS 3286, pp. 317-336, Vancouver, Canada, October 2004. The state-machine-controller method interprets the nodes as states and the edges as transitions. It executes the activities in a single while-loop and keeps track of which activity is to be executed next. This method can be applied to any flow graph. The goto-elimination method, on the other hand, extracts the intended cycle structure and replaces the cycles found with structured cycles. Without additional aids, this method can only be applied to flow graphs that are sufficiently well-structured.
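The state-machine-controller idea can be illustrated with a short sketch. It is not the implementation from the cited paper; it only assumes that each node carries an activity and a function that selects one outgoing edge, and all names (`run_state_machine`, `build_example`, the node labels) are hypothetical.

```python
def run_state_machine(activities, choose_next, start, end, state):
    """Execute a flow graph inside a single while-loop.

    activities:  dict mapping node name -> callable(state), the node's activity
    choose_next: dict mapping node name -> callable(state) returning the
                 name of the next node (one of the outgoing edges)
    Returns the sequence of nodes visited.
    """
    current = start
    trace = []
    while True:
        activities[current](state)  # run the activity of the current state
        trace.append(current)
        if current == end:          # the termination node has no outgoing edge
            break
        current = choose_next[current](state)  # remember what comes next
    return trace


def build_example():
    """A small cyclic graph: start -> work -> check -> (work | end)."""
    def reset(state):
        state["count"] = 0

    def work(state):
        state["count"] += 1

    def noop(state):
        pass

    activities = {"start": reset, "work": work, "check": noop, "end": noop}
    choose_next = {
        "start": lambda s: "work",
        "work": lambda s: "check",
        # The edge check -> work closes the cycle; it is taken until count is 3.
        "check": lambda s: "work" if s["count"] < 3 else "end",
    }
    return activities, choose_next
```

Because the loop only ever looks up the current node, the same controller works for any edge structure, which is why this approach does not depend on how well-structured the graph is.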
The connectedness of nodes in a flow graph with several nodes is a kind of measure for how well-structured the flow graph is. An acyclic graph is well-structured and can be transformed into a structured program with if-statements. If more edges are added, then, at a certain point, the flow graph becomes cyclic. The flow graph is still well-structured and can be transformed into a structured program with if-statements and one repeat-loop. As progressively more edges are added, the flow graph evolves until it eventually becomes unstructured. When every node has an edge leading to every other node, the flow graph has become completely unstructured and can be traversed in any possible way. In that case, any set of two or more nodes contains a cycle.
Compiler theory introduced the concept of reducibility to define when a flow graph is well-structured. The T1-T2 analysis is one algorithm to determine whether a flow graph is reducible. This was first detailed in the paper Hecht, M. S., Ullman, J. D.: “Flow Graph Reducibility”, SIAM J. Comput. Vol. 1 No. 2, pp. 188-202, 1972. The concept of reducibility has proven valuable over the past three decades as goto-elimination in programming languages can only be applied to programs whose control flow graph is reducible. If a flow graph is irreducible, auxiliary methods such as node-splitting can be applied to regain reducibility, but the consequence is that nodes, i.e., pieces of code, are duplicated and this is not always desirable. Since node-splitting increases the size of a flow graph, techniques have been suggested for the optimal strategy to split the nodes, e.g., as described in: Unger, S., Mueller, F.: “Handling Irreducible Loops: Optimized Node Splitting versus DJ-Graphs”, ACM Transactions on Programming Languages and Systems, 24(4), pp. 299-333, July 2002.
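As an illustration of the T1-T2 analysis, the sketch below applies the two transformations until neither applies: T1 removes an edge from a node to itself, and T2 merges a node that has exactly one predecessor into that predecessor. The flow graph is reducible when this collapses it to a single node. The sketch is a straightforward, unoptimized rendering under the assumption that all nodes are reachable from the start node; the function name `is_reducible` and the input format are hypothetical.

```python
def is_reducible(nodes, edges, start):
    """Return True if repeated T1/T2 steps collapse the graph to one node."""
    succs = {n: set() for n in nodes}
    preds = {n: set() for n in nodes}
    for source, target in edges:
        succs[source].add(target)
        preds[target].add(source)

    changed = True
    while changed:
        changed = False

        # T1: remove self-loops.
        for n in list(succs):
            if n in succs[n]:
                succs[n].discard(n)
                preds[n].discard(n)
                changed = True

        # T2: merge a non-start node with a unique predecessor into it.
        for n in list(succs):
            if n == start or len(preds[n]) != 1:
                continue
            (p,) = preds[n]
            if p == n:
                continue  # only a self-loop remains; T1 handles it next round
            succs[p].discard(n)
            for s in succs[n]:
                # Redirect every edge n -> s so that it becomes p -> s.
                preds[s].discard(n)
                preds[s].add(p)
                succs[p].add(s)
            del succs[n]
            del preds[n]
            changed = True

    return len(succs) == 1
```

A graph with a single-entry loop collapses completely, whereas a cycle that can be entered at two different nodes gets stuck because neither cycle node ever has a unique predecessor; this is the situation in which node-splitting would be needed.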