1. Field of the Invention
The present invention relates to flow modeling tools in general and to the visualization of flows in particular.
2. Discussion of Related Art
Flow modeling tools are frequently employed during the design phase of a flow process, such as a workflow process or a dataflow process. All flow modeling tools comprise a graphical user interface on which the flow to be designed can be drawn. In most flow modeling tools, this is done by selecting operators from a palette and placing them on the graphical user interface. Operators are the steps composing the flow.
In the case of a dataflow, an operator can be any kind of data source (e.g., database, flat file, etc.) providing some data, any kind of transformer applying a transformation on the data, or any kind of target in which the data can be loaded.
In the case of a workflow, an operator can be any kind of activity playing a role in the flow (e.g., invocation of a service, notification to the user, etc.).
In the case of the design of a computer network, an operator can be any kind of hardware (e.g., computer, router, etc.) providing or consuming some data.
In many cases, each operator can have a list of possible input and output channels. The input channels can accept incoming data to be processed by the operator, while output channels produce transformed data that can be used as an input by another operator.
The flow itself is defined by drawing some links between the operators composing the flow. Each link connects an output channel of one operator with an input channel of another operator. A link thereby represents a data transfer or a logical transition between two operators. A link between two operators can also represent a physical apparatus, such as a network connection when a dataflow or a computer network is modeled.
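The structure described above (operators with input and output channels, joined by links) can be illustrated with a minimal sketch. It is not taken from any particular flow modeling tool; the class names `Operator`, `Link` and `Flow`, the method `connect`, and the example operator and channel names are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    """A step of the flow, e.g. a data source, a transformer or a target."""
    name: str
    input_channels: list[str] = field(default_factory=list)
    output_channels: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Link:
    """Connects an output channel of one operator to an input channel of another."""
    source: str          # name of the producing operator
    output_channel: str
    target: str          # name of the consuming operator
    input_channel: str


class Flow:
    """Hypothetical container holding the operators and links of one flow."""

    def __init__(self):
        self.operators: dict[str, Operator] = {}
        self.links: list[Link] = []

    def add_operator(self, operator: Operator) -> None:
        self.operators[operator.name] = operator

    def connect(self, source, output_channel, target, input_channel) -> Link:
        # A link joins two different operators, both already part of the flow.
        if source == target:
            raise ValueError("a link must connect two different operators")
        if source not in self.operators or target not in self.operators:
            raise ValueError("both operators must be part of the flow")
        if output_channel not in self.operators[source].output_channels:
            raise ValueError(f"{source} has no output channel {output_channel}")
        if input_channel not in self.operators[target].input_channels:
            raise ValueError(f"{target} has no input channel {input_channel}")
        link = Link(source, output_channel, target, input_channel)
        self.links.append(link)
        return link


# Example dataflow: a flat file feeds a transformer, which loads a target.
flow = Flow()
flow.add_operator(Operator("flat_file", output_channels=["rows"]))
flow.add_operator(Operator("transformer", ["in"], ["out"]))
flow.add_operator(Operator("warehouse", input_channels=["load"]))
flow.connect("flat_file", "rows", "transformer", "in")
flow.connect("transformer", "out", "warehouse", "load")
```

In this sketch a link is rejected when the named channel does not exist on the operator, which mirrors the rule that each link runs from an output channel of one operator to an input channel of another.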
Modeling flows is, however, becoming increasingly complex due to the many components involved and the growing prevalence of heterogeneous environments. For example, modeling data flows between databases in an ETL scenario (extract, transform, load; meaning processes that extract data from heterogeneous sources, transform the data, and finally load the data into a data warehouse) usually involves a varying range of databases located on different servers and connected through arbitrary network connections. Data is thereby passed from one server to another over a network connection. Bottlenecks can stem from network connections being too slow for a required data throughput. Servers incapable of delivering data at a required speed will limit the utilization of successive servers. Moreover, peak loads on an individual server can overload it, thus preventing it from handling other tasks scheduled for that server.
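The bottleneck effect described above can be made concrete with a small worked sketch. It assumes a simple linear chain in which every server and network connection has a single maximum data rate; the function name `chain_throughput`, the element names and all numeric rates (taken here to be in MB/s) are hypothetical values chosen for illustration, not measurements.

```python
def chain_throughput(elements):
    """Analyze a linear chain of servers and network connections.

    elements: list of (name, max_rate) pairs in flow order, where max_rate
    is the assumed maximum data rate of that element.
    Returns (effective_rate, bottleneck_name, utilization_by_name).
    """
    if not elements:
        raise ValueError("the chain must contain at least one element")
    if any(rate <= 0 for _, rate in elements):
        raise ValueError("every maximum rate must be positive")

    # The slowest element bounds the rate of the whole chain.
    effective = min(rate for _, rate in elements)
    bottleneck = next(name for name, rate in elements if rate == effective)

    # Every other element is only partially utilized.
    utilization = {name: effective / rate for name, rate in elements}
    return effective, bottleneck, utilization


# Hypothetical ETL chain; rates are illustrative assumptions in MB/s.
chain = [
    ("source database", 80.0),
    ("network connection", 20.0),
    ("transformation server", 50.0),
    ("data warehouse", 100.0),
]
effective, bottleneck, utilization = chain_throughput(chain)
print(f"effective rate: {effective} MB/s, limited by: {bottleneck}")
for name, share in utilization.items():
    print(f"{name}: {share:.0%} utilized")
```

Under these assumed numbers the slow network connection caps the chain at 20 MB/s, so the transformation server behind it is utilized at only 40 percent of its capacity. Real flows with branching, buffering or peak loads would need a more elaborate model than this minimum over a chain.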
Designing a flow process typically involves two phases. In the first phase, the flow is designed by use of a respective device. In the second phase, the design is deployed in the target environment. Keeping a good overview of the environment is essential, but does not guarantee optimum results.
Furthermore, problematic areas are most often only identified when the design is finally implemented. Resolving issues at that point is more costly than during design time and additionally can produce unrecoverable losses.
Up to now, flow design tools only display what the user designs, but do not take any performance values of the operators or the links into account. An exemplary performance value of an operator which represents, for example, the JOIN operation is the maximum speed at which data is processed. An exemplary performance value of a link which represents, for example, a network cable is the maximum bandwidth the cable supports. Annotations are possible but limited to an individual object only. The impact on related objects is neither taken into account nor visualized. Users can design anything that the respective device allows, and unreasonable or impossible designs are neither prevented nor even indicated.
Also, interactions among the individual components that make up the dataflow (the operators, input/output channels, and links) are not taken into account.
Another class of existing tools is simulation tools. Flow design tools and simulation tools are frequently comprised in one product. The simulation tools perform a precise simulation of how a finished design will behave in a real environment. With the currently available tools, designing a flow and considering the performance of the flow are therefore carried out in two phases. In the design phase, the user arranges the operators and the links on the graphical user interface so that the resulting flow fulfills the user requirements. In the simulation or measurement phase, the performance of the flow is simulated. According to the result of the simulation, the user changes the design of the flow so that the performance of the flow increases in a subsequent simulation.
Going back and forth between these two phases can, however, become tedious, and small changes that cause large effects are likely to be overlooked or hard to locate if multiple changes are made between simulation phases.
There is therefore a need for an improved flow design tool.