Enterprises often use multiple business applications, built over time, to analyze business data. Such business applications may use data mining techniques that involve executing a sequence of steps. The sequence of steps is referred to as a process flow associated with the business application. The sequence of steps may include reading data from multiple sources, followed by a data preparation step that includes activities such as applying filters or merging data from two different sources. Further, the sequence of steps may include an algorithm step, which follows the data preparation step, to process the prepared data. Furthermore, a user can visualize and analyze the output data by executing the process flow.
Typically, the steps in the process flow are performed using process components. For example, data source components retrieve the data from data sources, data preparation components perform merging activities, algorithm components process the data, data writer components store the processed data, and the like. Therefore, the process flow includes a chain of process components. This is sometimes called a pipe-and-filter architecture, in which the components are filters and the connections between them are pipes. Further, each process component may have a standard cardinality, that is, a standard number of input and output ports. For example, a data source component may not include an input port, as the data source component is a data reader. Also, a data preparation component may need at least two data source components as its input.
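For illustration only, the notion of cardinality described above may be sketched as a small data model. The names `Cardinality`, `CARDINALITIES`, and `accepts_input` are hypothetical. Only two of the port counts come from the description above (no input port on a data source component, and at least two inputs to a data preparation component); the remaining values are assumptions chosen to make the sketch complete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """Number of ports a component kind exposes (illustrative values only)."""
    min_inputs: int
    max_inputs: Optional[int]  # None means no upper bound
    outputs: int


# Hypothetical table. The data_source and data_preparation input counts follow
# the text; the algorithm and data_writer rows are assumptions.
CARDINALITIES = {
    "data_source": Cardinality(min_inputs=0, max_inputs=0, outputs=1),
    "data_preparation": Cardinality(min_inputs=2, max_inputs=None, outputs=1),
    "algorithm": Cardinality(min_inputs=1, max_inputs=1, outputs=1),
    "data_writer": Cardinality(min_inputs=1, max_inputs=1, outputs=0),
}


def accepts_input(kind: str) -> bool:
    """Return True if a component of this kind has at least one input port."""
    card = CARDINALITIES[kind]
    return card.max_inputs is None or card.max_inputs > 0
```

Under these assumptions, `accepts_input("data_source")` is false because a data reader has no input port, while `accepts_input("data_preparation")` is true.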
Hence, when designing the process flow, or in other words when constructing the chain of process components, a user needs to connect the process components in a proper sequence. Currently, the user connects the process components manually. Manually connecting the process components may be error-prone and time-consuming, and it requires the user to understand the cardinality of each process component. Therefore, a method to connect the process components by automatically detecting the connection compatibility between the process components would be desirable.
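As a minimal sketch of what detecting connection compatibility could involve, the following checks a proposed connection against port counts and reports components whose inputs are not yet satisfied. It is not the method of any particular system. The names `PORTS`, `can_connect`, and `unsatisfied` are hypothetical, and the port limits are assumptions except where the description above states them.

```python
from collections import Counter

# Hypothetical port limits per component kind:
# (min_inputs, max_inputs, max_outputs). None means unbounded.
PORTS = {
    "data_source": (0, 0, 1),
    "data_preparation": (2, None, 1),
    "algorithm": (1, 1, 1),
    "data_writer": (1, 1, 0),
}


def can_connect(source_kind, target_kind, used_outputs=0, used_inputs=0):
    """True if source -> target respects the port counts of both components.

    used_outputs: connections already leaving the source component.
    used_inputs: connections already entering the target component.
    """
    _, _, max_out = PORTS[source_kind]
    _, max_in, _ = PORTS[target_kind]
    has_free_output = used_outputs < max_out
    has_free_input = max_in is None or used_inputs < max_in
    return has_free_output and has_free_input


def unsatisfied(nodes, edges):
    """Return names of components with fewer incoming connections than required.

    nodes maps a component name to its kind; edges is a list of
    (source_name, target_name) pairs.
    """
    incoming = Counter(target for _, target in edges)
    return [name for name, kind in nodes.items()
            if incoming[name] < PORTS[kind][0]]


# Example flow: two sources, a merge step, and an algorithm step.
nodes = {"sales": "data_source", "customers": "data_source",
         "merge": "data_preparation", "model": "algorithm"}
edges = [("sales", "merge"), ("merge", "model")]
missing = unsatisfied(nodes, edges)  # the merge step still lacks a second input
```

In this example, `missing` contains only `"merge"`, since the second data source has not yet been connected to it. A check of this kind removes the need for the user to remember each component's cardinality while building the chain.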