1. Field
Embodiments of the invention relate to integration of data management operations into a workflow system.
2. Description of the Related Art
Workflow management systems (WFMSs) support the modeling and execution of business processes and may also be referred to as “workflow systems”. Business processes specify which piece of work of a network of pieces of work is carried out in which sequence and which resources are exploited to carry out the pieces of work. Individual pieces of work may be distributed across a multitude of different computer systems connected by some type of network.
A powerful and sophisticated workflow management system, such as the product IBM® MQSeries® Workflow (available from International Business Machines Corporation), supports the modeling of business processes as a network of activities. This network of activities, the process model, is constructed using a directed, acyclic, weighted, colored graph as a meta model. The nodes of the graph represent the activities, which define individual tasks that are to be carried out. Any other meta model, such as a hierarchical meta model, may be used for constructing process models. In general, each of the activities is associated with a piece of code that implements the appropriate task for that activity. The edges of the graph, the control links, describe a potential sequence of execution of the activities. Control links are represented as arrows; the head of the arrow describes the direction in which the flow of control is moving through the process.
The activity where the control link starts is called the source activity and the activity where the control link ends is called the target activity. An activity may be the source activity of one control link and the target activity of another. Activities that have no incoming control link are called start activities, as they start the process. Activities that have no outgoing control link are called end activities, as after their completion the process has ended. An activity may be a start activity as well as an end activity. An activity that has multiple outgoing control links is called a fork activity; an activity with multiple incoming control links is called a join activity.
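The terminology above can be illustrated with a minimal sketch. It assumes a process model given as a list of activity names plus a list of (source, target) control links, and it treats a join activity as one with multiple incoming control links. The function name and the sample process are hypothetical and are not taken from any workflow product.

```python
from collections import defaultdict


def classify_activities(activities, control_links):
    """Label each activity as start, end, fork, and/or join.

    control_links is a list of (source_activity, target_activity) pairs.
    """
    incoming = defaultdict(int)
    outgoing = defaultdict(int)
    for source, target in control_links:
        outgoing[source] += 1
        incoming[target] += 1

    roles = {}
    for activity in activities:
        labels = set()
        if incoming[activity] == 0:
            labels.add("start")  # no incoming control link
        if outgoing[activity] == 0:
            labels.add("end")    # no outgoing control link
        if outgoing[activity] > 1:
            labels.add("fork")   # multiple outgoing control links
        if incoming[activity] > 1:
            labels.add("join")   # multiple incoming control links
        roles[activity] = labels
    return roles


# Hypothetical process: A forks to B and C, which join at D; E has no links.
activities = ["A", "B", "C", "D", "E"]
control_links = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
roles = classify_activities(activities, control_links)
```

In this sample, activity E has neither incoming nor outgoing control links, so it is both a start activity and an end activity, matching the case noted above.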
Different workflow languages are available, and Business Process Execution Language (BPEL) is one such workflow language. BPEL may be described as an XML-based language that allows task-sharing for a distributed computing environment using a combination of Web services. The term “BPEL” is sometimes also used to refer to other versions of the language, such as Business Process Execution Language for Web Services (BPEL4WS) or BPELWS. BPEL may also be described as a standard for describing and choreographing business process activities. WebSphere® Business Integration (WBI) products (available from International Business Machines Corporation) provide an implementation for designing and executing BPEL-based business processes. One component of WBI is the WebSphere® Process Choreographer (WPC) workflow system.
The term Web is used to refer to the World Wide Web, which may be described as a group of Internet servers that support documents formatted in HyperText Markup Language (HTML). A Web service (also referred to as an “application service”) may be described as a service made available from a Web server that is typically invoked by a program connected to the Web.
BPEL4WS defines a notation for specifying business process behavior based on Web services (Business Process Execution Language for Web Services Specification, Version 1.1, dated May 5, 2003, hereinafter “BPEL4WS specification”). The BPEL4WS specification indicates that business processes may be described as executable business processes that model actual behavior of a participant in a business interaction or as business protocols that use process descriptions that specify mutually visible message exchange behavior of each of the parties involved in the protocol, without revealing their internal behavior.
According to the BPEL4WS specification, BPEL4WS provides a language for the formal specification of business processes and business interaction protocols and defines an interoperable integration model that facilitates the expansion of automated process integration in both the intra-corporate and the business-to-business spaces. Also, the BPEL4WS specification defines activities that are supported.
Access to a data management system from within a workflow system is unnecessarily complex in conventional workflow systems. Conventional workflow systems offer one or both of the following approaches:
1. A data management operation has to be coded in a traditional programming language, for example, by writing some Java® code and exploiting Application Programming Interfaces (APIs) offered by the data management system.
2. The data management operation has to be provided via a Web service.
The WPC workflow system, for example, supports both approaches. The WPC workflow system allows implementing process activities via Java® code and allows using Web services as process activities.
When using the first approach, a WPC user has to code a Java® activity. The user has to have Java® development knowledge in addition to knowledge about Structured Query Language (SQL) and the Java® Database Connectivity (JDBC) API that allows the user to issue SQL statements from within a Java® environment. The SQL statement itself fits into a single line of code. However, issuing the SQL statement (a type of “query”) via JDBC, getting an input parameter out of a WPC workflow system variable, feeding the input parameter into the query, and writing the result of the query into a WPC workflow system variable requires a large amount of code to be written and a large amount of time to perform the coding.
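The imbalance between the one-line SQL statement and the surrounding plumbing can be illustrated with a rough analogue. The sketch below is not Java or JDBC code and does not use any WPC interface: it uses Python's standard sqlite3 module, and a plain dictionary stands in for workflow system variables. The table, column, and variable names are hypothetical.

```python
import sqlite3


def lookup_customer_status(conn, variables):
    """Run one query on behalf of a (stand-in) workflow activity."""
    # Step 1: get the input parameter out of a workflow variable.
    customer_id = variables["customerId"]

    # Step 2: the SQL statement itself fits into a single line.
    sql = "SELECT status FROM customers WHERE id = ?"

    # Step 3: issue the statement and feed the input parameter into it.
    cursor = conn.cursor()
    try:
        cursor.execute(sql, (customer_id,))
        row = cursor.fetchone()
    finally:
        cursor.close()

    # Step 4: write the result of the query into a workflow variable.
    variables["customerStatus"] = row[0] if row is not None else None
    return variables


# Hypothetical in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO customers VALUES (42, 'gold')")

variables = lookup_customer_status(conn, {"customerId": 42})
```

Even in this compact form, only one line is the data management operation itself; the remainder moves values between the workflow variables and the query.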
When using the second approach, a WPC workflow system user has to code the SQL statement and wrap the SQL statement into a Web service. Many steps and different products may be necessary to achieve this. Independent of how the user constructed the Web service, two additional activities (“assign” activities) may also need to be added to the business process to provide the Web service with the input needed and to deal with the results.
Both approaches are cumbersome for users, even though accessing a data management system is a typical activity for a business process.
Moreover, workflows, such as those represented by BPEL, express a sequence of processing activities. Most commonly, these activities operate over a single record of data at a time. However, in an increasing number of scenarios, it is preferable to operate over sets of records (also referred to as “sets”), rather than one record at a time. This pattern is similar to an Extract Transform Load (ETL) pattern used in Data Warehousing applications to process data. With ETL, a set of records is retrieved (“extracted”) from a database, processed (“transformed”), and loaded into a database.
Thus, it is useful to use a workflow engine to intermix some set-oriented operations with the record-oriented operations. Most often, these set activities are really processing of information from one or more data sources (e.g., relational databases). For instance, an activity may retrieve or update information from one or more data sources as part of a database query. In conventional workflow systems, to perform a sequence of such activities, the workflow engine normally has to materialize the full set of data (i.e., retrieve all of the information from the one or more data sources) and then pass this set of data to another activity by copying the data, which performs its processing and passes the set of data to yet another activity, etc. Thus, the full set of data is copied and passed between data activities. Passing large sets by copy is not acceptable for performance reasons because it is inefficient, and it is also difficult to develop.
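The cost of passing sets by copy can be made concrete with a small sketch. It uses Python's standard sqlite3 module with a hypothetical orders table. The first pair of functions mirrors the conventional pattern described above, in which the engine materializes the set and hands a copy to the next activity. The last function is shown only for contrast and is an assumption of this sketch, not something described in the text: it hands over a table name and leaves the rows in the database.

```python
import sqlite3

# Hypothetical data source used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 250.0), (3, 75.0)],
)


def extract_by_copy(conn):
    # Materializes the full set of records inside the workflow engine.
    return conn.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()


def filter_by_copy(rows, minimum):
    # Receives the whole set and builds another in-memory set for the
    # next activity.
    return [row for row in rows if row[1] >= minimum]


def filter_by_reference(conn, source_table, target_table, minimum):
    # Contrast case (an assumption of this sketch): only table names travel
    # between activities. Table names are trusted identifiers here, so they
    # are interpolated; the threshold is bound as a parameter.
    conn.execute(f"CREATE TABLE {target_table} (id INTEGER, amount REAL)")
    conn.execute(
        f"INSERT INTO {target_table} "
        f"SELECT id, amount FROM {source_table} WHERE amount >= ?",
        (minimum,),
    )
    return target_table


copied = filter_by_copy(extract_by_copy(conn), 50.0)
result_table = filter_by_reference(conn, "orders", "large_orders", 50.0)
referenced = conn.execute(
    f"SELECT id, amount FROM {result_table} ORDER BY id"
).fetchall()
```

Both paths produce the same records, but in the by-copy path every row crosses into the engine and is duplicated at each activity boundary, which is the overhead that grows with the size of the set.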
Conventional business process runtime systems have some weaknesses with respect to designing and executing data management operations. For example, conventional business process runtime systems assume a static environment, where the sets (e.g., tables) that the data management activities should go against are already known before running the business process (i.e., they are either specified at design or at deployment time).
Although for some applications, it is acceptable to assume a static data management environment, such an assumption may not be applicable for dynamic, on-demand application scenarios, where choosing the data of interest may depend on, for example, Service Level Agreements. In an on-demand application scenario, there may be data management environments in which data is stored redundantly in several places with different Quality of Service characteristics. The on-demand application may need to choose a different data source for different users, depending on the Service Level Agreements as contracted with the respective user. Choosing a data source may, for example, depend on access time, availability and/or staleness characteristics of the respective data sources.
Thus, many on-demand applications would benefit from a workflow system that directly enables postponing until runtime the decisions about which data sources to use.
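A runtime choice of this kind can be sketched as follows. The sketch assumes that each redundant data source advertises its access time, availability, and staleness, and that each user's Service Level Agreement sets a bound on each of these. The class names, the cost field, the "cheapest acceptable source" rule, and all numbers are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    access_time_ms: float  # typical access time
    availability: float    # fraction of time the source is reachable
    staleness_s: float     # maximum age of the data, in seconds
    cost: float            # hypothetical relative cost of using the source


@dataclass(frozen=True)
class ServiceLevelAgreement:
    max_access_time_ms: float
    min_availability: float
    max_staleness_s: float


def choose_data_source(sources, sla):
    """Pick, at runtime, the cheapest data source that satisfies the SLA."""
    candidates = [
        s for s in sources
        if s.access_time_ms <= sla.max_access_time_ms
        and s.availability >= sla.min_availability
        and s.staleness_s <= sla.max_staleness_s
    ]
    if not candidates:
        raise LookupError("no data source satisfies the SLA")
    return min(candidates, key=lambda s: s.cost)


# The same data stored redundantly with different Quality of Service.
sources = [
    DataSource("primary", 5.0, 0.999, 0.0, 10.0),
    DataSource("replica", 40.0, 0.99, 60.0, 3.0),
    DataSource("archive", 500.0, 0.95, 86400.0, 1.0),
]

premium = ServiceLevelAgreement(10.0, 0.999, 1.0)
standard = ServiceLevelAgreement(100.0, 0.98, 300.0)
basic = ServiceLevelAgreement(1000.0, 0.9, 100000.0)

premium_source = choose_data_source(sources, premium).name
standard_source = choose_data_source(sources, standard).name
basic_source = choose_data_source(sources, basic).name
```

Because the selection depends on the agreement of the user on whose behalf the process runs, it cannot be fixed at design time or deployment time without giving up this flexibility.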
Therefore, there is a need in the art for integrating data management operations with a workflow system.