Today's IT infrastructures are composed of a large number of heterogeneous, distributed stateful IT resources (see also layer “IT Resources” in FIG. 1). That is, IT services typically comprise or are hosted by several heterogeneous IT resources (e.g. servers, operating systems on those servers, databases, application server software, etc.). The term IT Service Environment is used for a collection of resources and a set of behavioural conditions that all together define a specific IT service. IT services may be part of outsourcing agreements of IT services providers with customers, or they may be in-house applications.
For each of the resources used by IT Services several resource-specific management functions are available for controlling the operation of a resource, i.e. for creating (provisioning), destroying (de-provisioning) and controlling the operation and configuration of a stateful resource. Resource management functions of a resource may also control other resources—for example, a resource that acts as a resource manager may offer a service to create/provision a new instance of a certain other resource. The described view on the management capabilities of resources is depicted as “Systems Management Layer” in FIG. 1. This layer summarizes the management interfaces of the resources.
In order to perform systems management in the scope of a whole IT Service Environment (in contrast to single resources) an integration of single Systems Management Tasks into a systems-wide Systems Management Flow is necessary in order to treat the IT Service Environment as a whole and to keep it in a consistent state (see also layer “Systems Management Flows” in FIG. 1). Thus Systems Management Flows play a key role for the management of IT Services.
Another important aspect is to provide the data that the distributed resources of the IT Service Environment act upon at runtime in an appropriate way. For example, a resource may require information from another resource to perform its system management task.
As an example of a Systems Management Flow, it is assumed that a new server WebServer3 has to be provisioned as depicted in FIG. 2. The related simplified Systems Management Flow could contain the following Tasks (the list first names the resource that performs each Task, followed by a brief description of the Task):
Systems Management Flow 1:
Cluster1: task create server resource: create a new logical server and name it WebServer3; add a relationship from Cluster1 to WebServer3
WebServer3: task initial application start (including install and start image): obtain a new physical server from the server free pool; load and install a system image that contains an operating system and a web server application on WebServer3 from an image management system; reboot WebServer3
LoadBalancer: task register web server: add host name and IP-address of WebServer3 to the list of available web servers for load balancing
The data required by the resources performing the system management tasks have to be provided in an appropriate way. In the example discussed before, the WebServer3 resource needs information about the cluster and the resource pool it can use to obtain a new physical server. This information is provided by the Cluster1 resource. The LoadBalancer resource needs information about the host name and the IP address to be able to register the newly created web server. This information is provided by the WebServer3 resource. FIG. 2 shows the required data mappings in the discussed system management flow.
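For illustration only, the tasks and data mappings of Systems Management Flow 1 can be modelled as an ordered list in which each task names the resource that supplies each of its inputs. The following Python sketch is a hypothetical model, not part of any described product: the class Task, the field names and the helper unresolved_inputs are assumptions introduced here. The helper checks that every input is supplied by a task that runs earlier in the flow.

```python
# Hypothetical model of Systems Management Flow 1; all names below are
# illustrative assumptions, not part of any real product API.
from dataclasses import dataclass, field


@dataclass
class Task:
    resource: str                 # resource that performs the task
    name: str                     # short task description
    # input name -> (providing resource, output name)
    inputs: dict = field(default_factory=dict)
    outputs: tuple = ()           # data the task makes available


FLOW_1 = [
    Task("Cluster1", "create server resource",
         outputs=("cluster", "resource_pool")),
    Task("WebServer3", "initial application start",
         inputs={"cluster": ("Cluster1", "cluster"),
                 "resource_pool": ("Cluster1", "resource_pool")},
         outputs=("host_name", "ip_address")),
    Task("LoadBalancer", "register web server",
         inputs={"host_name": ("WebServer3", "host_name"),
                 "ip_address": ("WebServer3", "ip_address")}),
]


def unresolved_inputs(flow):
    """Return (resource, input) pairs whose provider has not run earlier."""
    available = set()             # (resource, output) pairs produced so far
    missing = []
    for task in flow:
        for input_name, source in task.inputs.items():
            if source not in available:
                missing.append((task.resource, input_name))
        available.update((task.resource, out) for out in task.outputs)
    return missing
```

In this model every mapping is hard-wired to a specific providing resource, so inserting or removing a resource in the flow means editing the mappings of its neighbours.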
Initial Problem
Systems Management Flows can change when the underlying IT system changes. The more features, combinations of features, and different implementations of features a management solution offers, the more different Systems Management Flows must be defined and handled. There is a potentially large number of different Systems Management Flows required due to the quickly growing number of possible different IT Service Environment definitions. This can be compared to the assembly process in the automotive industry: the more options and accessories are offered for a car (e.g., engine type, transmission, colour, wheels, spoilers, etc.) the more complicated the assembly process gets and the more different car configurations can be produced.
Changes of the IT system landscape could require adapting a large part or even all of the Systems Management Flows. Because of the potentially large number and the complexity of Systems Management Flows, administration and maintenance of Systems Management Flow definitions is a demanding challenge.
Considering the sample system management flow depicted in FIG. 2 again, let us assume that the web server application should be installed by an individual task and that the image management system provides images containing the operating system only (these images could then also be used as a basis for the installation of other applications in different environments). For this purpose, an additional layer of operating system resources OS Server[n] is added to the resource topology and the task for the initial application start is decomposed into two tasks (see also FIG. 4). The adapted systems management flow would look like this:
Systems Management Flow 2:
Cluster1: task create server resource: create a new logical server and name it OSServer3; add a relationship from Cluster1 to OSServer3
OSServer3: task initial operating system start (including install and start OS image): obtain a new physical server from the server free pool; load and install a system image that contains an operating system only on OSServer3 from an image management system; reboot OSServer3; create a new logical server and name it WebServer3; add a relationship from OSServer3 to WebServer3
WebServer3: task initial application start (including install and start of the web server application): load and install a package that contains the web server application on WebServer3
LoadBalancer: task register web server: add host name and IP-address of WebServer3 to the list of available web servers for load balancing
If the targeted management solution has to support both of the discussed features (provide complete images as well as OS images and web server installation packages) it would be necessary to provide the two discussed systems management flows for provisioning new web servers.
The second system management flow would not only have to consider the additional operating system resource layer; the data distribution and data access would also change. In system management flow 1, the WebServer3 resource depended only on the Cluster1 resource for required information such as cluster and resource pool information. The other information, such as IP address and credentials, resulted from the system management task to install and start a system image, which was performed by the WebServer3 resource itself. In contrast, in the second system management flow the installation of the operating system and the installation of the web server application are separated and hence performed by different resources. The OS installation is performed by the new OSServer3 resource and requires information about the cluster and the resource pool, which has to be provided by the Cluster1 resource. The web server installation is performed by the WebServer3 resource, and the required information, such as the IP address, operating system type and operating system credentials, now has to be provided by the OSServer3 resource. FIG. 5 shows the changed data mappings in system management flow 2.
Decomposing the Task to install a web server application also changes the relationships and the necessary knowledge of the participating resources among each other. In system management flow 2, the OSServer3 resource has to be aware that the WebServer3 resource depends on its input, and therefore has to make sure to provide the required data.
Hence supporting new features or changing existing ones results in changes of the data exchange among the involved system management tasks. Adapting the data exchange each time supported features change is error-prone, time consuming, and requires recompilation of the workflows in traditional workflow engines. A better approach would be to define data exchange only in a descriptive manner and let the system management system provide the data at runtime.
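The idea of a descriptive data exchange can be sketched as follows. This is a minimal illustration under stated assumptions, not the mechanism of any particular workflow engine: the class DataRegistry, its methods and all data values (including the example IP address) are hypothetical. A consuming task declares only the names of the data it needs, and a runtime registry determines which resource has provided them.

```python
# Hypothetical sketch: tasks declare needed data by name only; a runtime
# registry resolves which resource provides it. Names and values are made up.

class DataRegistry:
    def __init__(self):
        self._published = {}      # data name -> (providing resource, value)

    def publish(self, resource, **data):
        """Record data items made available by a resource."""
        for name, value in data.items():
            self._published[name] = (resource, value)

    def resolve(self, required):
        """Return {name: value} for the declared names, or raise KeyError."""
        return {name: self._published[name][1] for name in required}

    def provider_of(self, name):
        """Return the resource that published the named data item."""
        return self._published[name][0]


registry = DataRegistry()
registry.publish("Cluster1", cluster="Cluster1",
                 resource_pool="server free pool")
# OSServer3 publishes its data without knowing who will consume it.
registry.publish("OSServer3", ip_address="192.0.2.13",
                 os_type="Linux", os_credentials="<secret>")

# Declarative requirement of the web server installation task in flow 2.
WEB_SERVER_NEEDS = ("ip_address", "os_type", "os_credentials")
web_server_inputs = registry.resolve(WEB_SERVER_NEEDS)
```

In such a scheme the providing resource does not have to know which tasks depend on its data, and changing a requirement means editing a declaration rather than the code of the providing task.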
Prior Art
At the lowest level, Systems Management Flows are described by a set of documents that are not machine-readable and that list the Systems Management Tasks to be performed by the operator in a more or less specific way. This way of implementing Systems Management Flows is error-prone and time consuming, and it requires administrators with broad knowledge to interpret the documents correctly and to run management tools and other programs with appropriate configuration parameters in the correct sequence.
At the next level, scripts and programs are used to support development and execution of Systems Management Flows. While very efficient, this solution has the drawback that it offers no standardized way of choreographing multiple tasks.
At a higher level, workflow techniques are used to define and run Systems Management Flows. Workflow techniques integrate several services into an overall flow in a clearly defined and consistent way. An established standard for describing workflows that integrate services having Web Services interfaces is the Business Process Execution Language (BPEL). BPEL descriptions of flows involving several single services can be executed by workflow processing engines (e.g. WebSphere Process Server). There are also workflow engines and workflow systems that support adaptive workflows. Adaptive workflows can be changed by modification operations during their instance life cycle, in contrast to the static workflow instances provided by most workflow engines (especially BPEL engines).
Workflow engines and workflow systems provide data mapping techniques that supply the data required by an activity from preceding activities or from the process input. Data mapping requires less development effort and is more flexible when changes in the process model are necessary. The data mapping logic (code) is typically isolated from the implementation of the activities in order to minimize the activities' dependency on changes in the mapping logic.
The IBM Tivoli Provisioning Manager (TPM) and the IBM Tivoli Intelligent Orchestrator (TIO) products both automate manual tasks of provisioning and configuring servers and virtual servers, operating systems, middleware, applications, storage, and network devices acting as routers, switches, firewalls, and load balancers. Both products contain a deployment engine (DE) which acts as a proprietary workflow engine on entities (resources) which are part of a data center model. One of the features of the DE is the support of the hierarchical workflow/sub-workflow pattern in a very powerful and integrated way.
EP 1636743A1 describes another approach for integrating resource management functions into Systems Management Flows by a so-called resource catalog. The basic idea is that the resource catalog contains resources either being expandable node elements or being leaf elements of a resource tree. The node elements represent composed logical resources (for example, a load balanced web application) which are more and more refined towards the leaf elements which refer to concrete resources, like servers, firewalls, etc. The resource tree is built up in such a manner, that Systems Management Flows are derived from the sequence of the leaf nodes from left to right. The sequence of the leaf nodes and corresponding resource management actions from a resource management actions catalog are compiled to a service environment definition which basically defines a set of flows that can be executed on any workflow engine. The techniques that have been described here are also depicted in FIG. 6.
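The general idea of deriving a sequence from the leaf elements of a resource tree, read from left to right, can be illustrated with a short sketch. The Node class, the function name and the tree contents below are assumptions made for illustration; they do not reproduce the actual catalog format or compilation process of the cited document.

```python
# Hypothetical resource tree; node names are invented for illustration.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # empty for leaf elements


def leaves_left_to_right(node):
    """Collect leaf names by depth-first traversal, children in order."""
    if not node.children:
        return [node.name]
    result = []
    for child in node.children:
        result.extend(leaves_left_to_right(child))
    return result


TREE = Node("load balanced web application", [
    Node("web tier", [Node("server"), Node("web server software")]),
    Node("network", [Node("firewall"), Node("load balancer")]),
])

leaf_sequence = leaves_left_to_right(TREE)
```

Each entry of the resulting leaf sequence would then be paired with a resource management action from the actions catalog; that pairing step is not shown here.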
Residual Problems
The technologies mentioned above already allow for the integration of several systems management tasks into an overall Systems Management Flow. However, these technologies have some disadvantages.
All of the workflow-based technologies mentioned provide some kind of definition language for specifying workflows. Systems Management Flows can be represented in such systems by corresponding workflow definitions. These definitions describe the sequence in which activities are to be executed, and they are therefore very sensitive to changes in the IT infrastructure. As stated earlier, a potentially large number of different Systems Management Flows is required. Changes of the IT system landscape could require adapting all or part of the Systems Management Flows. This means that the corresponding workflow definitions must be changed accordingly, which makes the administration of workflow definitions complex.
The usage of workflow data mapping techniques does not remove the need for activities that provide data for subsequent activities to write this data to the workflow output container themselves. Thus, changes in the process model, such as integrating new activities which require data from preceding activities, result in additional development effort for these already existing activities (an activity could only avoid this dependency by always writing all its data to its output). A further problem is that changes of the data maps require a recompilation of the workflow. It would therefore be much better to define the data exchange between the workflow activities in a descriptive way, without the need to recompile the workflow, and to let the workflow engine take the required steps to supply the data at runtime.
Instead of storing Systems Management Flows directly as workflow definitions, it is desirable to derive Systems Management Flows from a more stable basis that is easier to handle when changes in the IT infrastructure or changes in the offerings of the management solution for the customer occur.
EP 1636743A1 proposes such a strategy, as described above. Changes of the IT infrastructure or changes in the offerings of the management solution for the customer can be handled by simply adding new resources to, or modifying resources in, the resource catalog and by adding or modifying resource management action definitions. The drawback of the solution is that the changes do not become effective in existing, instantiated service environments. Only new subscriptions can benefit from the changes, since the instantiation of the service environment for a subscription is the result of a compilation process that is performed only once, at the beginning of a service environment life cycle. The mentioned compilation process includes the creation of the resource tree starting from a root node element, and the derivation of the resulting workflow and the parameter mappings from the resource tree.