1. Field of the Invention
The present invention relates in general to the field of distributed and heterogeneous enterprise application environments, and in particular to a method and an arrangement for fault handling in a distributed information technology (IT) environment. Still more particularly, the present invention relates to a data processing program and a computer program product for fault handling in a distributed IT environment.
2. Description of the Related Art
In a typical distributed and heterogeneous enterprise application environment, as is common in many large companies, application integration of different software systems is necessary to automate common business workflows and processes. Such integration enables companies to become more efficient and competitive in the market.
Companies can distinguish themselves from their competitors by being more agile and adapting faster to changing market trends and legal or industrial regulations (e.g., auditability). To achieve such agility on a technical level, it is important to be able to rapidly deploy new automated workflows and processes or to change existing ones.
Enterprise process modeling and development environments, such as WebSphere Integration Developer by International Business Machines, allow integration developers to use graphical tools to model, develop, and deploy business process applications in standardized ways and formats (such as business process execution language (BPEL) or business process model and notation (BPMN)) and to leverage standardized protocols (such as SOAP, JMS, HTTP, and JCA) and proprietary connectors to integrate with third-party systems of different kinds.
While these standards and proprietary connectors usually detail the syntactical interfaces to third-party systems, they often lack semantic context, such as the meaning of error conditions and how to deal with them under the given circumstances. However, this semantic information is needed by the integration developer to properly develop interactions with a system and to appropriately handle fault conditions. Another problem is that syntactical interfaces of systems do not tell the integration developer how to deal with system responses, in particular fault responses. Without additional specific documentation, two or more developers might take different implementation approaches to perform the same fault handling. The results are non-streamlined and hard-to-read code, redundancy, differences in the fault handling procedure across various parts of the integration solution, and difficulties in keeping track of changes in the fault handling procedure.
Further, in an integration application, fault handling requirements are often derived from the particular backend application, rather than only from the interface or the class and/or type of system. The requirements depend, for example, on (a) system uptimes and/or downtimes, which require buffering of service requests and retrying, (b) the availability of compensation services on a backend, internal, or external system, and (c) the transactional capabilities or limitations of a system.
Also in an integration application, a fault in one system may have implications for the interaction with other systems. This is true, for example, for compensation logic or transaction management over multiple systems, and for the logical association of systems with each other. For example, a fault in system “A” can be remedied by an operation of system “B”, whereas a fault in system “A′” must be corrected by an administrator.
Further, developers need to clarify the semantics of fault handling for many systems with the respective subject matter experts, or else rely on detailed conventional documentation. This is very time-consuming in large, appointment-driven companies and is also error-prone. Because fault handling logic is part of the modeling and/or development process rather than a configuration task, any change to the fault handling logic also requires a change to the process model. Additionally, dynamic binding of new versions of the fault handling logic is currently not supported without bringing down the mediation modules running in an enterprise service bus (ESB), due to the lack of an abstraction language describing the fault handler interfaces.
Generation of fault handlers based on interface definitions, such as Web Services Description Language (WSDL) definitions, has long been available in development tools. The resulting fault handlers usually consist of either a fault handling skeleton that needs further implementation or a relatively generic fault handling procedure based on the fault type.
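As a rough illustration of the second kind of generated handler, the following Python sketch dispatches on the fault type alone. The class `ServiceFault`, the function `handle_fault`, and both fault names are hypothetical and are not taken from any WSDL document or development tool; the sketch only assumes that a fault carries a type name.

```python
class ServiceFault(Exception):
    """Hypothetical fault declared in an interface definition."""

    def __init__(self, fault_type: str, detail: str = "") -> None:
        super().__init__(f"{fault_type}: {detail}")
        self.fault_type = fault_type
        self.detail = detail


def handle_fault(fault: ServiceFault) -> str:
    """Generic handler that knows only the fault type, not the backend behind it."""
    if fault.fault_type == "InvalidInputFault":
        # The caller sent bad data; rejecting the request is a safe default.
        return "reject"
    if fault.fault_type == "ServiceUnavailableFault":
        # Skeleton left for the developer: the interface definition does not say
        # whether to retry, buffer the request, or escalate.
        return "unhandled"
    # Any other fault type falls through to the same unfinished default.
    return "unhandled"
```

The handler can settle the input fault, but for the availability fault it has no basis for choosing between retrying, buffering, and escalating, because that choice depends on the particular backend rather than on the interface.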
U.S. Pat. No. 6,421,740 B1, “DYNAMIC ERROR LOOKUP HANDLER HIERARCHY” by LeCroy, discloses a method for processing a first error message to produce a second error message in a component-based architecture. The component-based architecture includes a framework which is associated with a first lookup handler and is capable of embedding a first component associated with a first executable unit for handling data of the first component. The method includes the step of generating a hierarchy of lookup handlers when the first component comes into focus, the hierarchy including the first lookup handler and a second lookup handler associated with the first executable unit. Further, the method includes the step of processing the first error message through the hierarchy of lookup handlers to generate the second error message. Through the hierarchy, the first error message is first processed through the second lookup handler. If the second lookup handler is unable to process the first error message, the first error message is then processed through the first lookup handler. In this manner, the second error message is more specific to the first component than the first error message. In essence, the publication discloses a method of transforming or resolving less specific error information into more specific error information. The method does this by dynamically installing, embedding, and/or uninstalling handlers in an application.
However, the publication does not describe a method to apply error handling based on the error information. Furthermore, it does not use rules or policies to determine a course of action for a given error situation. It also lacks an abstraction layer to support dynamic binding of either new error handlers or new versions of existing error handlers.
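The lookup handler hierarchy attributed above to U.S. Pat. No. 6,421,740 B1 can be pictured with a minimal Python sketch. It is not the implementation of that publication: the handler names, error codes, and message texts are hypothetical, and returning the message unchanged when no handler can process it is an assumption made here for completeness.

```python
from typing import Callable, List, Optional

# A lookup handler returns a more specific message, or None if it cannot
# process the given error message.
LookupHandler = Callable[[str], Optional[str]]


def framework_handler(message: str) -> Optional[str]:
    """Hypothetical first lookup handler, associated with the framework."""
    table = {
        "E_IO": "An input/output error occurred.",
        "E_MEM": "The framework ran out of memory.",
    }
    return table.get(message)


def component_handler(message: str) -> Optional[str]:
    """Hypothetical second lookup handler, associated with the component's executable unit."""
    table = {"E_IO": "The spreadsheet could not be saved to disk."}
    return table.get(message)


def resolve(message: str, hierarchy: List[LookupHandler]) -> str:
    """Process the message through the hierarchy in order; the first result wins."""
    for handler in hierarchy:
        result = handler(message)
        if result is not None:
            return result
    # Assumption: a message no handler can process is passed through unchanged.
    return message


# When the component comes into focus, its handler is placed ahead of the
# framework's handler, so component-specific messages take precedence.
hierarchy: List[LookupHandler] = [component_handler, framework_handler]
```

With this ordering, `resolve("E_IO", hierarchy)` yields the component-specific text, while `resolve("E_MEM", hierarchy)` falls back to the framework handler. The sketch only translates messages; it takes no corrective action, which is the limitation noted above.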