When an enterprise's mission-critical Business Operations are highly dependent on its information systems, those systems must be operated with the highest levels of availability and performance for Business Operations to run optimally. Business Operations commonly include the execution of an enterprise's business processes to achieve certain business goals. Operations of the information systems that support Business Operations are commonly referred to as IT Operations. In enterprises where Business Operations are highly dependent on IT Operations, Business Operations managers and IT Operations managers are constantly looking for Correlations between the performance of Business Operations and the performance of the IT Operations that support them.
Business Operations that are dependent on IT Operations are often modeled as multiple Layers of Operation, which include a Business Layer, an Application Layer and an Infrastructure Layer. The Business Layer includes elements such as Business Services (e.g. an Internet Order Entry Business Service), Business Transactions (e.g. an Internet Order) and Business Traffic (e.g. a Checkout request). The Application Layer includes elements such as Application Services (e.g. the Internet Order Entry Website) and Application Components (e.g. a Web page or an Enterprise Java Bean). The Infrastructure Layer includes elements such as Databases, Hosts and Routers (e.g. the Computer Server Host for the Internet Order Entry Website). Together these three layers provide the underpinning for Business Operations that depend on IT Operations.
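The three-layer model described above can be illustrated with a minimal sketch. The class names, the `depends_on` field and the `layers_reached` helper are assumptions introduced here for illustration only; the text itself does not prescribe any particular data structure.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    BUSINESS = "Business"
    APPLICATION = "Application"
    INFRASTRUCTURE = "Infrastructure"


@dataclass
class Element:
    """One modeled element, e.g. a Business Service or a Host (hypothetical shape)."""
    name: str
    kind: str
    layer: Layer
    # Assumption: elements reference the elements they rely on in lower layers.
    depends_on: list[Element] = field(default_factory=list)


def layers_reached(element: Element) -> set[Layer]:
    """Return every layer touched by an element and its transitive dependencies."""
    seen = {element.layer}
    for dep in element.depends_on:
        seen |= layers_reached(dep)
    return seen


# The Internet order entry example from the text, expressed in this sketch.
host = Element("Computer Server Host", "Host", Layer.INFRASTRUCTURE)
website = Element("Internet Order Entry Website", "Application Service",
                  Layer.APPLICATION, [host])
service = Element("Internet Order Entry", "Business Service",
                  Layer.BUSINESS, [website])
```

In this sketch, calling `layers_reached(service)` walks from the Business Service down to the Host, which is one simple way to express that a single Business Service spans all three layers.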
Issues in Business Operations are usually detected by taking direct measurements in the Business Layer, whereas issues in IT Operations are usually detected by taking direct measurements in the Application and Infrastructure Layers. An example of an issue in Business Operations is a lower than expected number of Internet orders on a given day, which can be detected by taking a direct measurement of the number of Business Transactions in the Business Layer. An example of an issue in the Application Layer is poor response time of the Internet order entry website at a given time, which can be detected by taking a direct measurement of the response time of the Application Service that models the website in the Application Layer. An example of an issue in the Infrastructure Layer is very high processor utilization on the computer hosting the Internet order entry website, which can be detected by taking a direct measurement of the processor utilization of a Host in the Infrastructure Layer.
Past attempts to find such Correlations between IT Operations and Business Operations in real-time fall into two categories of approaches: bottom-up Infrastructure Event Aggregation, and Business Activity Monitoring. Examples of bottom-up Infrastructure Event Aggregation systems include HP Openview Service Level Navigator, Tivoli Business Systems Manager, Mercury Interactive Topaz Business Availability Cockpit, and Managed Objects Formula. Examples of Business Activity Monitoring systems include solutions offered by companies such as Indurasoft and Praja (now part of Tibco).
According to the bottom-up Infrastructure Event Aggregation approach, critical availability and performance measurement Events that occur in the Infrastructure Layer (network and systems technology) or in the Application Layer of IT Operations are collected and aggregated hierarchically to “derive” the impact on Business Operations. The problem with this approach is that, because no direct measurements are performed in the Business Layer, it offers only a best guess as to the potential impact on Business Operations. Likewise, reporting on Service Level performance of Business Operations is “derived” from Service Level performance of the Infrastructure and Application Layers, and is also a best guess. Finally, there is no way to track the Probable Cause of issues in Business Operations top-down and Correlate such issues with the bottom-up Events from the Infrastructure and Application Layers, because direct measurements in Business Operations are not included in the first place.
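The hierarchical "derivation" criticized above can be sketched as follows. This is only an illustration under stated assumptions: the dependency tree, the three-level severity scale and the worst-of-dependencies rule are hypothetical and are not taken from any of the named products.

```python
# Hypothetical dependency tree: each element maps to the elements it relies on.
DEPENDS_ON = {
    "Internet Order Entry (Business Service)": [
        "Order Entry Website (Application Service)",
    ],
    "Order Entry Website (Application Service)": ["Web Host", "Orders Database"],
}

# Hypothetical severity scale; a higher number is worse.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def derived_status(node, events):
    """Derive a status as the worst status of the node and its dependencies."""
    worst = events.get(node, "ok")
    for child in DEPENDS_ON.get(node, []):
        child_status = derived_status(child, events)
        if SEVERITY[child_status] > SEVERITY[worst]:
            worst = child_status
    return worst


# A single Infrastructure Layer Event is rolled up to the Business Service.
events = {"Web Host": "critical"}
business_status = derived_status("Internet Order Entry (Business Service)", events)
```

Here `business_status` is inferred purely from the Host Event. Nothing in the sketch measures the number of Internet orders actually placed, so the result is a guess about business impact rather than an observation of it, which is the limitation the paragraph describes.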
In the Business Activity Monitoring approach, Business Operations are directly monitored and measurements are taken in the Business Layer in real-time. However, no Events from IT Operations are Integrated, Normalized and Correlated in real-time, so it is not feasible to Correlate issues in Business Operations with issues in IT Operations.
In addition to the above two approaches, there has been some generalized research in the field of Complex Event Processing. This work does not specifically provide a method and system for finding Correlations between IT Operations and Business Operations. Instead, it describes a general purpose theory of Event Aggregation that is usable in many different types of methods and systems for event management.
Finally, there has been extensive work on Event Correlation, which refers to the actual techniques used to perform Correlations. Well known Event Correlation techniques include, for example, Rule-Based, Model-Based and Codebook-Based techniques. These techniques specify the classes of algorithms used to perform Correlations rather than the kinds of data being Correlated. Techniques described herein focus on the domains and types of Events being Correlated and on the importance of Events in the Business Layer to the Correlations, regardless of the classes of algorithms used to perform the Correlations.