Business Activity Monitoring (BAM) provides a wealth of information about how business processes are currently performing as well as how such business processes performed in the past. Given such present and past information, it will be appreciated that it also would be advantageous to predict future events, both for business processes and system components. Armed with knowledge of what might happen in the future, users thus could take steps to correct or ameliorate problems before they arise, thereby reducing the likelihood of such problems ever occurring in reality. Conversely, it also might be possible to anticipate advantages in a down period and plan accordingly. Furthermore, it might be possible to program a system to take an appropriate action without direct user interaction.
Previous attempts have been made to predict future events based on past and present data. However, further improvements to these basic techniques are still possible. For example, it would be advantageous to increase the accuracy of predictions. It also would be advantageous to reduce the time taken to learn about the factors that cause, or at least are correlated with, events of interest. Additionally, learning algorithms within the BAM field have not been implemented for use substantially in real-time. Further, even where learning algorithms have been implemented, they have tended to be time- and resource-intensive.
Thus, it will be appreciated that there is a need in the art for systems and/or methods that overcome one or more of these and/or other disadvantages. It also will be appreciated that there is a need in the art for systems and/or methods that process BAM-related data to predict when events of interest are about to happen and/or to identify the root causes of, or at least data correlated with, such events of interest.
An example aspect of certain example embodiments of this invention relates to techniques for processing BAM-related data to predict when events of interest are about to happen and/or to help identify the root causes of, or at least data correlated with, such events of interest.
Another example aspect of certain example embodiments relates to gathering and gardening key performance indicators (KPIs), e.g., based on the mean and standard deviations thereof (e.g., in connection with Z-factors thereof) over one or more predefined collection intervals.
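By way of illustration only, the gardening of a single KPI over one collection interval might be sketched as follows. The function name `garden`, the use of the population standard deviation, and the Z-factor threshold of 2.0 are assumptions made for this sketch and are not taken from the embodiments described herein.

```python
from statistics import mean, pstdev


def garden(samples, z_threshold=2.0):
    """Return (index, value) pairs whose Z-factor magnitude exceeds z_threshold.

    Samples inside the normal operating range are discarded.
    The threshold of 2.0 is a hypothetical default.
    """
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        # No variation at all, so nothing lies outside the normal range.
        return []
    return [
        (i, x)
        for i, x in enumerate(samples)
        if abs((x - mu) / sigma) > z_threshold
    ]


if __name__ == "__main__":
    kpi = [10, 10, 10, 10, 10, 10, 10, 10, 10, 100]
    print(garden(kpi))  # only the outlying sample survives gardening
```

In this sketch the mean and standard deviation are computed from the same interval that is being gardened; a baseline computed over earlier intervals could be substituted without changing the structure.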
Another example aspect of certain example embodiments relates to the application of a time-series transform to gathered and gardened data, e.g., to make the data more discrete and comparable, and/or to reduce the impact of the different collection intervals of the same and/or different data sources. Such a transform may perform a correlation in the frequency domain.
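One possible reading of such a transform, offered as a sketch only, is to resample each KPI onto a common time grid and then correlate two resampled series in the frequency domain. The helper names `resample`, `dft`, and `circular_cross_correlation`, the carry-forward resampling policy, and the use of a direct (non-fast) discrete Fourier transform are all assumptions made for illustration and are not asserted to be the transform of any particular embodiment.

```python
import cmath


def resample(points, start, step, count):
    """Resample time-sorted (timestamp, value) pairs onto a uniform grid.

    Each grid slot carries forward the most recent observed value
    (an assumed policy; averaging or interpolation would also work).
    """
    out = []
    idx = 0
    current = points[0][1]
    for i in range(count):
        grid_time = start + i * step
        while idx < len(points) and points[idx][0] <= grid_time:
            current = points[idx][1]
            idx += 1
        out.append(current)
    return out


def dft(xs, inverse=False):
    """Direct O(n^2) discrete Fourier transform (adequate for short windows)."""
    n = len(xs)
    sign = 1 if inverse else -1
    out = [
        sum(x * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t, x in enumerate(xs))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_cross_correlation(a, b):
    """Compute r[lag] = sum_t a[t] * b[(t + lag) % n] via the frequency domain."""
    fa, fb = dft(a), dft(b)
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    return [v.real for v in dft(product, inverse=True)]


if __name__ == "__main__":
    fast = resample([(0, 1.0), (1, 0.0), (2, 0.0), (3, 0.0)], start=0, step=1, count=4)
    slow = resample([(0, 0.0), (1, 1.0), (3, 0.0)], start=0, step=1, count=4)
    r = circular_cross_correlation(fast, slow)
    print(max(range(len(r)), key=lambda i: r[i]))  # lag at which the series align best
```

Resampling first means that sources with different collection intervals are compared on equal footing; the correlation step then indicates the lag at which one series most resembles the other.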
Another example aspect of certain example embodiments relates to matching the gardened data to one or more of a plurality of predefined waveforms, e.g., to reduce the amount of data to be processed.
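The waveform matching described above may be sketched, under stated assumptions, as a nearest-neighbor lookup against a small codebook. The four shapes, the window length of four samples, the normalization scheme, and the names `normalize`, `WAVEFORMS`, and `match_waveform` are hypothetical and chosen only to keep the example short.

```python
import math


def normalize(window):
    """Remove the mean and scale so the largest deviation has magnitude 1."""
    mu = sum(window) / len(window)
    centered = [x - mu for x in window]
    peak = max(abs(x) for x in centered)
    return [x / peak for x in centered] if peak > 0 else centered


# Hypothetical codebook of predefined waveforms, each four samples long.
RAW_SHAPES = {
    "flat": [0, 0, 0, 0],
    "rising": [0, 1, 2, 3],
    "falling": [3, 2, 1, 0],
    "spike": [0, 0, 1, 0],
}
WAVEFORMS = {name: normalize(shape) for name, shape in RAW_SHAPES.items()}


def match_waveform(window, waveforms=WAVEFORMS):
    """Return the name of the predefined waveform closest to the window."""
    shape = normalize(window)
    return min(waveforms, key=lambda name: math.dist(shape, waveforms[name]))


if __name__ == "__main__":
    print(match_waveform([10, 20, 30, 40]))
    print(match_waveform([5, 5, 50, 5]))
```

Replacing each window of samples with a single waveform label reduces the amount of data passed downstream, at the cost of discarding detail finer than the codebook can express.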
Still another example aspect of certain example embodiments relates to the use of a dynamically generated Naïve Bayesian Network (NBN) capable of performing prediction.
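A minimal sketch of a dynamically updated naive Bayes classifier follows, in which an entry is created the first time a feature value is observed and incremented thereafter. The class name `DynamicNaiveBayes`, the dictionary-based storage, and the add-one (Laplace) smoothing are assumptions made for this illustration rather than details of any embodiment.

```python
from collections import defaultdict


class DynamicNaiveBayes:
    """Count-based naive Bayes whose entries are created or updated on the fly."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)  # (feature, value, label) -> count
        self.feature_values = defaultdict(set)  # feature -> values seen so far

    def update(self, features, label):
        """Create a new entry for unseen data, or increment an existing entry."""
        self.class_counts[label] += 1
        for name, value in features.items():
            self.feature_counts[(name, value, label)] += 1
            self.feature_values[name].add(value)

    def predict_proba(self, features):
        """Return posterior probabilities per label, with add-one smoothing."""
        total = sum(self.class_counts.values())
        if total == 0:
            return {}
        scores = {}
        for label, n in self.class_counts.items():
            score = n / total
            for name, value in features.items():
                k = max(len(self.feature_values[name]), 1)
                seen = self.feature_counts.get((name, value, label), 0)
                score *= (seen + 1) / (n + k)
            scores[label] = score
        norm = sum(scores.values())
        return {label: s / norm for label, s in scores.items()}


if __name__ == "__main__":
    nbn = DynamicNaiveBayes()
    for _ in range(3):
        nbn.update({"cpu": "spike"}, "violation")
        nbn.update({"cpu": "flat"}, "ok")
    nbn.update({"cpu": "spike"}, "ok")
    print(nbn.predict_proba({"cpu": "spike"}))
```

Because the model consists only of counts, each new observation is absorbed in time proportional to the number of features, which is one reason such a structure lends itself to operation substantially in real-time.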
Still another example aspect of certain example embodiments relates to analyzing the root causes of problems (e.g., based on a chi-square analysis).
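As a sketch of how a chi-square analysis might rank a KPI as a candidate root cause, the following computes the Pearson chi-square statistic for a 2x2 table relating "KPI was abnormal" to "event of interest occurred." The function name, the table layout, and the example counts are hypothetical; the critical value of 3.841 is the standard 5% threshold for one degree of freedom.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 contingency table

                    event    no event
    KPI abnormal      a         b
    KPI normal        c         d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        # A row or column is empty, so no association can be measured.
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


CRITICAL_5_PERCENT_1_DOF = 3.841

if __name__ == "__main__":
    score = chi_square_2x2(30, 10, 10, 30)
    print(score, score > CRITICAL_5_PERCENT_1_DOF)
```

A larger statistic indicates a stronger association between the KPI and the event; it indicates correlation only and does not by itself establish causation.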
In certain example embodiments of this invention, a method of analyzing data in a business processing management environment is provided. A plurality of performance indicators are identified, with each said performance indicator being related to a process and/or system component. At least one rule is defined to monitor at least one said performance indicator of a process and/or system component. A prediction template is created to identify at least one rule on which to create a prediction. The prediction template includes a past interval indicative of an amount of data to be analyzed before making the prediction, a future interval indicative of how far into the future the prediction is to be made, and an accuracy threshold indicative of a probability level to be reached in making the prediction. Data is gathered for each said performance indicator over a plurality of collection intervals. The gathered data is gardened to discard gathered data within a normal operating range for a given collection interval. A time-series transform is applied to the gardened data to normalize any variations in collection intervals. The transformed gardened data is fed into a dynamically updatable Naïve Bayesian Network (NBN) such that an entry is created for the transformed gardened data when the NBN does not include an entry for the transformed gardened data, and such that an already existing entry for the transformed gardened data is updated when the NBN includes an entry corresponding to the transformed gardened data. The prediction is made and an accuracy thereof is determined using probabilities computed by the NBN. A relevance value associated with each performance indicator in a rule is updated using the gardened data for root cause analysis. The gathering and the making of the prediction are performed substantially in real-time.
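The prediction template recited above may be pictured, purely as an illustration, as a small record holding the rule to be predicted together with the past interval, the future interval, and the accuracy threshold. The class name `PredictionTemplate`, its field names, the `should_report` helper, and the example rule name are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class PredictionTemplate:
    """Hypothetical record describing one prediction to be made on a rule."""

    rule_name: str              # rule on which the prediction is created
    past_interval: timedelta    # amount of data analyzed before predicting
    future_interval: timedelta  # how far ahead the prediction reaches
    accuracy_threshold: float   # probability level to be reached, in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.accuracy_threshold <= 1.0:
            raise ValueError("accuracy_threshold must lie between 0 and 1")

    def should_report(self, probability):
        """Report a prediction only once its probability reaches the threshold."""
        return probability >= self.accuracy_threshold


if __name__ == "__main__":
    template = PredictionTemplate(
        rule_name="order_latency_rule",  # hypothetical rule name
        past_interval=timedelta(hours=2),
        future_interval=timedelta(minutes=30),
        accuracy_threshold=0.8,
    )
    print(template.should_report(0.85))
```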
In certain example embodiments, a system of analyzing data in a business processing management environment is provided. A performance indicator storage location stores a plurality of performance indicators, with each said performance indicator being related to a process and/or system component. A rules table includes at least one rule for monitoring at least one said performance indicator of a process and/or system component. A prediction templates table includes at least one prediction template for identifying at least one rule on which to create a prediction, with each said prediction template including a past interval indicative of an amount of data to be analyzed before making the associated prediction, a future interval indicative of how far into the future the associated prediction is to be made, and an accuracy threshold indicative of a probability level to be reached in making the associated prediction. There is provided a connection to a data stream, with the data stream including data for each said performance indicator over a plurality of collection intervals. A gardening module is configured to garden the data in the data stream to discard data within a normal operating range for a given collection interval. A time-series transformation engine is configured to apply a time-series transform to the gardened data to normalize any variations in collection intervals. A dynamically updatable Naïve Bayesian Network (NBN) is configured to receive the transformed gardened data such that an entry is created for the transformed gardened data when the NBN does not include an entry for the transformed gardened data, and such that an already existing entry for the transformed gardened data is updated when the NBN includes an entry corresponding to the transformed gardened data. A prediction engine is configured to make the prediction and to determine an accuracy thereof using probabilities computed by the NBN, and further configured to update a relevance value associated with each performance indicator in a rule using the gardened data for root cause analysis. The gardening module and the prediction engine operate substantially in real-time.
In certain example embodiments, there is provided a prediction engine comprising programmed logic circuitry configured to make a prediction pertaining to a predefined event of interest and to determine an accuracy thereof using probabilities computed by a dynamically updatable Naïve Bayesian Network (NBN) populated with vector-quantized, time-series transformed, gardened performance indicator data relevant to the event of interest, and further configured to update a relevance value associated with each said performance indicator using the gardened data for root cause analysis. The prediction engine is further configured to operate substantially in real-time.
These aspects and example embodiments may be used separately and/or applied in various combinations to achieve yet further embodiments of this invention.