Predictive analytics is a tool for making and supporting decisions. Predictive analytics involves analysing historical data in order to predict future events and thereby automatically propose or take actions.
The majority of known predictive analytics systems are offline or batch processing systems that do not operate in real-time. The data used in the predictive analytics is separate from that used in operational systems and the data may be hours, days, weeks or even months old before analytics algorithms are applied to it. These techniques are not appropriate for applications in which it is necessary for the predictive analytics to be performed in real-time. Such applications may be, for example, the monitoring of an oil well drilling operation or an operation by a physician, in which it is necessary for problems to be detected, and proposals to be generated, very quickly.
A known technique for performing real-time predictive analytics is complex event processing, CEP. CEP systems generate alerts based on previously created rules for monitoring data. Such rule-based systems are inherently limited by the difficulty in defining and maintaining the rules. While a near real-time rule may be applied to data, the analytics required to create the rule is slow and not real-time. Moreover, the created rules are inflexible and incapable of adapting to changes in the data. The analysis needed to create and update rules is therefore undertaken offline. Accordingly, rule-based systems tend to only be used in stable and predictable environments in which it is possible to define a set of rules for all circumstances and for automatic actions to be taken.
Rule-based techniques are not appropriate for applying predictive analytics in fast-changing environments. Furthermore, there are scenarios in which it is not appropriate for automatic actions to be taken. If a critical or complicated decision is to be made, for example by an oil well operator during a drilling operation or by a physician during surgery, it is neither feasible nor desirable to take humans out of the decision making process.
Case-based reasoning, CBR, is a real-time predictive analytics technique that does not experience the above-described problems of rule-based techniques.
CBR systems detect and propose solutions to problems using information obtained from a plurality of cases stored in a case base. Each of the stored cases comprises a description of a problem and a description of a solution. The cases are typically generated manually based on actual experienced problems and devised solutions by system operators. Advantageously, CBR systems are able to provide system operators with detailed and reasoned solutions to complicated problems.
The application of predictive analytics to scenarios increasingly requires the use and handling of big data. Big data refers to a collection of data sets so large and complex that they become difficult to process using traditional data processing applications. For example, such big data could be encountered when applying predictive analytics within the financial services industry as a vast quantity of financial information is continuously generated and transferred between computing systems all over the world.
A problem with known CBR systems is that they are not designed for supporting and providing real-time operation on big data.